unicode - utf-32 advantage explanation -
in online diveintopython3 book,it says advantage of utf-32 , utf-16
utf-32 straightforward encoding; takes each unicode character (a 4-byte number) , represents character same number. has advantages, important being can find nth character of string in constant time, because nth character starts @ 4×nth byte
can explain this? if possible example..i not sure have quite understood it
the usual encoding of unicode utf-8; utf-8 represents characters variable number of bytes. instance, “l” character encoded single byte (0x4c) while “é” encoded 2 bytes (0xc3, 0xa9). in utf-8 encoding, word “lézard” takes 7 bytes, , cannot nth character without decoding characters before (you don't know how many bytes each character needs).
in utf-32, all characters use 4 bytes, nth character, need go byte 4×(n-1). first character @ position 0, second @ position 4, third @ position 8, etc.
Comments
Post a Comment