On Mon, 09 Sep 2013 11:05:44 -0600, Michael Torrie wrote:

> On 09/09/2013 08:28 AM, wxjmfa...@gmail.com wrote:
>> Comment: Such differences never happen with utf.
>
> But with utf, slicing strings is O(n) (well that's a simplification as
> someone showed an algorithm that is log n), whereas a fixed-width
> encoding (Latin-1, UCS-2, UCS-4) is O(1).
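
To make the O(n) versus O(1) point concrete, here is a minimal sketch of fetching the i-th code point from raw bytes. The helper names `char_at_utf32` and `char_at_utf8` are made up for illustration, and this is not how CPython indexes strings internally. It only shows why a fixed-width encoding can jump straight to an offset while a variable-width one has to scan.

```python
def char_at_utf32(data: bytes, i: int) -> str:
    """Fixed width: code point i occupies bytes [4*i, 4*i + 4), so no scan is needed."""
    return data[4 * i:4 * i + 4].decode("utf-32-le")


def char_at_utf8(data: bytes, i: int) -> str:
    """Variable width: walk from the start, counting lead bytes until we reach i."""
    if i < 0:
        raise IndexError(i)
    count = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte, so a new code point starts here
            count += 1
            if count == i:
                start = pos
            elif count == i + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(i)
    return data[start:].decode("utf-8")  # i was the last code point


# Hypothetical sample text mixing 1-, 2-, 3- and 4-byte UTF-8 sequences.
text = "naïve €uro 😀!"
utf8 = text.encode("utf-8")
utf32 = text.encode("utf-32-le")

for i in range(len(text)):
    assert char_at_utf8(utf8, i) == char_at_utf32(utf32, i) == text[i]
```

The UTF-32 lookup costs the same wherever `i` falls, while the UTF-8 lookup touches every byte before the target. The price is size: every code point takes four bytes in UTF-32, whatever it is.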

UTF-32 is fixed-width. UTF-16 is not, but if you limit yourself to only
characters in the Basic Multilingual Plane, it is functionally equivalent
to UCS-2 and therefore fixed-width.

> Do you understand what this means?

Talking about "utf" in general as JMF does is a good sign that he doesn't.
Which UTF? I know of at least eight:

UTF-1
UTF-7
UTF-8
UTF-9       # this one is a joke, but it does work
UTF-16      # in two varieties, big-endian and little-endian
UTF-18      # another joke
UTF-32      # likewise two varieties
UTF-EBCDIC

although only 3 (perhaps 4, if you include UTF-7) are in common use.

[...]
> I don't even know that much about unicode yet it's clear you're either
> deliberately muddying the waters with your stupid and pointless
> arguments against FCS or you don't really understand the difference
> between unicode and byte encoding. Which is it?

I have been watching JMF get a mad-on about the flexible string
representation since he first noticed it, and in my opinion, his
complaints are based entirely on resentment that ASCII users save more
memory than non-ASCII users. Even if it means everyone is worse off, he
is utterly opposed to giving ASCII users any benefit.

Of course, he neglects to consider that *every single Python user* is an
ASCII user, since most strings in Python are pure ASCII. Names of
builtins, standard library modules, variables, attributes, most of them
are ASCII.

--
Steven
--
https://mail.python.org/mailman/listinfo/python-list
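
For anyone who wants to see the memory effect of the flexible string representation discussed above, here is a rough sketch. It assumes CPython 3.3 or later, where `sys.getsizeof` reflects the per-string storage width; other implementations may report something different. The helper `bytes_per_char` is a made-up name for illustration.

```python
import sys


def bytes_per_char(ch: str, n: int = 1000) -> float:
    """Marginal storage cost per character, measured by growing a string by n copies of ch.

    Taking the difference of two sizes cancels out the fixed object header.
    """
    return (sys.getsizeof(ch * (n + 100)) - sys.getsizeof(ch * 100)) / n


# Assumption: CPython 3.3+ picks the narrowest width that fits the widest character.
samples = [("ASCII", "a"), ("Latin-1", "é"), ("BMP", "€"), ("astral", "😀")]
for label, ch in samples:
    print(f"{label:8} {bytes_per_char(ch):.0f} byte(s) per character")
```

On such an interpreter the astral-plane string should cost the most per character and the ASCII string the least, which is the saving the argument is about.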