On Fri, Mar 29, 2013 at 1:12 AM, jmfauth <wxjmfa...@gmail.com> wrote:
> This flexible string representation is so absurd that not only
> "it" does not know you can not write Western European Languages
> with latin-1, "it" penalizes you by just attempting to optimize
> latin-1. Shown in my multiple examples.

PEP 393 strings have two optimizations, or kinda three:

1a) ASCII-only strings
1b) Latin-1-only strings
2) BMP-only strings
3) Everything else

Options 1a and 1b are almost identical - I'm not sure what the detail
is, but there's something flagging those strings that fit inside seven
bits. (Something to do with optimizing encodings later?) Both are
optimized down to a single byte per character.

Option 2 is optimized to two bytes per character.

Option 3 is stored in UTF-32.

Once again, jmf, you are forgetting that option 2 is a safe and
bug-free optimization.

ChrisA
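
The per-character widths listed above can be checked with a short sketch. It assumes CPython 3.3 or later (where PEP 393 is implemented); `bytes_per_char` and the sample characters are illustrative names chosen here, not part of any API, and the sizes `sys.getsizeof` reports are implementation details rather than language guarantees.

```python
import sys


def bytes_per_char(ch, n=1000):
    """Estimate storage per character for a string made only of `ch`.

    Hypothetical helper: comparing two lengths cancels out the fixed
    per-object header, leaving only the per-character cost.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n


# One representative character per category from the list above.
samples = {
    "ASCII": "a",              # U+0061, fits in seven bits
    "Latin-1": "\xe9",         # U+00E9, fits in eight bits
    "BMP": "\u20ac",           # U+20AC (euro sign), needs sixteen bits
    "astral": "\U0001F600",    # U+1F600, outside the BMP
}

widths = {name: bytes_per_char(ch) for name, ch in samples.items()}

for name, width in widths.items():
    print(name, width)
```

On a PEP 393 build of CPython this should report 1, 1, 2 and 4 bytes per character respectively. The euro sign case is the one the quoted complaint is about: it is not in Latin-1, so a string containing it is stored at two bytes per character instead of one.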