On 28 mar, 16:14, jmfauth <wxjmfa...@gmail.com> wrote:
> On 28 mar, 15:38, Chris Angelico <ros...@gmail.com> wrote:
> > On Fri, Mar 29, 2013 at 1:12 AM, jmfauth <wxjmfa...@gmail.com> wrote:
> > > This flexible string representation is so absurd that not only
> > > "it" does not know you can not write Western European Languages
> > > with latin-1, "it" penalizes you by just attempting to optimize
> > > latin-1. Shown in my multiple examples.
> >
> > PEP393 strings have two optimizations, or kinda three:
> >
> > 1a) ASCII-only strings
> > 1b) Latin1-only strings
> > 2) BMP-only strings
> > 3) Everything else
> >
> > Options 1a and 1b are almost identical - I'm not sure what the detail
> > is, but there's something flagging those strings that fit inside seven
> > bits. (Something to do with optimizing encodings later?) Both are
> > optimized down to a single byte per character.
> >
> > Option 2 is optimized to two bytes per character.
> >
> > Option 3 is stored in UTF-32.
> >
> > Once again, jmf, you are forgetting that option 2 is a safe and
> > bug-free optimization.
> >
> > ChrisA
>
> As long as you are attempting to divide a set of characters into
> chunks and try to handle them separately, it will never work.
>
> Read my previous post about the unicode transformation format.
> I know what pep393 does.
>
> jmf

Addendum.

This is what you correctly perceived in another thread. You
described it as a "switch". Now you have to understand where this
"switch" comes from.

jmf
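
For anyone who wants to check the per-character widths listed in the quoted message, here is a minimal sketch. It assumes CPython 3.3 or later, where sys.getsizeof() reports the size of the string object, so other implementations may behave differently. The helper name bytes_per_char is made up for this example and is not part of any library.

```python
import sys

def bytes_per_char(sample):
    """Estimate storage per character for strings built from `sample`.

    Hypothetical helper for illustration. It compares two strings of
    different lengths so that the fixed object header cancels out.
    Assumes CPython 3.3+ (PEP 393 representation).
    """
    short = sample * 100
    long_ = sample * 200
    extra_bytes = sys.getsizeof(long_) - sys.getsizeof(short)
    extra_chars = len(long_) - len(short)
    return extra_bytes // extra_chars

print(bytes_per_char("a"))           # ASCII only
print(bytes_per_char("\xe9"))        # Latin-1 only (e with acute accent)
print(bytes_per_char("\u20ac"))      # BMP only (euro sign)
print(bytes_per_char("\U0001d11e"))  # outside the BMP (musical G clef)
print(bytes_per_char("a\u20ac"))     # mixed: the widest character decides
```

The last call is the "switch" case: a single euro sign in otherwise ASCII text moves the whole string to the two-byte representation.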