On 28 mar, 11:30, Chris Angelico <ros...@gmail.com> wrote:
> On Thu, Mar 28, 2013 at 8:03 PM, jmfauth <wxjmfa...@gmail.com> wrote:
-----

> You really REALLY need to sort out in your head the difference between
> correctness and performance. I still haven't seen one single piece of
> evidence from you that Python 3.3 fails on any point of Unicode
> correctness.

That's because you do not understand Unicode. Unicode takes you from
the character to the Unicode Transformation Format (UTF) via the code
point, working with a unique set of characters with a contiguous range
of code points. Then it is up to the "implementors" (languages,
compilers, ...) to implement this UTF.

> Covering the whole range of Unicode has never been a
> problem.

... for all those who follow the scheme explained above. And it
magically works smoothly. Of course, there are some variations due to
the Character Encoding Form, which is later influenced by the Character
Encoding Scheme (the serialization of the Character Encoding Form).

A rough explanation, in other words: it does not matter whether you are
using UTF-8, -16, -32, UCS-2 or UCS-4. All single characters are
handled in the same way, with the "same algorithm".

---

The flexible string representation approaches the problem from the
other side: it attempts to work with the characters by using their
representations, and it fails (it can only fail)...

PS: I never proposed using UTF-8. I only mentioned UTF-8 as an
example. If you start discussing indexing, you are off-topic.

jmf
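
For illustration only, the chain described above (character -> code
point -> encoded form) can be sketched in Python. This is a minimal
sketch using just the built-in ord() and str.encode(); the helper name
encoded_forms is hypothetical and not part of any library, and the
big-endian codecs are chosen here only so that no byte order mark is
emitted.

```python
def encoded_forms(ch):
    """Map one character to its code point and to its UTF-8/16/32 bytes (hex)."""
    code_point = ord(ch)  # character -> code point
    return {
        "code_point": f"U+{code_point:04X}",
        # code point -> bytes in each encoding; the "-be" codecs emit no BOM
        "utf-8": ch.encode("utf-8").hex(),
        "utf-16": ch.encode("utf-16-be").hex(),
        "utf-32": ch.encode("utf-32-be").hex(),
    }


# One character from each UTF-8 length class (1 to 4 bytes).
for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
    print(ch, encoded_forms(ch))
```

The same code point yields a different byte sequence in each form, and
a character outside the BMP (the last one above) becomes a surrogate
pair in UTF-16.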