On Wed, 29 Aug 2012 08:43:05 -0700, wxjmfauth wrote:

> I can hit the nail a little more.
> I have even a better idea and I'm serious.
>
> If "Python" has found a new way to cover the set of the Unicode
> characters, why not proposing it to the Unicode consortium?

Because the implementation of the str datatype in a programming language
has nothing to do with the Unicode consortium. You might as well propose
it to the International Union of Railway Engineers.

> Unicode has already three schemes covering practically all cases: memory
> consumption, maximum flexibility and an intermediate solution.

And Python's solution uses those: UCS-2, UCS-4, and UTF-8.

The only thing which is innovative here is that instead of a given build
of Python declaring that "all strings will be stored in UCS-2", the
interpreter chooses an implementation for each string as needed. So some
strings will be stored internally as UCS-4, some as UCS-2, and some as
one byte per character: Latin-1, with pure ASCII as a special case (both
are standards, but not the Unicode consortium's standards). (As far as I
can tell from the PEP, UTF-8 is not a storage format of its own; a UTF-8
copy can be cached alongside the string when something asks for it.)

There's nothing radical here, honest.

-- 
Steven
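
P.S. A rough way to see the per-string choice for yourself. This is only a sketch and it assumes CPython 3.3 or later, where the flexible string representation is in place; other implementations and older builds will report different numbers. The helper name bytes_per_char is made up for this example. It compares the sizes of two strings of different lengths so that the fixed object header cancels out.

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate storage per character (hypothetical helper).

    Comparing a string of 2*n characters with one of n characters
    cancels the fixed per-object overhead reported by sys.getsizeof.
    """
    return (sys.getsizeof(ch * (2 * n)) - sys.getsizeof(ch * n)) // n

samples = [
    ("ASCII     U+0061", "a"),
    ("Latin-1   U+00E9", "\u00e9"),
    ("BMP       U+20AC", "\u20ac"),
    ("non-BMP   U+1F600", "\U0001F600"),
]

for label, ch in samples:
    print(label, bytes_per_char(ch), "byte(s) per character")

# The width is chosen per string, not per character: one wide
# character makes the whole string use the wider form.
narrow = "a" * 1001
widened = "a" * 1000 + "\u20ac"
print(sys.getsizeof(narrow), sys.getsizeof(widened))
```

On such a build I would expect the four samples to come out at 1, 1, 2 and 4 bytes per character, and the second of the last two sizes to be roughly double the first.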