On Monday, 18 November 2013 14:31:33 UTC+1, Steven D'Aprano wrote:
>
> ... choose one of the three bad choices: ...
>
> * choose UTF-16 or UTF-8, and have O(n) primitive string operations (like
>   Haskell and, apparently, Ceylon);
>
> * or UTF-16 without support for the supplementary planes (which makes it
>   virtually UCS-2), like Javascript;
>
> * choose UTF-32, and use two or four times as much memory as needed.
>
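For readers who want the three quoted options in concrete terms, here is a rough sketch. The sample string and both helper names (`code_units`, `nth_code_point_utf8`) are made up for illustration and are not taken from any library or from the quoted post. It counts code units per encoding form and shows why indexing by code point in UTF-8 needs a scan from the start.

```python
# Hypothetical sample: three ASCII characters plus one supplementary-plane character.
sample = "abc\U0001F600"

# Size in bytes of one code unit for each encoding form.
UNIT_SIZE = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}


def code_units(text, encoding):
    """Return how many code units `text` occupies in `encoding`."""
    return len(text.encode(encoding)) // UNIT_SIZE[encoding]


def nth_code_point_utf8(data, index):
    """Return the code point at `index` in UTF-8 `data`.

    The bytes are scanned from the beginning, so the cost grows with
    the length of the string: this is the O(n) indexing in the quote.
    """
    if index < 0:
        raise IndexError(index)
    count = -1
    start = None
    for pos, byte in enumerate(data):
        if byte & 0xC0 != 0x80:  # not a continuation byte: a code point starts here
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError(index)
    return data[start:].decode("utf-8")


if __name__ == "__main__":
    for enc in UNIT_SIZE:
        print(enc, len(sample.encode(enc)), "bytes,", code_units(sample, enc), "code units")
    print(nth_code_point_utf8(sample.encode("utf-8"), 3))
```

With this sample, UTF-16 needs a surrogate pair for the last character, which is what an implementation limited to UCS-2 cannot represent as a single unit, while UTF-32 spends four bytes on every character, including the ASCII ones.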
Nothing can beat the coding schemes endorsed by Unicode. They all
work at the level of the smallest possible entity (Unicode
Transformation *Units*), with a unique set of these entities.

Do not forget: a set of characters is an artificial construction, and
by nature it cannot follow the logic of a more "natural" set, e.g.
the integers.

jmf
-- 
https://mail.python.org/mailman/listinfo/python-list