I cannot find the thread where a Python core dev talked about French, so I'm putting this here.
This stupid Flexible String Representation splits Unicode into chunks, and one of these chunks is latin-1 (iso-8859-1). If we consider that latin-1 is unusable for 17 (seventeen) European languages based on the Latin alphabet, one cannot say Python is really well prepared. Most of the problems come from the extensive use of diacritics in these languages. Thanks to the FSR again, working with normalized forms does not go very well either. At least there is some consistency.

Now, if we consider that most of the new characters will be part of the BMP (characters in "daily" use), it is hard to present Python as a modern language. It sticks more to the past and it is not really prepared for the future, for the acceptance of new chars like ẞ or the new Turkish lira sign (U+20BA).

>>> import sys
>>> sys.getsizeof('š')
40
>>> sys.getsizeof('0')
26

14 extra bytes to encode one non-latin-1 char is not so bad.

jmf
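
P.S. For anyone who wants to check the numbers on their own build, here is a minimal sketch. It assumes CPython 3.3 or later (where the FSR is in use); the helper name fits_latin1 and the sample characters are only illustrative choices. The sizes it prints depend on the Python version and platform, so compare them relative to each other rather than against the figures above.

```python
import sys
import unicodedata


def fits_latin1(text: str) -> bool:
    """Return True if every code point in text lies in the latin-1 range (below U+0100)."""
    return all(ord(ch) < 0x100 for ch in text)


# Illustrative samples: ASCII digit, e-acute, s-caron, capital sharp s, Turkish lira sign.
samples = ["0", "\u00e9", "\u0161", "\u1e9e", "\u20ba"]

for ch in samples:
    # NFD splits a precomposed letter into base letter plus combining mark(s).
    nfd = unicodedata.normalize("NFD", ch)
    print(
        f"U+{ord(ch):04X}",
        f"latin-1: {fits_latin1(ch)}",
        f"size: {sys.getsizeof(ch)}",
        f"NFD still latin-1: {fits_latin1(nfd)}",
    )
```

Note that 'é' on its own fits in latin-1, but its decomposed (NFD) form does not, because the combining acute accent is U+0301.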