On Monday, 8 July 2013 19:52:17 UTC+2, Chris Angelico wrote:
> On Tue, Jul 9, 2013 at 3:31 AM, <ferdy.blat...@gmail.com> wrote:
> > Unfortunately (as probably I told you before) I will never pass to
> > Python 3... Guido should not always listen only to gurus like him...
> > I don't like Python as before...starting from OOP and ending with codecs
> > like utf-8. Regarding OOP, much appreciated especially by experts, he
> > could use python 2 for hiding the complexities of OOP (improving, as an
> > effect, object's code hiding) moving classes and objects to
> > imported methods, leaving in this way the programming style to the
> > well known old style: sequential programming and functions.
> > About utf-8... the same solution: keep utf-8 but for the non experts, add
> > methods to convert to solutions which use the range 128-255 of only one
> > byte (I do not give a damn about chinese and "similia"!...)
> > I know that is a lost battle (in italian "una battaglia persa")!
>
> Well, there won't be a Python 2.8, so you really should consider
> moving at some point. Python 3.3 is already way better than 2.7 in
> many ways, 3.4 will improve on 3.3, and the future is pretty clear.
> But nobody's forcing you, and 2.7.x will continue to get
> bugfix/security releases for a while. (Personally, I'd be happy if
> everyone moved off the 2.3/2.4 releases. It's not too hard supporting
> 2.6+ or 2.7+.)
>
> The thing is, you're thinking about UTF-8, but you should be thinking
> about Unicode. I recommend you read these articles:
>
> http://www.joelonsoftware.com/articles/Unicode.html
> http://unspecified.wordpress.com/2012/04/19/the-importance-of-language-level-abstract-unicode-strings/
>
> So long as you are thinking about different groups of characters as
> different, and wanting a solution that maps characters down into the
> <256 range, you will never be able to cleanly internationalize. With
> Python 3.3+, you can ignore the differences between ASCII, BMP, and
> SMP characters; they're all just "characters". Everything works
> perfectly with Unicode.
-----------

Just to stick with this funny character ẞ, a UCS-2 char in the
Flexible String Representation nomenclature.

It seems to me that, when one needs more than ten extra bytes to encode it,

>>> sys.getsizeof('a')
26
>>> sys.getsizeof('ẞ')
40

this is far from perfection.

BTW, for a modern language, has UCS-2 not been considered obsolete
for many, many years?

jmf

--
http://mail.python.org/mailman/listinfo/python-list
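
P.S. The two getsizeof() figures above mix the fixed per-object header with the per-character cost, and the header size depends on the build. A rough sketch for separating the two, assuming CPython 3.3 or later (where the Flexible String Representation applies); bytes_per_char is a made-up helper name for illustration, not anything from the standard library:

```python
import sys

def bytes_per_char(ch, n=1000):
    """Estimate the storage cost of one more `ch` in a string.

    Comparing two strings that differ by exactly one character
    cancels out the fixed object header, whatever its size is
    on the running build.
    """
    return sys.getsizeof(ch * (n + 1)) - sys.getsizeof(ch * n)

# One sample character from each range discussed in the thread.
samples = [
    ("ASCII", "a"),
    ("Latin-1", "\xe9"),           # é
    ("BMP", "\u1e9e"),             # ẞ, the character measured above
    ("non-BMP", "\U0001F600"),     # a character outside the BMP
]

for label, ch in samples:
    print(label, bytes_per_char(ch))
```

On a CPython 3.3+ interpreter this should report 1 byte per character for the first two rows, 2 for ẞ and 4 for the non-BMP character, which suggests that most of the 26 vs. 40 gap for a one-character string is header overhead, not the character itself. Other implementations may store strings differently.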