On Monday, 25 November 2013 16:11:22 UTC+1, Michael Torrie wrote:
> I only respond here, as unicode in general is an important concept that
> the OP will want to make sure his students understand in Python, and I
> don't want you to dishonestly sow the seeds of uncertainty and doubt.
>
> On 11/25/2013 03:12 AM, wxjmfa...@gmail.com wrote:
> > Your paragraph is mixing different concepts.
>
> On the contrary, it appears you are the one mixing the concepts, and
> confusing a byte-encoding scheme with unicode.
>
> In an ideal world, the programmer should not need to know or care about
> what encoding scheme the language is using internally to store strings.
> And it does not matter whether the internal encoding scheme is endorsed
> by the unicode commission or not, provided it can handle all the valid
> unicode constructs.
>
> A string is unicode. Period. Hence you must concern yourself with
> encoding only when reading or writing a byte stream.
>
> Inside the language itself, the encoding is irrelevant. Ideally. In
> python 3.3+ anyway. Of course reality is different in other languages
> which is why programmers are used to worrying about things like exposing
> surrogate pairs (as Javascript does), or having to tweak your algorithms
> to deal with the fact that UTF-8 indexing is not O(1). To claim that a
> programmer has to concern himself with internal language encoding in
> Python 3 is not only untrue, it's disingenuous at best, given the OP's
> mission.
>
> > When it comes to save memory, utf-8 is the choice. It
> > beats largely the FSR on the side of memory and on
> > the side of performances.
>
> So you would condemn everyone to use an O(n) encoding for a string when
> FSR offers full unicode compliance that optimizes both speed and memory?
>
> No, D'Aprano is correct. Python 3.3+ indeed does unicode right. It
> offers O(1) slicing, is memory efficient, and never exposes things like
> surrogate pairs.
>
> > How and why? I suggest, you have a deeper understanding
> > of unicode.
>
> Indeed I'd say D'Aprano does have a deeper understanding of unicode.
>
> > May I recall, it is one of the coding scheme endorsed
> > by "Unicode.org" and it is intensively used. This is not
> > by chance.
>
> Yes, you keep saying this. Have you encountered a real-world situation
> where you are impacted by Python's FSR? You keep posting silly
> benchmarks that prove nothing, and continue arguing, yet presumably you
> are still using Python. Why haven't you switched to Google Go or
> another language that implements unicode strings in UTF-8?
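
For readers following the technical points in the quoted message, here is a
minimal sketch of what is being argued about. It is not code from either
poster. The helper names utf8_char_at and describe and the sample strings
are made up for illustration, and the sys.getsizeof figures are a
CPython-specific detail, so treat any numbers it prints as indicative only.
The sketch shows three things: a Python 3.3+ str is indexed by code point
and never exposes surrogate pairs, finding the n-th code point in raw UTF-8
bytes needs a scan from the start, and encoding only comes into play at the
bytes boundary.

```python
import sys


def utf8_char_at(data: bytes, index: int) -> str:
    """Return the code point at `index` by scanning UTF-8 bytes from the start.

    Hypothetical helper: it illustrates why indexing into UTF-8 is O(n).
    """
    if index < 0:
        raise IndexError("negative indexes are not supported in this sketch")
    count = -1      # index of the code point currently being scanned
    start = None    # byte offset where the requested code point begins
    for pos, byte in enumerate(data):
        # Bytes of the form 10xxxxxx are continuation bytes; any other
        # byte starts a new code point.
        if byte & 0xC0 != 0x80:
            count += 1
            if count == index:
                start = pos
            elif count == index + 1:
                return data[start:pos].decode("utf-8")
    if start is None:
        raise IndexError("code point index out of range")
    return data[start:].decode("utf-8")


def describe(text: str) -> dict:
    """Collect the code point count, UTF-8 length and str object size."""
    return {
        "code_points": len(text),
        "utf8_bytes": len(text.encode("utf-8")),
        # CPython-specific: size of the str object, header included.
        "str_object_bytes": sys.getsizeof(text),
    }


# Illustrative samples, one per storage width the FSR (PEP 393) can pick.
samples = {
    "ascii": "abc",
    "latin-1": "ab\u00e9",
    "bmp": "ab\u20ac",
    "astral": "ab\U0001F600",
}

if __name__ == "__main__":
    for name, text in samples.items():
        print(name, describe(text))

    astral = samples["astral"]
    # Direct indexing on str: one code point, no surrogate halves.
    print(ascii(astral[2]))
    # The same lookup on the encoded bytes has to walk the buffer.
    print(ascii(utf8_char_at(astral.encode("utf-8"), 2)))
```

Whether the fixed-width layout or UTF-8 uses less memory depends on the
text: one non-Latin-1 character widens the whole str, while UTF-8 only
spends extra bytes on the characters that need them. The sketch does not
settle the performance question either way.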

Everybody has the right to have an opinion. Please understand that I
respect Steven's opinion.

I'm aware of the UTF-8 indexing "effect" (it is in fact the answer I
expected); that's why I proposed diving a little deeper into Unicode.

Now something else. I practically no longer program, in the sense of
creating applications; I'm mainly interested in Unicode. I have "toyed"
with many tools: C#, Go, Ruby 2 and my favorite, the TeX Unicode engines.
It just so happens that I have a lot of experience with Python, and I
find this FSR fascinating.

jmf
--
https://mail.python.org/mailman/listinfo/python-list