Robert Kern wrote:
> http://www.joelonsoftware.com/articles/Unicode.html
That was fascinating. Thank you. So as it turns out, Unicode and UTF-8 are not the same thing? Am I right to say that UTF-8 stores each of the first 128 Unicode code points in a single byte, and then stores higher code points in however many bytes they need? If so, I guess I had been misled by the '8' in the name, thinking that UTF-8 was another way of storing every character in one byte (which would make it no different from Latin-1, I suppose).
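
A quick check at the interpreter seems to bear this out. This is only a sketch, assuming Python 3 (where str holds Unicode code points); utf8_length is a helper name made up for illustration, not anything from the article.

```python
def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


# One sample from each UTF-8 length class: ASCII, Latin-1 range,
# the rest of the Basic Multilingual Plane, and beyond it.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600"]

for ch in samples:
    encoded = ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {utf8_length(ch)} byte(s): {encoded.hex()}")

# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): c3a9
# U+20AC -> 3 byte(s): e282ac
# U+1F600 -> 4 byte(s): f09f9880

# Latin-1, by contrast, always uses exactly one byte per character,
# so it can only represent code points U+0000 through U+00FF.
print("\u00e9".encode("latin-1"))  # b'\xe9'
```

So the '8' apparently refers to the 8-bit units the encoding is built from, not to a one-byte limit per character. Trying "\u20ac".encode("latin-1") raises UnicodeEncodeError, which is where the two really part ways.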