On Sun, Mar 13, 2016 at 6:24 AM, Thomas 'PointedEars' Lahn <pointede...@web.de> wrote:
> Marko Rauhamaa wrote:
>
>> […] HTML markup is all ASCII.
>
> Wrong. I am creating HTML documents whose source code contains Unicode
> characters every day.
>
> Also, the two of you fail to differentiate between US-ASCII, a 7-bit
> character encoding, and 8-bit or longer encodings which can *also* encode
> characters that can be *encoded with* US-ASCII.

Where are the non-ASCII characters in your HTML documents? Are they in
the *markup* of HTML, or in the *text*? This is the difference.

And I'm not conflating those two. When I say ASCII, I am referring to
the 128 characters that have Unicode codepoints U+0000 through U+007F.

ChrisA
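
To make the markup/text distinction concrete, here is a minimal sketch using the standard library's html.parser. The names is_ascii, MarkupTextSplitter and split_markup_and_text are made up for this illustration, and treating only tag and attribute names as "markup" (leaving attribute values out entirely) is an assumption of the sketch, not a definition taken from any spec.

```python
from html.parser import HTMLParser


def is_ascii(s):
    """True if every character has a codepoint in U+0000..U+007F."""
    return all(ord(ch) <= 0x7F for ch in s)


class MarkupTextSplitter(HTMLParser):
    """Collect tag and attribute names (markup) apart from character data (text).

    Hypothetical helper for illustration; attribute values are ignored.
    """

    def __init__(self):
        super().__init__()
        self.markup = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.markup.append(tag)
        self.markup.extend(name for name, _value in attrs)

    def handle_endtag(self, tag):
        self.markup.append(tag)

    def handle_data(self, data):
        self.text.append(data)


def split_markup_and_text(document):
    """Return (markup_names, text_chunks) for an HTML string."""
    parser = MarkupTextSplitter()
    parser.feed(document)
    parser.close()  # flush any buffered character data
    return parser.markup, parser.text


sample = '<p class="greeting">Grüße, 世界</p>'
markup, text = split_markup_and_text(sample)

print(markup, is_ascii("".join(markup)))
print(text, is_ascii("".join(text)))
```

For this sample, the tag and attribute names pass the is_ascii check while the text does not. Since every codepoint in that range encodes to the same single byte in US-ASCII and in UTF-8, the choice of file encoding does not change the answer for the markup part.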