On 1/31/2014 10:36 PM, Chris Angelico wrote:
On Sat, Feb 1, 2014 at 1:54 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
I think that some years ago I heard about a variation on UTF-8
(Microsoft?) where codepoint U+0000 is encoded as 0xC0 0x80 so that the
null byte can be used as the string terminator.

I had a look on Wikipedia found this:

http://en.wikipedia.org/wiki/Null-terminated_string

Yeah, it's a common abuse of UTF-8. It's a violation of spec, but an
understandable one. However, I don't understand why the first part -
why should \0 become U+0000 but (presumably) the \a later on
(...cs\accel...) doesn't become U+0007, etc?

Because only \0 has a special meaning in a C string, and Tk is written in C and uses C strings.

--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list

Reply via email to