On Sat, Feb 1, 2014 at 1:54 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> I think that some years ago I heard about a variation on UTF-8
> (Microsoft?) where codepoint U+0000 is encoded as 0xC0 0x80 so that the
> null byte can be used as the string terminator.
>
> I had a look on Wikipedia found this:
>
> http://en.wikipedia.org/wiki/Null-terminated_string

Yeah, it's a common abuse of UTF-8. It's a violation of the spec, but an
understandable one.

However, I don't understand the first part: why should \0 become U+0000
when (presumably) the \a later on (...cs\accel...) doesn't become U+0007,
etc?

ChrisA
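
For anyone who wants to see the 0xC0 0x80 trick concretely, here is a minimal
sketch in Python. The two helper names are made up for illustration and are
not part of any library. It covers only the U+0000 handling described above;
Java's "Modified UTF-8", for instance, also treats supplementary characters
differently, which this ignores.

```python
# Hypothetical helpers (names invented for this example).

def encode_nul_free_utf8(text: str) -> bytes:
    """Encode as UTF-8, but write U+0000 as the overlong pair C0 80."""
    # In valid UTF-8 the byte 0x00 only ever encodes U+0000, so a plain
    # byte-level replace is safe here.
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_free_utf8(data: bytes) -> str:
    """Reverse of encode_nul_free_utf8."""
    # 0xC0 never appears in valid UTF-8, so C0 80 can only be the
    # overlong NUL produced by the encoder above.
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


encoded = encode_nul_free_utf8("a\x00b")
print(encoded)                        # no 0x00 byte in the output
print(decode_nul_free_utf8(encoded) == "a\x00b")

# A strict UTF-8 decoder rejects the overlong form, which is why this
# counts as a violation of the spec.
try:
    b"\xc0\x80".decode("utf-8")
except UnicodeDecodeError as exc:
    print("strict decoder refuses it:", exc.reason)
```

The point of the variation is that the encoded bytes never contain 0x00, so
a C-style null terminator can follow the data without ambiguity.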