On Sat, Feb 1, 2014 at 4:46 PM, Terry Reedy <tjre...@udel.edu> wrote:
> On 1/31/2014 10:36 PM, Chris Angelico wrote:
>>
>> On Sat, Feb 1, 2014 at 1:54 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
>>>
>>> I think that some years ago I heard about a variation on UTF-8
>>> (Microsoft?) where codepoint U+0000 is encoded as 0xC0 0x80 so that the
>>> null byte can be used as the string terminator.
>>>
>>> I had a look on Wikipedia found this:
>>>
>>> http://en.wikipedia.org/wiki/Null-terminated_string
>>
>>
>> Yeah, it's a common abuse of UTF-8. It's a violation of spec, but an
>> understandable one. However, I don't understand why the first part -
>> why should \0 become U+0000 but (presumably) the \a later on
>> (...cs\accel...) doesn't become U+0007, etc?
>
>
> Because only \0 has a special meaning in a C string, and Tk is written in C
> and uses C strings.

Eh? I've used \a in C programs (not often, but I have used it). It's
possible that \0 is the only one that actually bombs anything (because
of the C0 80 representation). But since \7 and \a both represent 0x07 in
a C string, I would expect there to be other problems if it's
interpreting it as source. Ah well! Weird weird.

ChrisA
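
P.S. For anyone following along, here is a minimal Python sketch of the C0 80 trick discussed above. The function names are made up for illustration, and it only covers the NUL substitution. I am not claiming this is how Tk does it internally. It also shows that \a and \7 are the same character, which is why I'd expect them to be treated alike.

```python
def encode_nul_as_c080(text: str) -> bytes:
    """Encode as UTF-8, but write U+0000 as the overlong pair C0 80.

    Hypothetical helper: in standard UTF-8 the byte 0x00 only ever
    encodes U+0000, so a plain byte replacement is sufficient.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_nul_from_c080(data: bytes) -> str:
    """Reverse the substitution, then decode as ordinary UTF-8.

    0xC0 never appears in valid UTF-8, so the pair is unambiguous.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


sample = "abc\0def\a"
encoded = encode_nul_as_c080(sample)

# No zero byte remains, so a C string terminator cannot be confused
# with an embedded U+0000.
assert b"\x00" not in encoded
assert decode_nul_from_c080(encoded) == sample

# A strict UTF-8 decoder rejects the overlong form, which is why the
# trick counts as a violation of spec.
try:
    b"\xc0\x80".decode("utf-8")
    strict_decoder_accepts = True
except UnicodeDecodeError:
    strict_decoder_accepts = False

# \a and \7 are two spellings of the same character, U+0007, and it
# passes through the encoder as the single byte 0x07.
same_escape = "\a" == "\7" == "\x07"
```

Only the zero byte needs the special case, because it is the only byte value that terminates a C string. 0x07 is just another byte as far as strlen() is concerned.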