On Sun, 26 Aug 2012 09:40:13 -0600, Ian Kelly wrote:

> I think the documentation for those functions is simply badly worded.
> The "width in bytes" it returns is not the width of the rune (which as
> jmf notes is simply an alias for int32 that stores a single code point).

Is this documented somewhere? I can't tell you how long I spent unsuccessfully googling for variations on "go language runes", which unsurprisingly mostly came back with pages about Germanic runes and elf runes but not Go runes. I read the golang FAQs, which mentioned Unicode *once* and runes not at all. Obviously Go language programmers don't care much about Unicode.

> It means the UTF-8 width of the character, i.e. the number of UTF-8
> bytes the function "consumed", presumably so that the caller can then
> reslice the data with that many bytes fewer.

That makes sense, given the lousy string implementation and API they're working with.

I note that not all 32-bit ints are valid code points. I suppose I can see sense in having rune be a 32-bit integer value limited to those valid code points. (But, dammit, why not call it a code point?) But if rune is merely an alias for int32, why not just call it int32?

-- 
Steven
-- 
http://mail.python.org/mailman/listinfo/python-list
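
P.S. For anyone following along, here is a minimal sketch in Python 3 of the behaviour Ian describes: decode one code point from the front of a UTF-8 byte string, return the code point together with the number of bytes consumed, and let the caller reslice by that width. The names decode_rune, runes and is_unicode_scalar are hypothetical helpers invented for illustration. They are not Go's API, and nothing here is taken from Go's source; it only assumes the semantics stated in the quoted text.

```python
def is_unicode_scalar(n):
    """True if the integer n is in the Unicode range and is not a surrogate.

    Illustrates that most 32-bit integers are not valid code points.
    """
    return 0 <= n <= 0x10FFFF and not (0xD800 <= n <= 0xDFFF)


def decode_rune(data):
    """Decode the first code point of a UTF-8 bytes object.

    Returns (code_point, width), where width is the number of bytes
    consumed. Hypothetical helper, modelled only on the description
    in the quoted text.
    """
    if not data:
        raise ValueError("empty input")
    first = data[0]
    # The lead byte alone determines how many bytes the sequence uses.
    if first < 0x80:
        width = 1
    elif first >> 5 == 0b110:
        width = 2
    elif first >> 4 == 0b1110:
        width = 3
    elif first >> 3 == 0b11110:
        width = 4
    else:
        raise ValueError("invalid UTF-8 lead byte")
    # Let the built-in codec validate the continuation bytes; it raises
    # UnicodeDecodeError (a ValueError) on truncated or malformed input.
    ch = data[:width].decode("utf-8")
    return ord(ch), width


def runes(data):
    """Walk a UTF-8 bytes object, yielding (code_point, width) pairs."""
    out = []
    while data:
        cp, width = decode_rune(data)
        out.append((cp, width))
        data = data[width:]  # reslice with that many bytes fewer
    return out
```

With these definitions, runes("a\u20ac".encode("utf-8")) gives [(97, 1), (8364, 3)]: the code point is a plain integer in both cases, and the width is a property of its UTF-8 encoding, not of the integer that stores it.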