On Tue, Feb 28, 2017 at 9:05 AM, Fraser Hanson <fraser.han...@gmail.com> wrote:
> https://play.golang.org/p/05wZM9BhfB
>
> I'm working on some code that reads UTF32 and converts it to Go strings.
> I'm finding some surprising behavior when casting slices of runes to
> strings.
>
> runes := []rune{'©'}
> fmt.Printf(" cast to string: (%s)\n", string(runes))
> fmt.Printf("bytes in string: (%x)\n", string(runes))
>
> Output:
>
> cast to string: (©)
> bytes in string: (c2a9) // <-- where's the C2 byte coming from??
>
> The weird part is that casting the rune slice to a string causes it to pick
> up an additional leading character.
>
> runes 0x00-0x7f get nothing prepended.
> runes 0x80-0xbf get a leading c2 byte as seen above.
> runes 0xc0-0xff get a leading c3 byte.
> rune 0x100 gets a leading c4 byte. Seems like a pattern here.
>
> The same thing happens if I add the runes into a bytes.Buffer with
> WriteRune(), then print it out with bytes.Buffer.String().
>
> Can anyone explain this?
> What's the correct way to convert a slice of runes into a string?

When you convert []rune to string, the runes are encoded into UTF-8 and the resulting bytes are the contents of the string. That is what you are seeing. I don't know what you expect to see.

Ian
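
As a rough illustration of the UTF-8 encoding described in the reply, here is a minimal sketch. Code points from U+0080 to U+07FF take two bytes in UTF-8, laid out as 110xxxxx 10xxxxxx. For U+0080 to U+00BF the second byte happens to equal the code point value, which is why the result looks like the original byte with c2 "prepended". The helper name utf8Hex is made up for this example and is not from the original message.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// utf8Hex is a hypothetical helper: it returns the hex form of the
// UTF-8 bytes that string(runes) produces.
func utf8Hex(runes []rune) string {
	return fmt.Sprintf("%x", string(runes))
}

func main() {
	runes := []rune{'©'} // U+00A9

	// The conversion encodes each rune as UTF-8.
	s := string(runes)

	fmt.Printf("code point: %U\n", runes[0])      // U+00A9
	fmt.Printf("utf-8 hex:  %s\n", utf8Hex(runes)) // c2a9
	fmt.Printf("bytes: %d, runes: %d\n", len(s), utf8.RuneCountInString(s))

	// Converting back to []rune decodes the UTF-8 and recovers
	// the original code point, so nothing was actually added.
	back := []rune(s)
	fmt.Println(back[0] == runes[0]) // true
}
```

So string(runes) is already the correct conversion: the string holds one character, stored as two bytes. The extra byte only shows up when the string is inspected byte by byte, for example with %x or len.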