On Monday, 22 August 2016 11:51:59 UTC-4, JC wrote:
>
> Given:
>
> func main() {
>     r := 'a'
>     s := "a"
>     fmt.Println(r)
>     fmt.Println(s)
> }
>
> My very possibly incorrect understanding is that the rune r holds a
> Unicode code-point encoded with UTF-8 and stored as integer value, in this
> case 97, as type int32.
>

The rune r holds the Unicode code point 97 directly, without an encoding. An encoding is a mapping from code points to byte sequences. Since there are no bytes here, there is no encoding. A rune is just a number.

> s is of type string, and contains a code-point encoded by UTF-8, but
> which will be stored in a slice of type byte (which will use just one
> uint8 in this case). Assuming that is all correct, then my questions are:
>

A string is an immutable finite sequence of bytes. This string contains the UTF-8 encoding of the single code point 'a', and that encoding consists of the single byte 97. Code points less than 128 are encoded using a single byte, and are thus compatible with ASCII.

> 1. Will the rune / any rune always use 32 bits?
>

Yes.

> 2. Why does printing r output the integer value, but printing s yield the
> code-point itself?
>

Println formats all numbers in decimal by default, and rune is simply an alias for int32. Println prints a string by copying its bytes to stdout. In this case, it's just the single byte 97. The terminal prints this byte as 'a'. For more complex code points (values > 127), the UTF-8-encoded string would contain multiple bytes, and the terminal would decode these back into a single code point and display the appropriate glyph.
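
To make the distinction concrete, here is a small sketch that extends the program from the question. The helper names encodedBytes and describe are made up for illustration and are not part of any library; the only standard-library calls used are fmt.Println, fmt.Printf, and utf8.RuneLen. It prints a rune as a number and as a character, and shows the bytes a string holds for a one-byte and a two-byte code point.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodedBytes (hypothetical helper) returns the bytes stored in a string.
// For a string built from valid code points these are its UTF-8 encoding.
func encodedBytes(s string) []byte {
	return []byte(s)
}

// describe (hypothetical helper) reports a rune's numeric value and the
// number of bytes its UTF-8 encoding would need.
func describe(r rune) (value int32, encodedLen int) {
	return int32(r), utf8.RuneLen(r)
}

func main() {
	r := 'a'
	s := "a"

	fmt.Println(r)         // a rune is an int32, so this prints 97
	fmt.Printf("%c\n", r)  // %c prints the character for the code point: a
	fmt.Println(string(r)) // converting a rune to a string UTF-8-encodes it: a

	// The string holds exactly one byte, 97.
	fmt.Println(s, len(s), encodedBytes(s)) // a 1 [97]

	// A code point above 127 still fits in one rune, but needs
	// more than one byte once it is encoded into a string.
	e := '\u00e9' // LATIN SMALL LETTER E WITH ACUTE
	v, n := describe(e)
	fmt.Println(v, n)                    // 233 2
	fmt.Println(encodedBytes(string(e))) // [195 169]
}
```

If you want Println-style output that shows the character rather than the number, converting with string(r) or using the %c verb, as above, is probably the simplest route.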