I've been going over the spec to clarify the finer points of how string and []byte behave, and I think there may be an unnecessary degree of freedom that could be removed. Either that, or I missed a load-bearing statement that constrains implementations.
In https://go.dev/ref/spec#Conversions, `[]rune(str)` is specified as: "Converting a value of a string type to a slice of runes type yields a slice containing the individual Unicode code points of the string." This does not specify the behavior if the string contains invalid UTF-8 byte sequences. If my reading is correct, a compliant implementation would be free to panic() on such a conversion, or to implement the conversion in an arbitrary way of its choosing.

This is in contrast to for...range over a string, which strictly specifies how invalid UTF-8 byte sequences are handled. https://go.dev/ref/spec#For_statements says: "For a string value [...] If the iteration encounters an invalid UTF-8 sequence, the second value will be `0xFFFD`, the Unicode replacement character, and the next iteration will advance a single byte in the string." This is in line with current Unicode recommendations for input processing, and (IMO) is the only reasonable thing to do when decoding invalid UTF-8.

Empirically, the reference Go compiler does the sensible thing: string to []rune conversions behave consistently with the ranged-for behavior. I haven't checked, but I presume that gccgo et al. do the same: they must implement the ranged-for behavior anyway, so doing something different for []rune conversion would be more work just to introduce gratuitously surprising behavior. But, unless I missed a clarification in the spec, a contrarian implementation _could_ implement novel behavior for []rune conversion of invalid UTF-8.

Did I miss anything? If not, I'll file a proposal to spell out the required behavior in the spec, since I don't think there are any compatibility concerns or reasonable arguments for allowing []rune conversion alone to behave strangely in this respect.

- Dave
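
P.S. For concreteness, here is a minimal sketch of the comparison described above. The helper name `rangeRunes` and the sample string are made up for illustration. It only shows what a given implementation happens to do; it says nothing about what the spec requires of `[]rune(str)`.

```go
package main

import "fmt"

// rangeRunes (hypothetical helper) collects the runes produced by a
// for...range loop over s. The spec pins this behavior down, including
// for invalid UTF-8.
func rangeRunes(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	// "a", a stray continuation byte (0x80), a truncated three-byte
	// sequence (0xE2 0x82 with no third byte), then "b".
	s := "a\x80\xe2\x82b"

	viaRange := rangeRunes(s)
	viaConv := []rune(s) // behavior on invalid input is the open question

	fmt.Printf("range:      %U\n", viaRange)
	fmt.Printf("conversion: %U\n", viaConv)

	// Compare the two element by element.
	same := len(viaRange) == len(viaConv)
	for i := 0; same && i < len(viaRange); i++ {
		same = viaRange[i] == viaConv[i]
	}
	fmt.Println("identical:", same)
}
```

With the reference compiler, both lines should show one U+FFFD per invalid byte (three in this input), and the final line should report that the slices are identical. A different implementation printing something else for the conversion would, on my reading, still be compliant.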