On Sun, Oct 22, 2017, at 09:29, Juliusz Chroboczek wrote:
> I'm probably missing something obvious, but I've looked through the
> standard library to no avail. How do I sanitise a []byte to make sure
> it's a UTF-8 string by replacing all incorrect sequences by the
> replacement character (or what
Converting a string to a slice of runes gives you the individual code points,
with the replacement character as necessary. Converting a slice of runes into a
string gives you the UTF-8 representation. So sanitising a string should be
as simple as string([]rune(someString)). This will be O(n)
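
To make the round trip above concrete, here is a minimal sketch for the []byte case from the original question. The helper name sanitize is made up for illustration and is not part of the standard library; the utf8.Valid fast path is an optional extra, not something the reply above proposes.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// sanitize is a hypothetical helper, not a standard library function.
// Valid input is returned unchanged. Otherwise the bytes are round-tripped
// through []rune, which turns each invalid byte into U+FFFD.
func sanitize(b []byte) []byte {
	if utf8.Valid(b) {
		return b
	}
	return []byte(string([]rune(string(b))))
}

func main() {
	in := []byte("a\xffb\xc0\xafc") // three bytes here are not valid UTF-8
	out := sanitize(in)
	fmt.Printf("%q\n", out)
	fmt.Println(utf8.Valid(out)) // true
}
```

Note that every invalid byte gets its own replacement character, so a run of bad bytes produces a run of U+FFFD rather than a single one.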
See the section "For statements with range clause" in the spec:
https://golang.org/ref/spec#For_statements
"For a string value, the "range" clause iterates over the Unicode code
points in the string starting at byte index 0. On successive
iterations, the index value will be the index of the first
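
A small sketch of the range behaviour the spec describes, assuming a Go version that has strings.Builder. The function name sanitizeRange is hypothetical. For a byte that is not part of a valid encoding, range yields U+FFFD and advances by one byte, so writing each rune back out gives valid UTF-8.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeRange is a hypothetical helper that rebuilds s one code point
// at a time. Invalid bytes arrive from range as U+FFFD.
func sanitizeRange(s string) string {
	var b strings.Builder
	b.Grow(len(s))
	for _, r := range s {
		b.WriteRune(r)
	}
	return b.String()
}

func main() {
	s := "a\xffb"
	// i is the byte index at which each code point starts.
	for i, r := range s {
		fmt.Printf("%d %U\n", i, r)
	}
	// Prints:
	// 0 U+0061
	// 1 U+FFFD
	// 2 U+0062
	fmt.Println(sanitizeRange(s) == "a\uFFFDb") // true
}
```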
I'm probably missing something obvious, but I've looked through the
standard library to no avail. How do I sanitise a []byte to make sure
it's a UTF-8 string by replacing all incorrect sequences by the
replacement character (or whatever)?
I've found unicode/utf8.Valid, which tells me if a []byte