Schanzenbach, Martin wrote on Thu 10-02-2022 at 22:34 [+0000]:
> While I understand the problem, GNS defines strings to be UTF-8
> (notwithstanding punycode exceptions).
> You can't have UTF-8 strings with a zero terminator without having it
> mean exactly that: A string termination.
>
> Yes, you can say "but what if it is not a UTF-8 string", but that is
> not really the problem of the GNS spec.
> It normatively defines it as such and the implementation must comply
> (with UTF-8).
> See also https://en.wikipedia.org/wiki/Null-terminated_string section
> in "Character encoding".

I thought that UTF-8 supports encoding \0 characters. For example, Guile
silently encodes \0 and decodes it again:

  $ ((@ (rnrs bytevectors) utf8->string)
      ((@ (rnrs bytevectors) string->utf8) "foo\x00bar"))
  > "foo\x00bar"

and Guile claims it is UTF-8:

    Return a newly allocated bytevector that contains the UTF-8, [...]
    or UTF-32 [...] encoding of STR.  For UTF-16 [...].

I guess I'll have to submit documentation patches to Guile and perhaps
even the RnRS.

Greetings,
Maxime.
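
For illustration only, here is a minimal sketch in Python (not Guile, and not taken from the GNS spec or any GNUnet code) that puts the two views from this thread side by side. It round-trips the same "foo\x00bar" string through UTF-8, then shows what a consumer that treats the first zero byte as a terminator would see. The name `c_view` is hypothetical and only mimics C-style string handling.

```python
# Standard UTF-8 encodes the code point U+0000 as the single byte 0x00.
text = "foo\x00bar"
encoded = text.encode("utf-8")      # 7 bytes, the 4th one is 0x00
decoded = encoded.decode("utf-8")   # round-trips without loss

# Hypothetical C-style view: stop at the first zero byte,
# as a NUL-terminated string API would.
c_view = encoded.split(b"\x00", 1)[0]

print(list(encoded))
print(decoded == text)
print(c_view)
```

So the encoding itself permits an embedded \0; the truncation only appears once the bytes are handled as a zero-terminated string.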