On Thu, Oct 3, 2019 at 11:11 AM Poussier William <william.pouss...@gmail.com> wrote:
> Hello
>
> The encoding/json package escapes 0xA (line feed), 0xD (carriage return)
> and 0x9 (horizontal tab) using the escape character '\'. However, when it
> comes to 0x8 (backspace) and 0xC (form feed), it uses the Unicode escape
> sequence of the form '\uXXXX'.
>
> Reproducer: https://play.golang.org/p/jihv9sZUjvY
>
> I can't really grasp the reason behind this difference for characters
> below 0x20. Even though the output is perfectly valid JSON, I expected to
> see \f and \b.
>
> Does anyone know the reason, if there is one, that led to this?

It looks like only a few of the special two-character escapes from RFC 8259 section 7 are supported by the encoder:
https://github.com/golang/go/blob/go1.13.1/src/encoding/json/encode.go#L975-L994

Digging around the CLs linked from the blame entries for that code block, I found this comment from rsc@ <https://codereview.appspot.com/4678046#msg4> on the CL that added handling for \r and \n:

> \r and \n is good.
> let's leave \b and \f out.
> no one cares about \f
> and more people know \b as
> word boundary than as backspace.

Note that using the two-character substitutions is optional according to the RFC. The relevant section is <https://tools.ietf.org/html/rfc8259#section-7>:

> Alternatively, there are two-character sequence escape
> representations of some popular characters. So, for example, a
> string containing only a single reverse solidus character *may be*
> represented more compactly as "\\".
>
> To escape an extended character that is not in the Basic Multilingual
> Plane, the character is represented as a 12-character sequence,
> encoding the UTF-16 surrogate pair. So, for example, a string
> containing only the G clef character (U+1D11E) may be represented as
> "\uD834\uDD1E".
>
> string = quotation-mark *char quotation-mark
>
> char = unescaped /
>     escape (
>         %x22 /          ; "    quotation mark  U+0022
>         %x5C /          ; \    reverse solidus U+005C
>         %x2F /          ; /    solidus         U+002F
>         %x62 /          ; b    backspace       U+0008
>         %x66 /          ; f    form feed       U+000C
>         %x6E /          ; n    line feed       U+000A
>         %x72 /          ; r    carriage return U+000D
>         %x74 /          ; t    tab             U+0009
>         %x75 4HEXDIG )  ; uXXXX                U+XXXX
>
> escape = %x5C              ; \
>
> quotation-mark = %x22      ; "
>
> unescaped = %x20-21 / %x23-5B / %x5D-10FFFF

On the other hand, it looks like the full complement of escapes is supported on the decoding side:
https://github.com/golang/go/blob/b17fd8e49d24eb298c53de5cd0a8923f1e0270ba/src/encoding/json/decode.go#L1284-L1316

> Thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "golang-nuts" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to golang-nuts+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/golang-nuts/f3c65b8c-c612-4b75-852a-fda7b246a77e%40googlegroups.com
> <https://groups.google.com/d/msgid/golang-nuts/f3c65b8c-c612-4b75-852a-fda7b246a77e%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
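
For anyone who wants to check the encode/decode asymmetry locally, here is a minimal sketch. The helper names decode and roundTrip are made up for illustration and are not part of encoding/json. The exact escapes that Marshal prints for 0x8 and 0xC depend on the Go release (the thread reports the \uXXXX form for go1.13.1), so the program prints the encoder output rather than assuming it.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decode unmarshals a JSON string literal into a Go string.
// (Hypothetical helper, for illustration only.)
func decode(lit string) (string, error) {
	var s string
	err := json.Unmarshal([]byte(lit), &s)
	return s, err
}

// roundTrip marshals s to JSON and decodes the result again.
// (Hypothetical helper, for illustration only.)
func roundTrip(s string) (string, error) {
	enc, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return decode(string(enc))
}

func main() {
	raw := "\b\f\n\r\t" // 0x8, 0xC, 0xA, 0xD, 0x9

	// Show what this Go release emits for the five control characters.
	enc, err := json.Marshal(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("encoded: %s\n", enc)

	// The decoder accepts the two-character escapes...
	short, err := decode(`"\b\f\n\r\t"`)
	if err != nil {
		panic(err)
	}
	// ...as well as the \uXXXX form for the same characters.
	long, err := decode(`"\u0008\u000c\u000a\u000d\u0009"`)
	if err != nil {
		panic(err)
	}
	fmt.Println("two-character form decodes to raw:", short == raw)
	fmt.Println("\\uXXXX form decodes to raw:", long == raw)

	// Whatever the encoder chose, decoding its output restores the input.
	back, err := roundTrip(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("round trip preserved:", back == raw)
}
```

Both spellings decode to the same Go string, which is why the encoder's choice only affects how the output reads and not whether it round-trips.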