On Sat, Jan 31, 2015 at 8:25 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Robert Haas <robertmh...@gmail.com> writes:
>> I understand Andrew to be saying that if you take a 6-character string
>> and convert it to a JSON string and then back to text, you will
>> *usually* get back the same 6 characters you started with ... unless
>> the first character was \, the second u, and the remainder hexadecimal
>> digits.  Then you'll get back a one-character string or an error
>> instead.  It's not hard to imagine that leading to surprising
>> behavior, or even security vulnerabilities in applications that aren't
>> expecting such a translation to happen under them.
>
> That *was* the case, with the now-reverted patch that changed the escaping
> rules.  It's not anymore:
>
> regression=# select to_json('\u1234'::text);
>   to_json
> -----------
>  "\\u1234"
> (1 row)
>
> When you convert that back to text, you'll get \u1234, no more and no
> less.  For example:
>
> regression=# select array_to_json(array['\u1234'::text]);
>  array_to_json
> ---------------
>  ["\\u1234"]
> (1 row)
>
> regression=# select array_to_json(array['\u1234'::text])->0;
>  ?column?
> -----------
>  "\\u1234"
> (1 row)
>
> regression=# select array_to_json(array['\u1234'::text])->>0;
>  ?column?
> ----------
>  \u1234
> (1 row)
>
> Now, if you put in '"\u1234"'::jsonb and extract that string as text,
> you get some Unicode character or other.  But I'd say that a JSON user
> who is surprised by that doesn't understand JSON, and definitely that they
> hadn't read more than about one paragraph of our description of the JSON
> types.

Totally agree.  That's why I think reverting the patch was the right
thing to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
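
For readers without a PostgreSQL build at hand, the distinction drawn in the
quoted message can be sketched at the level of JSON itself.  The snippet below
is only an illustration under stated assumptions: it uses Python's standard
json module as a stand-in, not PostgreSQL's to_json or jsonb code, and the
variable names are made up for the example.  It shows that encoding the
six-character text \u1234 as a JSON string escapes the backslash, so decoding
returns the same six characters, whereas a JSON literal that already contains
the escape \u1234 decodes to a single character.

```python
import json

# Six-character text: a backslash, the letter u, and four hex digits.
original = "\\u1234"
assert len(original) == 6

# Encoding text as a JSON string escapes the backslash, which is the same
# shape as the "\\u1234" shown in the quoted to_json() output.
encoded = json.dumps(original)

# Decoding that JSON string gives back the original six characters.
decoded = json.loads(encoded)

# By contrast, a JSON literal that itself contains the escape sequence
# (the analogue of '"\u1234"'::jsonb) decodes to one Unicode character.
literal = '"\\u1234"'  # the 8 characters: " \ u 1 2 3 4 "
single_char = json.loads(literal)

print(encoded)           # JSON text with a doubled backslash
print(decoded)           # the original six-character text
print(len(single_char))  # length of the decoded escape
```

The two cases differ only in who wrote the backslash: text handed to the
encoder is data and gets escaped, while a backslash inside a JSON literal is
syntax and gets interpreted.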