Robert Haas <robertmh...@gmail.com> writes:
> On Thu, Jan 29, 2015 at 4:33 PM, Andrew Dunstan <and...@dunslane.net> wrote:
>> I'm coming down more and more on the side of Tom's suggestion just to ban
>> \u0000 in jsonb.
> I have yet to understand what we fix by banning \u0000.  How is 0000
> different from any other four-digit hexadecimal number that's not a
> valid character in the current encoding?  What does banning that one
> particular value do?

As Andrew pointed out upthread, it avoids having to answer the question
of what to return for

	select (jsonb '["foo\u0000bar"]')->>0;

or any other construct which is supposed to return an *unescaped* text
representation of some JSON string value.  Right now you get

   ?column?
--------------
 foo\u0000bar
(1 row)

which is wrong IMO, first because it violates the premise that the
output should be unescaped, and second because this output cannot be
distinguished from the (correct) output of

regression=# select (jsonb '["foo\\u0000bar"]')->>0;
   ?column?
--------------
 foo\u0000bar
(1 row)

There is no way to deliver an output that is not confusable with some
other value's correct output, other than by emitting a genuine \0 byte,
which unfortunately we cannot support in a TEXT result.

Potential solutions for this have been mooted upthread, but none of them
look like they're something we can do in the very short run.  So the
proposal is to ban \u0000 until such time as we can do something sane
with it.

> In any case, whatever we do about that issue, the idea that the text
> -> json string transformation can *change the input string into some
> other string* seems like an independent problem.

No, it's exactly the same problem, because the reason for that breakage
is an ill-advised attempt to make it safe to include \u0000 in JSONB.

			regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
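[Editor's note: the ambiguity described above can be reproduced outside PostgreSQL. The following Python sketch (not part of the original message) decodes the two JSON strings from the examples and shows that, once the NUL character has to be re-escaped on output, the two distinct values become indistinguishable. The helper name `escape_nul` is hypothetical, chosen only to mimic what a NUL-free text type forces an implementation to do.]

```python
import json

# JSON string containing a real escaped NUL: decodes to 'foo\x00bar'.
s1 = json.loads(r'"foo\u0000bar"')

# JSON string containing the six literal characters \u0000 (the
# backslash itself is escaped): decodes to 'foo\u0000bar' verbatim.
s2 = json.loads(r'"foo\\u0000bar"')

# The decoded values are genuinely different ...
assert s1 != s2

# ... but a text type that cannot hold a NUL byte forces the NUL to be
# re-escaped on output, and then both values render identically:
def escape_nul(s):
    """Hypothetical helper: replace NUL bytes with their JSON escape."""
    return s.replace('\x00', r'\u0000')

print(escape_nul(s1))  # foo\u0000bar
print(escape_nul(s2))  # foo\u0000bar
assert escape_nul(s1) == escape_nul(s2)
```

This is exactly the confusion the email describes: the ->> output for the two inputs collides, so no unescaped TEXT result can faithfully round-trip a string containing \u0000.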