Andrew Dunstan <and...@dunslane.net> writes:
> On 01/27/2015 12:23 PM, Tom Lane wrote:
>> I think coding anything is premature until we decide how we're going to
>> deal with the fundamental ambiguity.
> The input \\uabcd will be stored correctly as \uabcd, but this will in
> turn be rendered as \uabcd, whereas it should be rendered as \\uabcd.
> That's what the patch fixes.
> There are two problems here and this addresses one of them. The other
> problem is the ambiguity regarding \\u0000 and \u0000.

It's the same problem really, and until we have an answer about what
to do with \u0000, I think any patch is half-baked and possibly
counterproductive.

In particular, I would like to suggest that the current representation of
\u0000 is fundamentally broken and that we have to change it, not try to
band-aid around it. This will mean an on-disk incompatibility for jsonb
data containing U+0000, but hopefully there is very little of that out
there yet. If we can get a fix into 9.4.1, I think it's reasonable to
consider such solutions.

The most obvious way to store such data unambiguously is to just go ahead
and store U+0000 as a NUL byte (\000). The only problem with that is that
then such a string cannot be considered to be a valid value of type TEXT,
which would mean that we'd need to throw an error if we were asked to
convert a JSON field containing such a character to text. I don't
particularly have a problem with that, except possibly for the time cost
of checking for \000 before allowing a conversion to occur. While a
memchr() check might be cheap enough, we could also consider inventing a
new JEntry type code for string-containing-null, so that there's a
distinction in the type system between strings that are coercible to text
and those that are not.

If we went down a path like that, the currently proposed patch would be
quite useless.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
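
For illustration only, the memchr() check mentioned above could look roughly like the following. This is a minimal sketch, not PostgreSQL source: the function name string_is_text_coercible and its signature are made up for the example, and it assumes the string is stored as a length-counted byte buffer in which U+0000 appears as a literal NUL byte.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not PostgreSQL source: returns true if the
 * len-byte string at buf contains no NUL byte, i.e. it could be
 * handed over as a TEXT value without raising an error.
 */
static bool
string_is_text_coercible(const char *buf, size_t len)
{
    if (len == 0)
        return true;            /* empty string: nothing to scan */

    assert(buf != NULL);

    /* memchr scans at most len bytes and returns NULL if no match */
    return memchr(buf, '\0', len) == NULL;
}
```

A caller would run this once per string before conversion and throw an error when it returns false. The alternative of a separate JEntry type code would move that decision to the time the value is stored, so that no scan is needed at conversion time.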