On Sun, Mar 28, 2010 at 7:36 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> Andrew Dunstan <and...@dunslane.net> writes:
>> Here's another thought. Given that JSON is actually specified to consist
>> of a string of Unicode characters, what will we deliver to the client
>> where the client encoding is, say Latin1? Will it actually be a legal
>> JSON byte stream?
>
> No, it won't.  We will *not* be sending anything but latin1 in such a
> situation, and I really couldn't care less what the JSON spec says about
> it.  Delivering wrongly-encoded data to a client is a good recipe for
> all sorts of problems, since the client-side code is very unlikely to be
> expecting that.  A datatype doesn't get to make up its own mind whether
> to obey those rules.  Likewise, data on input had better match
> client_encoding, because it's otherwise going to fail the encoding
> checks long before a json datatype could have any say in the matter.
>
> While I've not read the spec, I wonder exactly what "consist of a string
> of Unicode characters" should actually be taken to mean.  Perhaps it
> only means that all the characters must be members of the Unicode set,
> not that the string can never be represented in any other encoding.
> There's more than one Unicode encoding anyway...

See sections 2.5 and 3 of:

http://www.ietf.org/rfc/rfc4627.txt?number=4627
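
As a rough illustration of why those sections matter here (a sketch only,
not taken from the RFC text or from any proposed patch): section 2.5 allows
any character in a string to be written as a \uXXXX escape, and section 3
makes UTF-8 the default encoding. A JSON text that escapes everything
outside ASCII is therefore the same byte sequence in Latin1 and in UTF-8.
The snippet assumes Python's standard json module; to_ascii_json is a
hypothetical helper name.

```python
import json

def to_ascii_json(value):
    # ensure_ascii=True (the default) writes every non-ASCII character
    # as a \uXXXX escape, as permitted by RFC 4627 section 2.5.
    return json.dumps(value, ensure_ascii=True)

text = to_ascii_json({"name": "café"})

# The escaped text is pure ASCII, so encoding it as Latin1 or as UTF-8
# yields identical bytes.
latin1_bytes = text.encode("latin-1")
utf8_bytes = text.encode("utf-8")
assert latin1_bytes == utf8_bytes

# Decoding on the client side recovers the original value.
assert json.loads(latin1_bytes.decode("latin-1")) == {"name": "café"}
```

Whether a server should emit such escapes, or simply transcode the text to
client_encoding and fail on characters that have no Latin1 equivalent, is
a separate design question that this sketch does not settle.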

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
