On Mon, Jul 18, 2011 at 7:36 PM, Florian Pflug <f...@phlo.org> wrote:
> On Jul 19, 2011, at 00:17, Joey Adams wrote:
>> I suppose a simple solution would be to convert all escapes and
>> outright ban escapes of characters not in the database encoding.
>
> +1. Making JSON work like TEXT when it comes to encoding issues
> makes this all much simpler conceptually. It also avoids all kinds
> of weird issues if you extract textual values from a JSON document
> server-side.
Thanks for the input. I'm leaning in this direction too.

However, it will be a tad tricky to implement the conversions efficiently, since the wchar API doesn't provide a fast path for converting individual codepoints (that I'm aware of), and pg_do_encoding_conversion doesn't look like a good function to call many times.

My plan is to scan for escapes of non-ASCII characters, convert them to UTF-8, and collect them in a comma-delimited string like "a,b,c,d,". Then, I'll convert the resulting string to the server encoding in a single call (which may fail, indicating that some codepoint(s) are not present in the database encoding). After that, I'll read the converted string back and place each character where it belongs.

It's "clever", but I can't think of a better way to do it with the existing API.

- Joey

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers