Re: [HACKERS] xml type and encodings

2007-01-16 Martijn van Oosterhout
On Tue, Jan 16, 2007 at 06:41:56PM +0100, Florian G. Pflug wrote: > If you do that, maybe it would be the easiest and least confusing thing > to just _always_ represent an xml document in utf-8, ignoring the > client_encoding > entirely for xml. You can't do that. The server needs to parse the i

Re: [HACKERS] xml type and encodings

2007-01-16 Florian G. Pflug
Peter Eisentraut wrote: I wrote: We need to decide on how to handle encoding information embedded in xml data that is passed through the client/server encoding conversion. Tangentially related, I'm currently experimenting with a setup that stores all xml data in UTF-8 on the server, convertin

Re: [HACKERS] xml type and encodings

2007-01-15 Tom Lane
Peter Eisentraut writes: > Andrew Dunstan wrote: >> Are we going to ensure that what we hand back to another client has >> an appropriate encoding decl? Or will we just remove it in all cases? > We can't do the former, but the latter might be doable. I think that in the case of

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
I wrote: > We need to decide on how to handle encoding information embedded in > xml data that is passed through the client/server encoding > conversion. Tangentially related, I'm currently experimenting with a setup that stores all xml data in UTF-8 on the server, converting it back to the serv

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
Andrew Dunstan wrote: > We should error out on any explicit encoding that conflicts with the > client encoding. I don't like the idea of just ignoring an explicit > encoding decl. That is an instance of the problem of figuring out which encoding names are equivalent, which I believe we have settl

Re: [HACKERS] xml type and encodings

2007-01-15 Andrew Dunstan
Peter Eisentraut wrote: > Florian G. Pflug wrote: >> Couldn't the server change the encoding declaration inside the xml to >> the correct >> one (the same as client_encoding) before returning the result? > > The data type output function doesn't know what the client encoding is > or whether the dat

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
Florian G. Pflug wrote: > Sorry, I don't get it - how does this work for text, then? It works > there to dynamically recode the data from the database encoding to > the client encoding before sending it off to the client, no? Sure, but it doesn't change the text inside the datum. -- Peter Eisentr

Re: [HACKERS] xml type and encodings

2007-01-15 Florian G. Pflug
Peter Eisentraut wrote: Florian G. Pflug wrote: Couldn't the server change the encoding declaration inside the xml to the correct one (the same as client_encoding) before returning the result? The data type output function doesn't know what the client encoding is or whether the data will be s

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
Florian G. Pflug wrote: > Couldn't the server change the encoding declaration inside the xml to > the correct > one (the same as client_encoding) before returning the result? The data type output function doesn't know what the client encoding is or whether the data will be shipped to the client a

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
Martijn van Oosterhout wrote: > The only real alternative is to treat xml more like bytea than text > (ie, treat the input as a stream of octets). bytea isn't "treated" any different than other data types. You just have to take care in the client that you escape every byte greater than 127. Th

Re: [HACKERS] xml type and encodings

2007-01-15 Florian G. Pflug
Peter Eisentraut wrote: On Monday, 15 January 2007 17:33, Florian G. Pflug wrote: Would this mean that if the client_encoding is for example latin1, and I retrieve an xml document uploaded by a client with client_encoding utf-8 (and thus having encoding="c" in the xml tag), that I would get a d

Re: [HACKERS] xml type and encodings

2007-01-15 Martijn van Oosterhout
On Mon, Jan 15, 2007 at 05:47:37PM +0100, Peter Eisentraut wrote: > On Monday, 15 January 2007 17:33, Florian G. Pflug wrote: > > Would this mean that if the client_encoding is for example latin1, and I > > retrieve an xml document uploaded by a client with client_encoding utf-8 > > (and thus havi

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
On Monday, 15 January 2007 17:33, Florian G. Pflug wrote: > Would this mean that if the client_encoding is for example latin1, and I > retrieve an xml document uploaded by a client with client_encoding utf-8 > (and thus having encoding="c" in the xml tag), that I would get a > document with latin1

Re: [HACKERS] xml type and encodings

2007-01-15 Florian G. Pflug
Peter Eisentraut wrote: On Monday, 15 January 2007 12:42, Nikolay Samokhvalov wrote: On 1/15/07, Peter Eisentraut wrote: Client encoding is A, server encoding is B. Client sends an xml datum that looks like this: INSERT INTO table VALUES (xmlparse(document '<?xml ... encoding="C"?>...')); Assum

Re: [HACKERS] xml type and encodings

2007-01-15 Peter Eisentraut
On Monday, 15 January 2007 12:42, Nikolay Samokhvalov wrote: > On 1/15/07, Peter Eisentraut wrote: > > Client encoding is A, server encoding is B. Client sends an xml datum > > that looks like this: > > > > INSERT INTO table VALUES (xmlparse(document '<?xml ... encoding="C"?>...'));

Re: [HACKERS] xml type and encodings

2007-01-15 Nikolay Samokhvalov
On 1/15/07, Peter Eisentraut wrote: Client encoding is A, server encoding is B. Client sends an xml datum that looks like this: INSERT INTO table VALUES (xmlparse(document '<?xml ... encoding="C"?>...')); Assuming that A, B, and C are all distinct, this could fail at a number of places. I sugges

Re: [HACKERS] xml type and encodings

2007-01-14 Tom Lane
Peter Eisentraut writes: > Looking at the send/receive functions for the text type, they > communicate all data in the server encoding, so it seems reasonable to > do this here as well. Uh, no, I'm pretty sure there's a translation to the client encoding. It's in a subroutine

[HACKERS] xml type and encodings

2007-01-14 Peter Eisentraut
We need to decide on how to handle encoding information embedded in xml data that is passed through the client/server encoding conversion. Here is an example: Client encoding is A, server encoding is B. Client sends an xml datum that looks like this: INSERT INTO table VALUES (xmlparse(documen
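
The A/B/C situation this thread starts from can be reproduced outside the server. What follows is only a rough Python sketch, not PostgreSQL code: the helper declared_encoding is hypothetical, and latin-1, utf-8 and KOI8-R are arbitrary stand-ins for the client encoding A, the server encoding B and the declared encoding C. It shows that an ordinary client-to-server recoding changes the bytes but leaves the embedded declaration alone, so the declaration ends up describing neither.

```python
import re

# Hypothetical helper (not PostgreSQL code): pull the encoding name out of
# an XML declaration at the start of a document, if there is one.
_DECL = re.compile(r'^<\?xml[^>]*?encoding\s*=\s*["\']([^"\']+)["\']')


def declared_encoding(doc):
    match = _DECL.match(doc)
    return match.group(1) if match else None


# Assumed stand-ins for the thread's A, B and C.
CLIENT_ENCODING = "latin-1"  # A
SERVER_ENCODING = "utf-8"    # B
DECLARED = "KOI8-R"          # C

doc = '<?xml version="1.0" encoding="%s"?><a>\u00e9</a>' % DECLARED

# The client sends the document in encoding A ...
wire_bytes = doc.encode(CLIENT_ENCODING)
# ... and a plain recoding to B never looks inside the document.
server_bytes = wire_bytes.decode(CLIENT_ENCODING).encode(SERVER_ENCODING)

# The declaration still names C, which matches neither A nor B.
stale = declared_encoding(server_bytes.decode(SERVER_ENCODING))

# A parser that trusted the declaration would misread the recoded bytes.
misread = server_bytes.decode(stale, errors="replace")
```

Whether the right response is to reject, rewrite or strip the declaration is exactly what the replies above argue about; the sketch only makes the mismatch visible.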