Re: [HACKERS] XML with invalid chars

2011-05-11 Thread Andrew Dunstan
On 05/11/2011 07:00 PM, Noah Misch wrote: On Wed, May 11, 2011 at 06:17:07PM -0400, Andrew Dunstan wrote: On 05/09/2011 11:25 PM, Noah Misch wrote: SELECT xmlcomment(E'\ufffe'); That's a bit harder. Do we want to extend these checks to cover surrogates and end of plane characters, which are

Re: [HACKERS] XML with invalid chars

2011-05-11 Thread Noah Misch
On Wed, May 11, 2011 at 06:17:07PM -0400, Andrew Dunstan wrote: > On 05/09/2011 11:25 PM, Noah Misch wrote: >> SELECT xmlcomment(E'\ufffe'); > > That's a bit harder. Do we want to extend these checks to cover > surrogates and end of plane characters, which are the remaining > forbidden chars? I

Re: [HACKERS] XML with invalid chars

2011-05-11 Thread Andrew Dunstan
On 05/09/2011 11:25 PM, Noah Misch wrote: I see you've gone with doing it unconditionally. I'd lean toward testing the library in pg_xml_init and setting a flag indicating whether we need the extra pass. However, a later patch can always optimize that. I wasn't terribly keen on the idea,

Re: [HACKERS] XML with invalid chars

2011-05-09 Thread Noah Misch
On Sun, May 08, 2011 at 06:25:27PM -0400, Andrew Dunstan wrote: > On 04/27/2011 11:41 PM, Noah Misch wrote: >> On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote: >>> On 04/27/2011 05:30 PM, Noah Misch wrote: To make things worse, the dump/reload problems seems to depend on your >

Re: [HACKERS] XML with invalid chars

2011-05-08 Thread Andrew Dunstan
On 04/27/2011 11:41 PM, Noah Misch wrote: On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote: On 04/27/2011 05:30 PM, Noah Misch wrote: To make things worse, the dump/reload problems seems to depend on your version of libxml2, or something. With git master, a CentOS 5 system with

Re: [HACKERS] XML with invalid chars

2011-04-28 Thread Andrew Dunstan
On 04/27/2011 05:30 PM, Noah Misch wrote: I'm not sure what to do about the back branches and cases where data is already in databases. This is fairly ugly. Suggestions welcome. We could provide a script in (or linked from) the release notes for testing the data in all your xml columns. H

Re: [HACKERS] XML with invalid chars

2011-04-27 Thread Noah Misch
On Wed, Apr 27, 2011 at 11:22:37PM -0400, Andrew Dunstan wrote: > On 04/27/2011 05:30 PM, Noah Misch wrote: >> To make things worse, the dump/reload problems seems to depend on your >> version >> of libxml2, or something. With git master, a CentOS 5 system with >> 2.6.26-2.1.2.8.el5_5.1 accepts t

Re: [HACKERS] XML with invalid chars

2011-04-27 Thread Andrew Dunstan
On 04/27/2011 05:30 PM, Noah Misch wrote: I'm not sure what to do about the back branches and cases where data is already in databases. This is fairly ugly. Suggestions welcome. We could provide a script in (or linked from) the release notes for testing the data in all your xml columns. Ye

Re: [HACKERS] XML with invalid chars

2011-04-27 Thread Noah Misch
On Wed, Apr 27, 2011 at 03:05:30PM -0400, Andrew Dunstan wrote: > On 04/26/2011 05:11 PM, Noah Misch wrote: >> On Mon, Apr 25, 2011 at 07:25:02PM -0400, Andrew Dunstan wrote: >>> I came across this today, while helping a customer. The following will >>> happily create a piece of XML with an embedde

Re: [HACKERS] XML with invalid chars

2011-04-27 Thread Andrew Dunstan
On 04/26/2011 05:11 PM, Noah Misch wrote: On Mon, Apr 25, 2011 at 07:25:02PM -0400, Andrew Dunstan wrote: I came across this today, while helping a customer. The following will happily create a piece of XML with an embedded ^A: select xmlelement(name foo, null, E'abc\x01def'); Now, a ^A

Re: [HACKERS] XML with invalid chars

2011-04-26 Thread Peter Eisentraut
On mån, 2011-04-25 at 19:25 -0400, Andrew Dunstan wrote: > I came across this today, while helping a customer. The following > will > happily create a piece of XML with an embedded ^A: > > select xmlelement(name foo, null, E'abc\x01def'); > > Now, a ^A is totally forbidden in XML version 1.0

Re: [HACKERS] XML with invalid chars

2011-04-26 Thread Noah Misch
On Mon, Apr 25, 2011 at 07:25:02PM -0400, Andrew Dunstan wrote: > I came across this today, while helping a customer. The following will > happily create a piece of XML with an embedded ^A: > >select xmlelement(name foo, null, E'abc\x01def'); > > Now, a ^A is totally forbidden in XML version

[HACKERS] XML with invalid chars

2011-04-25 Thread Andrew Dunstan
I came across this today, while helping a customer. The following will happily create a piece of XML with an embedded ^A: select xmlelement(name foo, null, E'abc\x01def'); Now, a ^A is totally forbidden in XML version 1.0, and allowed but only as "&#x1;" or equivalent in XML version 1.1, and n
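
For readers following the thread, the rule at issue can be sketched as a small check against the XML 1.0 `Char` production (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). This is only an illustration in Python, not the patch discussed on the list; the function names `is_xml10_char` and `first_invalid_xml10_char` are made up for this example.

```python
# Illustrative sketch only: hypothetical helper names, not PostgreSQL code.
# Implements the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]


def is_xml10_char(ch: str) -> bool:
    """Return True if the single character ch may appear in an XML 1.0 document."""
    cp = ord(ch)
    return (
        cp in (0x9, 0xA, 0xD)          # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF        # excludes the other C0 controls such as ^A
        or 0xE000 <= cp <= 0xFFFD      # excludes surrogates, U+FFFE and U+FFFF
        or 0x10000 <= cp <= 0x10FFFF   # supplementary planes
    )


def first_invalid_xml10_char(text: str):
    """Return (index, char) of the first forbidden character, or None if all are valid."""
    for i, ch in enumerate(text):
        if not is_xml10_char(ch):
            return i, ch
    return None


if __name__ == "__main__":
    # The string from the original report: an embedded ^A (U+0001).
    print(first_invalid_xml10_char("abc\x01def"))
    # The later test case from the thread: U+FFFE.
    print(first_invalid_xml10_char("\ufffe"))
```

Under this production, U+0001 and U+FFFE are both rejected, as are lone surrogates, while noncharacters at the end of the supplementary planes (for example U+1FFFE) fall inside the last range and are accepted.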