On Tue, Aug 27, 2024 at 13:57, Jim Jones <jim.jo...@uni-muenster.de> wrote:
>
> On 26.08.24 16:59, Pavel Stehule wrote:
> >
> > 1. what about behaviour of NO INDENT - the implementation is not too
> > old, so it can be changed if we want (I think), and it is better to do
> > early than too late
>
> While checking the feasibility of removing indentation with NO INDENT I
> may have found a bug in XMLSERIALIZE ... INDENT.
> xmlSaveToBuffer seems to ignore elements if there are whitespaces
> between them:
>
> SELECT xmlserialize(DOCUMENT '<foo><bar>42</bar></foo>' AS text INDENT);
>   xmlserialize
> -----------------
>  <foo>          +
>    <bar>42</bar>+
>  </foo>         +
>
> (1 row)
>
> SELECT xmlserialize(DOCUMENT '<foo> <bar>42</bar> </foo>'::xml AS text
> INDENT);
>         xmlserialize
> ----------------------------
>  <foo> <bar>42</bar> </foo>+
>
> (1 row)
>
> I'll take a look at it.
>

+1

> Regarding removing indentation: yes, it would be possible with libxml2.
> The question is if it would be right to do so.
> > 2. Are we able to implement SQL/XML syntax with libxml2?
> >
> > 3. Are we able to implement Oracle syntax with libxml2? And there are
> > benefits other than higher possible compatibility?
> I guess it would be beneficial if you're migrating from oracle to
> postgres - or the other way around. It certainly wouldn't hurt, but so
> far I personally had little use for the oracle's extra xmlserialize
> features.
> >
> > 4. Can there be some possible collision (functionality, syntax) with
> > CANONICAL?
> I couldn't find anything in the SQL/XML spec that might refer to
> canonical xml.
> >
> > 5. SQL/XML XMLSERIALIZE supports other target types than varchar. I
> > can imagine XMLSERIALIZE with CANONICAL to bytea (then we don't need
> > to force database encoding). Does it make sense? Are the results
> > comparable?
>
> As of pg16 bytea is not supported. Currently type can be character,
> character varying, or text - also their other flavours like 'name'.
>

I know, but theoretically, there can be some benefit for CANONICAL if pg supports bytea there.
Many databases still use a non-UTF8 encoding.

This is a more theoretical question: if pg supports other target types there in the future (because of SQL/XML or Oracle), can CANONICAL then be used with all of them, or only with text?

And are you sure that you can compare text with text instead of xml with xml?

+SELECT xmlserialize(CONTENT doc AS text CANONICAL) = xmlserialize(CONTENT doc AS text CANONICAL WITH COMMENTS) FROM xmltest_serialize;
+ ?column?
+----------
+ t
+ t
+(2 rows)

Maybe I am a little confused by these regression tests, because in the end this one is not very useful: it compares two identical XML documents, and WITH COMMENTS and WITHOUT COMMENTS are tested elsewhere. I tried to find the purpose of this test. It would be better to use really different documents (columns) instead.

Regards

Pavel

>
> --
> Jim
>