Re: [PATCH] Add pretty-printed XML output option

Tom Lane Tue, 14 Mar 2023 10:40:42 -0700

Jim Jones <[email protected]> writes:
> [ v22-0001-Add-pretty-printed-XML-output-option.patch ]


I poked at this for a while and ran into a problem that I'm not sure
how to solve: it misbehaves for input with embedded DOCTYPE.

regression=# SELECT xmlserialize(DOCUMENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize 
--------------
 <!DOCTYPE a>+
 <a></a>     +
 
(1 row)

regression=# SELECT xmlserialize(CONTENT '<!DOCTYPE a><a/>' as text indent);
 xmlserialize 
--------------
 
(1 row)

The bad result for CONTENT is because xml_parse() decides to
parse_as_document, but xmlserialize_indent has no idea that happened
and tries to use the content_nodes list anyway.  I don't especially
care for the laissez-faire "maybe we'll set *content_nodes and maybe
we won't" API you adopted for xml_parse, which seems to be contributing
to the mess.  We could pass back more info so that xmlserialize_indent
knows what really happened.  However, that won't fix the bogus output
for the DOCUMENT case.  Are we perhaps passing incorrect flags to
xmlSaveToBuffer?
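
For illustration only, here is a minimal sketch of the "pass back more info" idea, assuming the parser returns a small result struct instead of optionally setting an output pointer. Every name below (SketchMode, SketchParseResult, sketch_parse, sketch_indent_source) is hypothetical and does not exist in PostgreSQL, and the DOCTYPE check is a toy prefix test standing in for whatever xml_parse() really does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical names throughout; none of these exist in PostgreSQL. */
typedef enum { SKETCH_DOCUMENT, SKETCH_CONTENT } SketchMode;

typedef struct
{
    SketchMode  requested;          /* what the caller asked for */
    SketchMode  actual;             /* how the input was really parsed */
    bool        content_nodes_set;  /* true only if a node list was filled in */
} SketchParseResult;

/*
 * Toy stand-in for real DOCTYPE detection (an assumption, not the actual
 * logic): skip leading whitespace, then look for a literal "<!DOCTYPE".
 */
bool
sketch_has_doctype(const char *input)
{
    while (isspace((unsigned char) *input))
        input++;
    return strncmp(input, "<!DOCTYPE", 9) == 0;
}

/*
 * Report the parse mode actually used, so the caller never has to guess
 * whether the content node list is valid.
 */
SketchParseResult
sketch_parse(const char *input, SketchMode requested)
{
    SketchParseResult res;

    assert(input != NULL);
    res.requested = requested;
    if (requested == SKETCH_DOCUMENT || sketch_has_doctype(input))
        res.actual = SKETCH_DOCUMENT;
    else
        res.actual = SKETCH_CONTENT;
    res.content_nodes_set = (res.actual == SKETCH_CONTENT);
    return res;
}

/* The serializer picks its source from the reported result. */
const char *
sketch_indent_source(const SketchParseResult *res)
{
    return res->content_nodes_set ? "content_nodes" : "document";
}
```

With this shape, a CONTENT request for '<!DOCTYPE a><a/>' reports that it was parsed as a document, so the indenting code would walk the document tree rather than an unset node list. It addresses only the CONTENT case; the DOCUMENT output is a separate question.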

                        regards, tom lane
