On 2024-07-07 22:43 +0200, Tom Lane wrote: > As far as the errcontext changes go: I think we have to just bite > the bullet and accept them. It looks like 2.13 has a completely > different mechanism than prior versions for deciding when to issue > XML_ERR_NOT_WELL_BALANCED. And it's not even clear that it's wrong; > for example, in our first failing case > > DETAIL: line 1: xmlParseEntityRef: no name > <invalidentity>&</invalidentity> > ^ > -line 1: chunk is not well balanced > -<invalidentity>&</invalidentity> > - ^ > > it's kind of hard to argue that the chunk isn't well-balanced. > > So we can either suppress errdetails from the expected output, > or set up an additional expected-file. I'm leaning to the > "\set VERBOSITY terse" solution.
+1 for \set VERBOSITY terse as a last resort. But it looks to me as if "chunk is not well balanced" is just noise because libxml2 reports more specific errors before that. For example: SELECT xmlparse(content '<twoerrors>&idontexist;</unbalanced>'); ERROR: invalid XML content DETAIL: line 1: Entity 'idontexist' not defined <twoerrors>&idontexist;</unbalanced> ^ line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced <twoerrors>&idontexist;</unbalanced> ^ line 1: chunk is not well balanced <twoerrors>&idontexist;</unbalanced> ^ Here, "Opening and ending tag mismatch" already covers the unbalanced closing tag. So how about just ignoring XML_ERR_NOT_WELL_BALANCED like in the attached? This also adds test cases for an unclosed tag because I wanted to see if I can trigger just "chunk is not well balanced", but without success. SELECT xmlparse(content '<unclosed>'); ERROR: invalid XML content DETAIL: line 1: Premature end of data in tag unclosed line 1 <unclosed> ^ line 1: chunk is not well balanced <unclosed> ^ libxml2 2.13 doesn't report "chunk ..." here either. There's also this more explicit test case for unbalanced tags: <parent><child></parent></child> But I'm not sure if that's really necessary if we already have: <twoerrors>&idontexist;</unbalanced> The error messages are the same, except for the additional entity error. -- Erik
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c index 8893be5682..4f45c90f54 100644 --- a/src/backend/utils/adt/xml.c +++ b/src/backend/utils/adt/xml.c @@ -2114,6 +2114,17 @@ xml_errorHandler(void *data, PgXmlErrorPtr error) switch (domain) { case XML_FROM_PARSER: + /* + * Suppress errors about not well-balanced elements because libxml2 + * already reports more specific errors in those cases. So, this + * error is redundant noise. Also, libxml2 2.13 no longer reports + * this error in every case. + */ + if (error->code == XML_ERR_NOT_WELL_BALANCED) + return; + + /* fall through */ + case XML_FROM_NONE: case XML_FROM_MEMORY: case XML_FROM_IO: diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out index 6500cff885..d6a51f9e38 100644 --- a/src/test/regress/expected/xml.out +++ b/src/test/regress/expected/xml.out @@ -254,17 +254,11 @@ ERROR: invalid XML content DETAIL: line 1: xmlParseEntityRef: no name <invalidentity>&</invalidentity> ^ -line 1: chunk is not well balanced -<invalidentity>&</invalidentity> - ^ SELECT xmlparse(content '<undefinedentity>&idontexist;</undefinedentity>'); ERROR: invalid XML content DETAIL: line 1: Entity 'idontexist' not defined <undefinedentity>&idontexist;</undefinedentity> ^ -line 1: chunk is not well balanced -<undefinedentity>&idontexist;</undefinedentity> - ^ SELECT xmlparse(content '<invalidns xmlns=''<''/>'); xmlparse --------------------------- @@ -283,9 +277,6 @@ DETAIL: line 1: Entity 'idontexist' not defined <twoerrors>&idontexist;</unbalanced> ^ line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced -<twoerrors>&idontexist;</unbalanced> - ^ -line 1: chunk is not well balanced <twoerrors>&idontexist;</unbalanced> ^ SELECT xmlparse(content '<nosuchprefix:tag/>'); @@ -294,6 +285,19 @@ SELECT xmlparse(content '<nosuchprefix:tag/>'); <nosuchprefix:tag/> (1 row) +SELECT xmlparse(content '<unclosed>'); +ERROR: invalid XML content +DETAIL: line 1: Premature end of data in tag unclosed line 1 +<unclosed> + ^ +SELECT xmlparse(content '<parent><child></parent></child>'); +ERROR: invalid XML content +DETAIL: line 1: Opening and ending tag mismatch: child line 1 and parent +<parent><child></parent></child> + ^ +line 1: Opening and ending tag mismatch: parent line 1 and child +<parent><child></parent></child> + ^ SELECT xmlparse(document ' '); ERROR: invalid XML document DETAIL: line 1: Start tag expected, '<' not found @@ -352,6 +356,19 @@ SELECT xmlparse(document '<nosuchprefix:tag/>'); <nosuchprefix:tag/> (1 row) +SELECT xmlparse(document '<unclosed>'); +ERROR: invalid XML document +DETAIL: line 1: Premature end of data in tag unclosed line 1 +<unclosed> + ^ +SELECT xmlparse(document '<parent><child></parent></child>'); +ERROR: invalid XML document +DETAIL: line 1: Opening and ending tag mismatch: child line 1 and parent +<parent><child></parent></child> + ^ +line 1: Opening and ending tag mismatch: parent line 1 and child +<parent><child></parent></child> + ^ SELECT xmlpi(name foo); xmlpi --------- diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql index 953bac09e4..15ccbe1d35 100644 --- a/src/test/regress/sql/xml.sql +++ b/src/test/regress/sql/xml.sql @@ -77,6 +77,8 @@ SELECT xmlparse(content '<invalidns xmlns=''<''/>'); SELECT xmlparse(content '<relativens xmlns=''relative''/>'); SELECT xmlparse(content '<twoerrors>&idontexist;</unbalanced>'); SELECT xmlparse(content '<nosuchprefix:tag/>'); +SELECT xmlparse(content '<unclosed>'); +SELECT xmlparse(content '<parent><child></parent></child>'); SELECT xmlparse(document ' '); SELECT xmlparse(document 'abc'); @@ -87,6 +89,8 @@ SELECT xmlparse(document '<invalidns xmlns=''<''/>'); SELECT xmlparse(document '<relativens xmlns=''relative''/>'); SELECT xmlparse(document '<twoerrors>&idontexist;</unbalanced>'); SELECT xmlparse(document '<nosuchprefix:tag/>'); +SELECT xmlparse(document '<unclosed>'); +SELECT xmlparse(document '<parent><child></parent></child>'); SELECT xmlpi(name foo);