On 2024-07-07 22:43 +0200, Tom Lane wrote:
> As far as the errcontext changes go: I think we have to just bite
> the bullet and accept them.  It looks like 2.13 has a completely
> different mechanism than prior versions for deciding when to issue
> XML_ERR_NOT_WELL_BALANCED.  And it's not even clear that it's wrong;
> for example, in our first failing case
> 
>  DETAIL:  line 1: xmlParseEntityRef: no name
>  <invalidentity>&</invalidentity>
>                  ^
> -line 1: chunk is not well balanced
> -<invalidentity>&</invalidentity>
> -                                ^
> 
> it's kind of hard to argue that the chunk isn't well-balanced.
> 
> So we can either suppress errdetails from the expected output,
> or set up an additional expected-file.  I'm leaning to the
> "\set VERBOSITY terse" solution.

+1 for  \set VERBOSITY terse  as a last resort.

But it looks to me as if "chunk is not well balanced" is just noise
because libxml2 reports more specific errors before that.  For example:

    SELECT xmlparse(content '<twoerrors>&idontexist;</unbalanced>');
    ERROR:  invalid XML content
    DETAIL:  line 1: Entity 'idontexist' not defined
    <twoerrors>&idontexist;</unbalanced>
                           ^
    line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced
    <twoerrors>&idontexist;</unbalanced>
                                        ^
    line 1: chunk is not well balanced
    <twoerrors>&idontexist;</unbalanced>
                                        ^

Here, "Opening and ending tag mismatch" already covers the unbalanced
closing tag.

So how about just ignoring XML_ERR_NOT_WELL_BALANCED like in the
attached?  This also adds test cases for an unclosed tag because I
wanted to see if I can trigger just "chunk is not well balanced", but
without success.

    SELECT xmlparse(content '<unclosed>');
    ERROR:  invalid XML content
    DETAIL:  line 1: Premature end of data in tag unclosed line 1
    <unclosed>
              ^
    line 1: chunk is not well balanced
    <unclosed>
              ^

libxml2 2.13 doesn't report "chunk ..." here either.

There's also this more explicit test case for unbalanced tags:

    <parent><child></parent></child>

But I'm not sure if that's really necessary if we already have:

    <twoerrors>&idontexist;</unbalanced>

The error messages are the same, except for the additional entity error.

-- 
Erik
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 8893be5682..4f45c90f54 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -2114,6 +2114,17 @@ xml_errorHandler(void *data, PgXmlErrorPtr error)
 	switch (domain)
 	{
 		case XML_FROM_PARSER:
+			/*
+			 * Suppress errors about not well-balanced elements because libxml2
+			 * already reports more specific errors in those cases.  So, this
+			 * error is redundant noise.  Also, libxml2 2.13 no longer reports
+			 * this error in every case.
+			 */
+			if (error->code == XML_ERR_NOT_WELL_BALANCED)
+				return;
+
+			/* fall through */
+
 		case XML_FROM_NONE:
 		case XML_FROM_MEMORY:
 		case XML_FROM_IO:
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 6500cff885..d6a51f9e38 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -254,17 +254,11 @@ ERROR:  invalid XML content
 DETAIL:  line 1: xmlParseEntityRef: no name
 <invalidentity>&</invalidentity>
                 ^
-line 1: chunk is not well balanced
-<invalidentity>&</invalidentity>
-                                ^
 SELECT xmlparse(content '<undefinedentity>&idontexist;</undefinedentity>');
 ERROR:  invalid XML content
 DETAIL:  line 1: Entity 'idontexist' not defined
 <undefinedentity>&idontexist;</undefinedentity>
                              ^
-line 1: chunk is not well balanced
-<undefinedentity>&idontexist;</undefinedentity>
-                                               ^
 SELECT xmlparse(content '<invalidns xmlns=''&lt;''/>');
          xmlparse          
 ---------------------------
@@ -283,9 +277,6 @@ DETAIL:  line 1: Entity 'idontexist' not defined
 <twoerrors>&idontexist;</unbalanced>
                        ^
 line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced
-<twoerrors>&idontexist;</unbalanced>
-                                    ^
-line 1: chunk is not well balanced
 <twoerrors>&idontexist;</unbalanced>
                                     ^
 SELECT xmlparse(content '<nosuchprefix:tag/>');
@@ -294,6 +285,19 @@ SELECT xmlparse(content '<nosuchprefix:tag/>');
  <nosuchprefix:tag/>
 (1 row)
 
+SELECT xmlparse(content '<unclosed>');
+ERROR:  invalid XML content
+DETAIL:  line 1: Premature end of data in tag unclosed line 1
+<unclosed>
+          ^
+SELECT xmlparse(content '<parent><child></parent></child>');
+ERROR:  invalid XML content
+DETAIL:  line 1: Opening and ending tag mismatch: child line 1 and parent
+<parent><child></parent></child>
+                        ^
+line 1: Opening and ending tag mismatch: parent line 1 and child
+<parent><child></parent></child>
+                                ^
 SELECT xmlparse(document '   ');
 ERROR:  invalid XML document
 DETAIL:  line 1: Start tag expected, '<' not found
@@ -352,6 +356,19 @@ SELECT xmlparse(document '<nosuchprefix:tag/>');
  <nosuchprefix:tag/>
 (1 row)
 
+SELECT xmlparse(document '<unclosed>');
+ERROR:  invalid XML document
+DETAIL:  line 1: Premature end of data in tag unclosed line 1
+<unclosed>
+          ^
+SELECT xmlparse(document '<parent><child></parent></child>');
+ERROR:  invalid XML document
+DETAIL:  line 1: Opening and ending tag mismatch: child line 1 and parent
+<parent><child></parent></child>
+                        ^
+line 1: Opening and ending tag mismatch: parent line 1 and child
+<parent><child></parent></child>
+                                ^
 SELECT xmlpi(name foo);
   xmlpi  
 ---------
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 953bac09e4..15ccbe1d35 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -77,6 +77,8 @@ SELECT xmlparse(content '<invalidns xmlns=''&lt;''/>');
 SELECT xmlparse(content '<relativens xmlns=''relative''/>');
 SELECT xmlparse(content '<twoerrors>&idontexist;</unbalanced>');
 SELECT xmlparse(content '<nosuchprefix:tag/>');
+SELECT xmlparse(content '<unclosed>');
+SELECT xmlparse(content '<parent><child></parent></child>');
 
 SELECT xmlparse(document '   ');
 SELECT xmlparse(document 'abc');
@@ -87,6 +89,8 @@ SELECT xmlparse(document '<invalidns xmlns=''&lt;''/>');
 SELECT xmlparse(document '<relativens xmlns=''relative''/>');
 SELECT xmlparse(document '<twoerrors>&idontexist;</unbalanced>');
 SELECT xmlparse(document '<nosuchprefix:tag/>');
+SELECT xmlparse(document '<unclosed>');
+SELECT xmlparse(document '<parent><child></parent></child>');
 
 
 SELECT xmlpi(name foo);

Reply via email to