Re: Regression with large XML data input

2025-07-28 Thread Erik Wienhold
On 2025-07-28 09:45 +0200, Jim Jones wrote: > > On 28.07.25 04:47, Michael Paquier wrote: > > I understand that from the point of view of a maintainer this is > > rather bad, but from the customer point of view the current > > situation is also bad to deal with in the scope of a minor upgrade, > >

Re: Regression with large XML data input

2025-07-28 Thread Jim Jones
On 28.07.25 04:47, Michael Paquier wrote: > I understand that from the point of view of a > maintainer this is rather bad, but from the customer point of view the > current situation is also bad to deal with in the scope of a minor > upgrade, because applications suddenly break.
I totally get i

Re: Regression with large XML data input

2025-07-27 Thread David G. Johnston
On Sunday, July 27, 2025, Michael Paquier wrote: > > I am not going to argue against the commits that have reached > REL_18_STABLE to add compatibility for libxml2 2.13.X, we can leave > this stuff as-is and enforce stronger restrictions across all versions > of libxml2, letting users deal with ap

Re: Regression with large XML data input

2025-07-27 Thread Michael Paquier
On Sun, Jul 27, 2025 at 10:16:47PM -0400, Tom Lane wrote: > Michael Paquier writes: >> This sentence is incorrect after I have double-checked the behaviors I >> am seeing based on local builds of libxml2 2.9.7 and 2.9.14. > > Hmm, okay, I misread Jim's results then. But there still remains > the

Re: Regression with large XML data input

2025-07-27 Thread Tom Lane
Michael Paquier writes: > On Fri, Jul 25, 2025 at 02:21:26PM -0400, Tom Lane wrote: >> However, from Jim Jones' result upthread, a "minor update" of libxml2 >> could also have caused this problem: 2.9.7 and 2.9.14 behave >> differently. So we don't have sole control --- or sole responsibility >>

Re: Regression with large XML data input

2025-07-27 Thread Michael Paquier
On Fri, Jul 25, 2025 at 02:21:26PM -0400, Tom Lane wrote: > I'll be the first to say that I'm not too pleased with it either. > However, from Jim Jones' result upthread, a "minor update" of libxml2 > could also have caused this problem: 2.9.7 and 2.9.14 behave > differently. So we don't have sole

Re: Regression with large XML data input

2025-07-25 Thread Tom Lane
Robert Treat writes: > On Thu, Jul 24, 2025 at 8:08 PM Michael Paquier wrote: >> If it were discussing things from the perspective where this new code >> was added after a major version bump of Postgres, I would not argue >> much about that: breakages happen every year and users adapt their >> ap

Re: Regression with large XML data input

2025-07-25 Thread Robert Treat
On Thu, Jul 24, 2025 at 8:08 PM Michael Paquier wrote: > On Fri, Jul 25, 2025 at 01:25:48AM +0200, Jim Jones wrote: > > On 24.07.25 21:23, Tom Lane wrote: > >> However, when testing on RHEL8 with libxml2 2.9.7, indeed > >> I get "Huge input lookup" with our current code but no > >> failure with f6

Re: Regression with large XML data input

2025-07-24 Thread Michael Paquier
On Fri, Jul 25, 2025 at 01:25:48AM +0200, Jim Jones wrote: > On 24.07.25 21:23, Tom Lane wrote: >> However, when testing on RHEL8 with libxml2 2.9.7, indeed >> I get "Huge input lookup" with our current code but no >> failure with f68d6aabb7e2^. >> >> The way I interpret these results is that in ol

Re: Regression with large XML data input

2025-07-24 Thread Jim Jones
On 24.07.25 21:23, Tom Lane wrote: > Oh, wait ... the plot thickens! The above statement is true > when testing on my Mac with libxml2 2.13.8 from MacPorts. > With either HEAD or f68d6aabb7e2^, I get errors similar to > what Erik just showed: > > ERROR: invalid XML content > DETAIL: line 1: R

Re: Regression with large XML data input

2025-07-24 Thread Tom Lane
I wrote: > BTW, further testing shows that the same failure occurs at > f68d6aabb7e2^. So AFAICS, the answer as to why the behavior > changed there is that it didn't.
Oh, wait ... the plot thickens! The above statement is true when testing on my Mac with libxml2 2.13.8 from MacPorts. With either

Re: Regression with large XML data input

2025-07-24 Thread Erik Wienhold
On 2025-07-24 05:12 +0200, Michael Paquier wrote: > Switching back to the previous code, where we rely on > xmlParseBalancedChunkMemory() fixes the issue. A quick POC is > attached. It fails one case in check-world with SERIALIZE because I > am not sure it is possible to pass down some options th

Re: Regression with large XML data input

2025-07-24 Thread Tom Lane
Michael Paquier writes: >>> A customer has reported a regression with the parsing of rather large >>> XML data, introduced by the set of backpatches done with f68d6aabb7e2 >>> & friends.
BTW, further testing shows that the same failure occurs at f68d6aabb7e2^. So AFAICS, the answer as to why the

Re: Regression with large XML data input

2025-07-24 Thread Erik Wienhold
On 2025-07-24 20:10 +0200, Tom Lane wrote: > The supplied test case hides important details in the error message. > If you get rid of the exception block so that the error is reported > in full, what you see is > > regression=# CREATE TEMP TABLE xmldata (id BIGINT PRIMARY KEY, message XML ); > CRE

Re: Regression with large XML data input

2025-07-24 Thread Tom Lane
I wrote: > Michael Paquier writes: >> A customer has reported a regression with the parsing of rather large >> XML data, introduced by the set of backpatches done with f68d6aabb7e2 >> & friends. > Bleah.
The supplied test case hides important details in the error message. If you get rid of the e

Re: Regression with large XML data input

2025-07-23 Thread Tom Lane
Michael Paquier writes: > Are you planning to look at that for the next minor release? It would > take me a couple of hours to dig into all that, and I am sure that I > am going to need your stamp or Erik's to avoid doing a stupid thing.
It was my commit, so my responsibility, so I'll deal with

Re: Regression with large XML data input

2025-07-23 Thread Michael Paquier
On Wed, Jul 23, 2025 at 11:28:38PM -0400, Tom Lane wrote: > Michael Paquier writes: >> Switching back to the previous code, where we rely on >> xmlParseBalancedChunkMemory() fixes the issue. > > Yeah, just reverting these commits might be an acceptable answer, > since the main point was to work a

Re: Regression with large XML data input

2025-07-23 Thread Tom Lane
Michael Paquier writes: > A customer has reported a regression with the parsing of rather large > XML data, introduced by the set of backpatches done with f68d6aabb7e2 > & friends.
Bleah.
> Switching back to the previous code, where we rely on > xmlParseBalancedChunkMemory() fixes the issue.
Ye