On 24.07.25 21:23, Tom Lane wrote:
> Oh, wait ... the plot thickens! The above statement is true
> when testing on my Mac with libxml2 2.13.8 from MacPorts.
> With either HEAD or f68d6aabb7e2^, I get errors similar to
> what Erik just showed:
>
> ERROR: invalid XML content
> DETAIL: line 1: Resource limit exceeded: Text node too long, try
> XML_PARSE_HUGE
> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
I get the same error with libxml2 2.9.14 on Ubuntu.
> However, when testing on RHEL8 with libxml2 2.9.7, indeed
> I get "Huge input lookup" with our current code but no
> failure with f68d6aabb7e2^.
>
> The way I interpret these results is that in older libxml2 versions,
> xmlParseBalancedChunkMemory is missing an XML_ERR_RESOURCE_LIMIT check
> that does exist in newer versions. So even if we were to do some kind
> of reversion, it would only prevent the error in libxml2 versions that
> lack that check. And in those versions we'd probably be exposing
> ourselves to resource-exhaustion problems.
>
> On the whole I'm thinking more and more that we don't want to
> touch this. Our recommendation for processing multi-megabyte
> chunks of XML should be "don't". Unless we want to find or
> write a replacement for libxml2 ... which we have discussed,
> but so far nothing's happened.
I also believe that addressing this limitation may not be worth the
associated risks. Moreover, a 10MB text node is rather large and
probably exceeds the needs of most users.
Best, Jim