Thanks, Mukul, for the implementation insights. Duplicating the tree does indeed explain a lot. Saxon EE evidently handles this differently in its implementation, which is quite fast in a direct comparison.
In general, when I talk to other XML users and developers, I get the impression that XSD 1.1 (and assertions in particular) is not widely adopted, so I can understand that there is little incentive to improve this. I will see if I can find a way around this limitation.

Thanks,
Daniel

From: Mukul Gandhi <muk...@apache.org>
Sent: Saturday, 27 March 2021 06:32
To: j-users@xerces.apache.org
Subject: Re: Java Heap Space problems with XSD 1.1 validation, asserts and large files

On Thu, Mar 25, 2021 at 5:13 PM Zimmel, Daniel <d.zim...@esvmedien.de> wrote:

> My XML file is deeply nested and has 440.000 lines when indented.

I hope you mean that your XML file has 440000 lines.

> Anyhow, when I change the XSD version to 1.1 and insert a sample assertion (xsd:assert test="false()") in the content model for my root element, my CPU and memory are filling up quite fast, even giving me a Heap Space Error.

That's expected behaviour with Xerces. The Xerces XSD 1.1 implementation constructs an in-memory XML DOM/XDM tree for each <xsd:assert>, rooted at the XML instance element that is validated by the xsd:complexType containing the <xsd:assert>. In other words, the <xsd:assert> implementation is memory-hungry for large XML instance documents when the assertion applies to elements at or near the root of the instance tree, and particularly when the tree under that element is deeply nested.

Some measures I can suggest for the issues you describe:

1) If possible, use identity constraints (IDC) or conditional type assignment (CTA) instead of <xsd:assert>, or use any other non-<xsd:assert> XSD construct for the validation.
2) Do part of the XML instance validation within the client code that invokes Xerces XSD 1.1 validation.
3) Use the JVM options -Xms and -Xmx to tune the heap size as far as possible. If you can (for example, if it's a production, profit-making project), add more RAM to the machine where XSD 1.1 validation takes place.
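
As a rough illustration of measure 2, a rule that would otherwise be written as an <xsd:assert> can sometimes be checked in a streaming pass in client code, so that no tree has to be built for it. The sketch below uses the JDK's StAX API. The rule itself (every "item" element needs a non-empty "id" attribute), the element and attribute names, and the class name StreamingPreCheck are hypothetical and do not come from this thread.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StreamingPreCheck {

    // Hypothetical rule standing in for an <xsd:assert>:
    // every <item> element must carry a non-empty "id" attribute.
    // Returns the number of <item> elements that violate the rule.
    public static int countItemsWithoutId(Reader xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // DTD processing is not needed for this check, so switch it off.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, Boolean.FALSE);
        XMLStreamReader reader = factory.createXMLStreamReader(xml);
        int violations = 0;
        try {
            // Pull events one at a time; only the current event is held in memory.
            while (reader.hasNext()) {
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    String id = reader.getAttributeValue(null, "id");
                    if (id == null || id.isEmpty()) {
                        violations++;
                    }
                }
            }
        } finally {
            reader.close();
        }
        return violations;
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<root><item id=\"a\"/><group><item/></group></root>";
        System.out.println(countItemsWithoutId(new StringReader(xml)));
    }
}
```

With a check like this in place, the corresponding assertion could be dropped from the schema and the remaining structural validation left to Xerces. Whether that is feasible depends on how complex the actual assertions are.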

> Should I file a JIRA bug issue?

It's up to you. From my point of view, this issue is unlikely to result in improvements to the Xerces XSD 1.1 implementation code.

--
Regards,
Mukul Gandhi