On Thu, Mar 25, 2021 at 5:13 PM Zimmel, Daniel <d.zim...@esvmedien.de> wrote:
> My XML file is deeply nested and has 440.000 lines when indented.

I assume you mean that your XML file has 440,000 lines.

> Anyhow, when I change the XSD version to 1.1 and insert a sample assertion
> (xsd:assert test="false()") in the content model for my root element, my
> CPU and memory are filling up quite fast, even giving me a Heap Space Error.

That's expected behaviour with Xerces. The Xerces XSD 1.1 implementation
constructs an in-memory XML DOM/XDM tree for each <xsd:assert>. That tree is
rooted at the XML instance element validated by the xsd:complexType that
carries the <xsd:assert>. In other words, the <xsd:assert> implementation is
memory hungry for large XML instance documents when the assertion applies to
elements at or near the root of the instance tree, and particularly when that
tree is deeply nested.

Some measures I can suggest for the issues you describe:

1) If possible, use identity constraints (IDC) or conditional type
   assignment (CTA) instead of <xsd:assert>, or any other XSD construct
   that does not rely on <xsd:assert>.

2) Do part of the XML instance validation in your client code that invokes
   the Xerces XSD 1.1 validation.

3) Try the JVM options -Xms and -Xmx to tune the heap size. If possible (for
   example, if this is a production, profit-making project), add more RAM to
   the workstation where the XSD 1.1 validation runs.

> Should I file a JIRA bug issue?

It's up to you. From my point of view, this issue is unlikely to result in
code improvements to the Xerces XSD 1.1 implementation.

-- 
Regards,
Mukul Gandhi
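
P.S. To illustrate measure 2 (doing part of the validation in client code), here is a minimal Java sketch using the JDK's StAX API. The rule it checks (every <item> element must have a non-empty id attribute) and the names StreamingCheck and countViolations are hypothetical, chosen only for illustration; your actual assertion logic would go in their place. Because the document is read as a stream, no tree is built, so memory use should not grow with document size in the way it does for an <xsd:assert> near the root.

```java
import java.io.Reader;
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical client-side check. The "item"/"id" rule is an assumption
// made for illustration; it is not taken from the original schema.
class StreamingCheck {

    // Counts <item> elements that lack a non-empty "id" attribute.
    // The input is consumed event by event, so no DOM tree is held in memory.
    static int countViolations(Reader xml) {
        int violations = 0;
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(xml);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && "item".equals(reader.getLocalName())) {
                        // A null namespace URI means the namespace is not checked.
                        String id = reader.getAttributeValue(null, "id");
                        if (id == null || id.isEmpty()) {
                            violations++;
                        }
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            // Wrapped so callers do not need to handle a checked exception.
            throw new IllegalStateException("XML could not be parsed", e);
        }
        return violations;
    }

    public static void main(String[] args) {
        String sample = "<root><item id=\"a\"/><group><item/></group></root>";
        System.out.println(countViolations(new StringReader(sample)));
    }
}
```

For a real file you would pass a java.io.FileReader (or use the InputStream overload of createXMLStreamReader) instead of the StringReader, and run the remaining, assertion-free schema validation through Xerces as before.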