On Mon, Jul 22, 2019 at 11:59:42AM +0200, Jan Pokorný via xml wrote: > On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote: > > The following RELAX NG schema requires exponential running time when > > matching attributes (15 attributes is where my computer begins to > > show the symptoms, this may vary with your set-up): > > > > a.rng: > > > > <?xml version="1.0" encoding="UTF-8"?> > > <grammar xmlns="http://relaxng.org/ns/structure/1.0"> > > <start> > > <element> > > <name>a</name> > > <group> > > <optional><attribute><name > > ns="">a</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">b</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">c</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">d</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">e</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">f</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">g</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">h</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">i</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">j</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">k</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">l</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">m</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">n</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">o</name><text/></attribute></optional> > > </group> > > </element> > > </start> > > </grammar> > > > > a.xml: > > > > <?xml version="1.0" encoding="UTF-8"?> > > <a a="1" b="2" c="3" d="4" e="5" f="6" g="7" h="8" i="9" j="10" k="11" > > l="12" m="13" n="14" o="15"/> > > > > % time xmllint --noout --relaxng a.rng a.xml > > real: 3.175, user: 3.147, system: 0.014 (99%) > > > > Changing the group to an interleave, that is, > > > > b.rng: > > > > <?xml version="1.0" encoding="UTF-8"?> > > <grammar xmlns="http://relaxng.org/ns/structure/1.0"> > > <start> > > <element> > > <name>a</name> > > <interleave> > > <optional><attribute><name > > ns="">a</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">b</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">c</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">d</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">e</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">f</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">g</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">h</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">i</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">j</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">k</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">l</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">m</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">n</name><text/></attribute></optional> > > <optional><attribute><name > > ns="">o</name><text/></attribute></optional> > > </interleave> > > </element> > > </start> > > </grammar> > > > > gives > > > > % time xmllint --noout --relaxng b.rng a.xml > > real: 0.008, user: 0.003, system: 0.002 (54%) > > > > Making the attributes required instead of optional also fixes the issue: > > > > c.rng: > > > > <?xml version="1.0" encoding="UTF-8"?> > > <grammar xmlns="http://relaxng.org/ns/structure/1.0"> > > <start> > > <element> > > <name>a</name> > > <group> > > <attribute><name ns="">a</name><text/></attribute> > > <attribute><name ns="">b</name><text/></attribute> > > <attribute><name ns="">c</name><text/></attribute> > > <attribute><name ns="">d</name><text/></attribute> > > <attribute><name ns="">e</name><text/></attribute> > > <attribute><name ns="">f</name><text/></attribute> > > <attribute><name ns="">g</name><text/></attribute> > > <attribute><name ns="">h</name><text/></attribute> > > <attribute><name ns="">i</name><text/></attribute> > > <attribute><name ns="">j</name><text/></attribute> > > <attribute><name ns="">k</name><text/></attribute> > > <attribute><name ns="">l</name><text/></attribute> > > <attribute><name ns="">m</name><text/></attribute> > > <attribute><name ns="">n</name><text/></attribute> > > <attribute><name ns="">o</name><text/></attribute> > > </group> > > </element> > > </start> > > </grammar> > > > > % time xmllint --noout --relaxng c.rng a.xml > > real: 0.008, user: 0.003, system: 0.002 (55%) > > > > The problem seems to be that xmlRelaxNGValidateDefinition keeps > > adding states that I feel don’t need checking, but I can’t figure > > out how to avoid this from happening.
Hum, I'm way behind .... sorry yes that sounds like there is a graph generation issue, best is to compile the regexp support with things like DEBUG_REGEXP_GRAPH enabled and trace down with pencil and paper what kind of graph it generates. That's how most of that code ended up being debugged <grin/> > > I can only nod to your analysis, ran into similar observations > in the past, and they were likewise XML attributes (that parameters to > fence device instances boil down to) related: > https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7 > > Callgrind proved to be a useful tool to get better quantitative numbers. yup but that's way down > > Daniel, do you have any input on this? I’d gladly work on fixing > > this, but I feel that I need some guidance as to what’s going wrong > > and perhaps also how to fix it. > > Would be nice to get libxml's RelaxNG validation support closer > to parity with jing (the only other freely available validator I am > aware of that can be directly used for a peer comparison), for sure. well jing use an orthogonal approach, derivating the document graph directly while libxml2 uses validation structure derivation instead (when it can't compile down to a regexp). In that case it should be regexp based. The two will performs massively differently on different problems. Daniel > _______________________________________________ > xml mailing list, project page http://xmlsoft.org/ > xml@gnome.org > https://mail.gnome.org/mailman/listinfo/xml -- Daniel Veillard | Red Hat Developers Tools http://developer.redhat.com/ veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | virtualization library http://libvirt.org/ _______________________________________________ xml mailing list, project page http://xmlsoft.org/ xml@gnome.org https://mail.gnome.org/mailman/listinfo/xml