On Mon, Jul 22, 2019 at 11:59:42AM +0200, Jan Pokorný via xml wrote:
> On 27/06/19 14:15 +0200, Nikolai Weibull via xml wrote:
> > The following RELAX NG schema requires exponential running time when
> > matching attributes (15 attributes is where my computer begins to
> > show the symptoms, this may vary with your set-up):
> > 
> > a.rng:
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <grammar xmlns="http://relaxng.org/ns/structure/1.0";>
> >   <start>
> >     <element>
> >       <name>a</name>
> >       <group>
> >         <optional><attribute><name 
> > ns="">a</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">b</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">c</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">d</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">e</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">f</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">g</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">h</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">i</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">j</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">k</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">l</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">m</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">n</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">o</name><text/></attribute></optional>
> >       </group>
> >     </element>
> >   </start>
> > </grammar>
> > 
> > a.xml:
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <a a="1" b="2" c="3" d="4" e="5" f="6" g="7" h="8" i="9" j="10" k="11" 
> > l="12" m="13" n="14" o="15"/>
> > 
> > % time xmllint --noout --relaxng a.rng a.xml
> > real: 3.175, user: 3.147, system: 0.014 (99%)
> > 
> > Changing the group to an interleave, that is,
> > 
> > b.rng:
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <grammar xmlns="http://relaxng.org/ns/structure/1.0";>
> >   <start>
> >     <element>
> >       <name>a</name>
> >       <interleave>
> >         <optional><attribute><name 
> > ns="">a</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">b</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">c</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">d</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">e</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">f</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">g</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">h</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">i</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">j</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">k</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">l</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">m</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">n</name><text/></attribute></optional>
> >         <optional><attribute><name 
> > ns="">o</name><text/></attribute></optional>
> >       </interleave>
> >     </element>
> >   </start>
> > </grammar>
> > 
> > gives
> > 
> > % time xmllint --noout --relaxng b.rng a.xml
> > real: 0.008, user: 0.003, system: 0.002 (54%)
> > 
> > Making the attributes required instead of optional also fixes the issue:
> > 
> > c.rng:
> > 
> > <?xml version="1.0" encoding="UTF-8"?>
> > <grammar xmlns="http://relaxng.org/ns/structure/1.0";>
> >   <start>
> >     <element>
> >       <name>a</name>
> >       <group>
> >         <attribute><name ns="">a</name><text/></attribute>
> >         <attribute><name ns="">b</name><text/></attribute>
> >         <attribute><name ns="">c</name><text/></attribute>
> >         <attribute><name ns="">d</name><text/></attribute>
> >         <attribute><name ns="">e</name><text/></attribute>
> >         <attribute><name ns="">f</name><text/></attribute>
> >         <attribute><name ns="">g</name><text/></attribute>
> >         <attribute><name ns="">h</name><text/></attribute>
> >         <attribute><name ns="">i</name><text/></attribute>
> >         <attribute><name ns="">j</name><text/></attribute>
> >         <attribute><name ns="">k</name><text/></attribute>
> >         <attribute><name ns="">l</name><text/></attribute>
> >         <attribute><name ns="">m</name><text/></attribute>
> >         <attribute><name ns="">n</name><text/></attribute>
> >         <attribute><name ns="">o</name><text/></attribute>
> >       </group>
> >     </element>
> >   </start>
> > </grammar>
> > 
> > % time xmllint --noout --relaxng c.rng a.xml
> > real: 0.008, user: 0.003, system: 0.002 (55%)
> > 
> > The problem seems to be that xmlRelaxNGValidateDefinition keeps
> > adding states that I feel don’t need checking, but I can’t figure
> > out how to avoid this from happening.

  Hum, I'm way behind .... sorry

yes that sounds like there is a graph generation issue, best is to compile
the regexp support with things like DEBUG_REGEXP_GRAPH enabled and trace down
with pencil and paper what kind of graph it generates. That's how most of that 
code
ended up being debugged <grin/>

> 
> I can only nod to your analysis, ran into similar observations
> in the past, and they were likewise XML attributes (that parameters to
> fence device instances boil down to) related:

> https://bugzilla.redhat.com/show_bug.cgi?id=1224378#c7
> 
> Callgrind proved to be a useful tool to get better quantitative numbers.

  yup but that's way down

> > Daniel, do you have any input on this?  I’d gladly work on fixing
> > this, but I feel that I need some guidance as to what’s going wrong
> > and perhaps also how to fix it.
> 
> Would be nice to get libxml's RelaxNG validation support closer
> to parity with jing (the only other freely available validator I am
> aware of that can be directly used for a peer comparison), for sure.

  well jing use an orthogonal approach, derivating the document graph directly
while libxml2 uses validation structure derivation instead (when it can't 
compile
down to a regexp). In that case it should be regexp based.
  The two will performs massively differently on different problems.

Daniel

> _______________________________________________
> xml mailing list, project page  http://xmlsoft.org/
> xml@gnome.org
> https://mail.gnome.org/mailman/listinfo/xml


-- 
Daniel Veillard      | Red Hat Developers Tools http://developer.redhat.com/
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml

Reply via email to