I hadn't had the time to look at it, that simple! It could be an interleave of interleave that is the problem, and that is so hard to fix in libxml2 that you may need to change the rng to avoid it. Grab me if you are in a hurry,
Daniel

On Fri, Nov 29, 2013 at 08:44:15AM +0100, Ondrej Lichtner wrote:
> Hi again,
>
> any response on this? I was pointed to a couple of bugs on Red Hat
> bugzilla through IRC suggesting that reimplementation of this feature
> would require big changes to the code so I understand if nobody wants to
> do it. I'm considering fixing this myself, but don't really want to
> spend that much time on this either. However I would at least appreciate
> a negative answer.
>
> Best regards,
> Ondrej Lichtner
>
> On Mon, Nov 18, 2013 at 12:59:46PM +0100, Ondrej Lichtner wrote:
> > Hi everyone,
> >
> > in our project we've recently started using a RelaxNG schema to validate
> > our XML documents through the lxml python bindings of libxml2. However
> > sometimes the errors reported for invalid documents are very unhelpful
> > and even we as developers get confused and have to spend a few minutes
> > looking for what's actually wrong. To demonstrate I simplified our
> > schema and an invalid xml document with a simple python script that I've
> > appended to this email. The script is not needed, running
> >     xmllint --relaxng schema.rng test.xml
> > will produce the same results.
> >
> > The error that libxml reports is:
> > test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces
> > has extra content: eth
> >
> > which is incorrect since the actual error is that the eth element is
> > missing a mandatory attribute.
> >
> > What's also interesting is that if you completely remove the definition
> > and use of the "define" element in the schema (the test.xml doesn't use
> > it so it can stay the same).
> > The error stack changes to:
> > test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_ATTRVALID: Element eth failed to
> > validate attributes
> > test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces
> > has extra content: eth
> >
> > Which is a reasonable error message, even though it would be a bit more
> > user friendly if there was some kind of information about which
> > attributes failed or are missing, but I can understand that...
> >
> > There are a few more scenarios where similar problems occur, I can
> > describe them if needed, but to keep this email shorter I will ignore
> > them for now. I've also found a few bug reports that describe similar
> > situations, but since they were last updated several years ago I
> > first wanted to write here before reviving them.
> >
> > So I've done some digging around and figured out that all of these
> > imprecise error reports are related to <interleave>, <optional> and
> > <choice>, so rules that can easily cause non-determinism. If the
> > non-determinism is handled with some kind of backtracking these kinds of
> > problems could arise. The other way is to create a finite automaton that
> > can always be determinized, solving this problem. I looked through the
> > libxml sources and found that in fact a finite automaton is created,
> > however I didn't find anything related to its determinization so I'm
> > assuming there isn't anything. I apologize if I've missed something but
> > it's a fairly long source file...
> >
> > I want to ask if this is a bug you would find worth fixing or if the
> > current behaviour is intended (since the bugs in the bug tracker are 5+
> > years old).
> > If not I might consider fixing this myself but I would like at least
> > some comments about if the implementation of the determinization would
> > be possible to integrate with how the validation is currently handled.
> >
> > Thanks for your reply!
> >
> > Best regards,
> > Ondrej Lichtner
> >
> > --------------------------------------
> > schema.rng:
> > --------------------------------------
> > <grammar xmlns="http://relaxng.org/ns/structure/1.0"
> >          datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
> >     <start>
> >         <element name="host">
> >             <attribute name="id"/>
> >
> >             <interleave>
> >                 <zeroOrMore>
> >                     <ref name="params"/>
> >                 </zeroOrMore>
> >
> >                 <element name="interfaces">
> >                     <zeroOrMore>
> >                         <ref name="eth"/>
> >                     </zeroOrMore>
> >                 </element>
> >             </interleave>
> >         </element>
> >     </start>
> >
> >     <define name="define">
> >         <element name="define">
> >             <oneOrMore>
> >                 <element name="alias">
> >                     <attribute name="name"/>
> >                     <choice>
> >                         <attribute name="value"/>
> >                         <text/>
> >                     </choice>
> >                 </element>
> >             </oneOrMore>
> >         </element>
> >     </define>
> >
> >     <define name="eth">
> >         <element name="eth">
> >             <attribute name="id"/>
> >             <attribute name="label"/>
> >             <interleave>
> >                 <optional>
> >                     <ref name="define"/>
> >                 </optional>
> >
> >                 <zeroOrMore>
> >                     <ref name="params"/>
> >                 </zeroOrMore>
> >
> >                 <optional>
> >                     <ref name="addresses"/>
> >                 </optional>
> >             </interleave>
> >         </element>
> >     </define>
> >
> >     <define name="addresses">
> >         <element name="addresses">
> >             <interleave>
> >                 <optional>
> >                     <ref name="define"/>
> >                 </optional>
> >
> >                 <zeroOrMore>
> >                     <element name="address">
> >                         <choice>
> >                             <attribute name="value"/>
> >                             <text/>
> >                         </choice>
> >                     </element>
> >                 </zeroOrMore>
> >             </interleave>
> >         </element>
> >     </define>
> >
> >     <define name="params">
> >         <element name="params">
> >             <interleave>
> >                 <optional>
> >                     <ref name="define"/>
> >                 </optional>
> >
> >                 <zeroOrMore>
> >                     <element name="param">
> >                         <attribute name="name"/>
> >                         <choice>
> >                             <attribute name="value"/>
> >                             <text/>
> >                         </choice>
> >                     </element>
> >                 </zeroOrMore>
> >             </interleave>
> >         </element>
> >     </define>
> > </grammar>
> >
> > --------------------------------------
> > test.xml:
> > --------------------------------------
> > <host id="slave1">
> >     <interfaces>
> >         <eth label="A">
> >             <addresses>
> >                 <address value="192.168.100.1/24"/>
> >             </addresses>
> >         </eth>
> >     </interfaces>
> > </host>
> > --------------------------------------
> > test.py:
> > --------------------------------------
> > #!/usr/bin/python
> > from lxml import etree
> > from pprint import pprint
> >
> > relaxng_doc = etree.parse("schema.rng")
> > schema = etree.RelaxNG(relaxng_doc)
> >
> > doc = etree.parse("test.xml")
> > schema.validate(doc)
> > pprint(schema.error_log)
> _______________________________________________
> xml mailing list, project page http://xmlsoft.org/
> xml@gnome.org
> https://mail.gnome.org/mailman/listinfo/xml

-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/
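
For readers unfamiliar with the determinization mentioned in the quoted message, here is a minimal sketch of the textbook subset construction in plain Python. It is not libxml2's code and says nothing about how the RelaxNG validator is actually structured. The function names (determinize, accepts) and the toy automaton are hypothetical, and the sketch assumes an automaton without epsilon transitions. The toy NFA has two branches that both start with "eth", which is the kind of ambiguity a backtracking matcher has to guess at.

```python
from collections import deque


def determinize(nfa, start, accepting):
    """Subset construction.

    nfa maps (state, symbol) -> set of next states (no epsilon moves assumed).
    Returns (dfa, start_set, dfa_accepting) where DFA states are frozensets.
    """
    symbols = {sym for (_, sym) in nfa}
    start_set = frozenset([start])
    dfa = {}                      # (frozenset, symbol) -> frozenset
    seen = {start_set}
    queue = deque([start_set])
    while queue:
        current = queue.popleft()
        for sym in symbols:
            target = frozenset(
                nxt for state in current for nxt in nfa.get((state, sym), ())
            )
            if not target:
                continue          # no transition: the DFA rejects here
            dfa[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    dfa_accepting = {s for s in seen if s & accepting}
    return dfa, start_set, dfa_accepting


def accepts(dfa, start_set, dfa_accepting, word):
    """Run the DFA; return (accepted, index where matching stopped)."""
    current = start_set
    for position, sym in enumerate(word):
        current = dfa.get((current, sym))
        if current is None:
            return False, position    # single, well-defined failure point
    return current in dfa_accepting, len(word)


# Hypothetical toy NFA: state 0 can go to 1 or 2 on "eth" (non-deterministic).
toy_nfa = {
    (0, "eth"): {1, 2},
    (1, "addresses"): {3},
    (2, "params"): {3},
}
dfa, s0, acc = determinize(toy_nfa, start=0, accepting={3})

print(accepts(dfa, s0, acc, ["eth", "params"]))
print(accepts(dfa, s0, acc, ["eth", "eth"]))
```

Because the deterministic automaton follows both "eth" branches at once as the single state {1, 2}, a mismatch is detected at one position only, with no branch to back out of. Whether this maps cleanly onto interleave and attribute validation in libxml2 is exactly the open question in the thread.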