Hi, I'm trying to use libxml to extract NOTATION declarations from an internal 
DTD, but am struggling.  I can't seem to find a way to get the notations in 
document order, interleaved with the other DTD declarations.

Some background - my app (which contains some XML parsing/formatting 
functionality) is actually written in Objective-C, so I was originally using 
NSXMLDocument (DOM-based) but for some reason the notations property on 
NSXMLDTD is always nil (Apple suggests this is a libxml bug, but I am not yet 
convinced). Their suggestion was to use NSXMLParser (SAX-based) - which does 
actually return the notations, but the problem is that it doesn't fire an event 
indicating that parsing has entered the DOCTYPE, so if I have the following 
XML, I don't know whether comment2 is inside the DOCTYPE or outside.

        <?xml version="1.0" standalone="yes" ?>
        <!-- comment1 -->
        <!DOCTYPE xxx SYSTEM "XXX" [
                <!-- comment2 -->
                <!ENTITY blah SYSTEM "BLAH" NDATA note>
                <!NOTATION note PUBLIC "my notation">
        ]>
        <xxx>some text</xxx>
        
So, my next step is to fall back to libxml itself.  Exploring xmllint, I can 
see that the --format option does indeed find and print the notations (which 
is good), but I've noticed that it doesn't preserve the original order of the 
declarations in the DTD. Its output for the above XML is (note that the 
NOTATION has been moved ahead of the comment/entity):

        <?xml version="1.0" standalone="yes"?>
        <!-- comment1 -->
        <!DOCTYPE xxx SYSTEM "XXX" [
        <!NOTATION note PUBLIC "my notation" >
        <!-- comment2 --><!ENTITY blah SYSTEM "BLAH" NDATA note>
        ]>
        <xxx>some text</xxx>

More digging reveals the xmlDumpNotationTable() function - which looks like it 
ultimately calls an opaque hash table scanner wherein I pass in a function 
pointer.  OK, maybe that is what I need to do to iterate over the notations?  
Some more wandering through the code leads me to xmlDtdDumpOutput() - which says 
"Dump the notations first as they are not in the DTD children list".

That seems odd. Why aren't the notations treated as children?

Anyway, I've tried using xmlCtxtReadFile() and traversing the resulting 
xmlDocPtr/xmlDtdPtr objects, but can't find a way to get to the notations.  
I've also tried xmlNewTextReaderFilename() but it only seems to traverse the 
XML elements, not the internal DTD.

Is there something I've missed?  If notations aren't added as children, I'm not 
sure how to get back a correctly sequenced set of elements (including 
notations).  Do I really need to drop all the way back to implementing my own 
SAX event handler in order to preserve the list of notations?  Or have I 
totally missed something obvious?

Any advice would be much appreciated (sorry for the long-winded post, but I 
wanted to cover off what I've already tried).

Cheers,
Craig
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
