Re: [xml] ParseChunk + SchemaSAXPlug
I forgot to add some parts of the code from the constructor and the methods. Maybe you can find a mistake there.

Regards,
Christoph Tremmel

On 18.09.2013 11:29, Tremmel wrote:
> Hey,
> I am trying to use the chunk parsing together with the schema validation.
> The parser works when I switch off the validation. The validation works
> when I use a normal parser. When I use both together I get an access
> violation at xmlParseChunk. Am I missing something or is this a bug?
>
> Regards,
> Christoph Tremmel

ActorParserCore::ActorParserCore () (...)
{
    //SAX Handler
    xmlSAXHandler sax;
    memset(&sax,0,sizeof(xmlSAXHandler));
    sax.initialized = XML_SAX2_MAGIC;
    sax.startElementNs = ActorParserCore::startElementNsSAX2Func;
    sax.endElementNs = ActorParserCore::endElementNsSAX2Func;
    sax.characters = ActorParserCore::charactersSAXFunc;
    sax.ignorableWhitespace = ActorParserCore::ignorableWhitespaceSAXFunc;

    //junkparser
    junkparser = xmlCreatePushParserCtxt(&sax, user_data,NULL,0,NULL);
    if ( junkparser == NULL ) {
        return;
    }
    (...)
}

bool ActorParserCore::setSchema(const char *schemafilepath)
{
    if ( attachedschema != NULL) {
        xmlSchemaSAXUnplug(attachedschema);
        attachedschema = NULL;
    }
    if ( ctxt2 != NULL) {
        xmlSchemaFreeValidCtxt(ctxt2);
        ctxt2 = NULL;
    }
    if ( resource != NULL ) {
        xmlSchemaFree(resource);
        resource = NULL;
    }
    if ( schema != NULL ) {
        xmlSchemaFreeParserCtxt(schema);
        schema = NULL;
    }

    schema = xmlSchemaNewParserCtxt (schemafilepath);
    if ( schema == NULL ) {
        return false;
    }
    failed = false;
    xmlSchemaSetParserStructuredErrors(schema,ActorParserCore::handleParserErrors,this);
    xmlSchemaSetParserErrors(schema,ActorParserCore::validationError,ActorParserCore::validationWarning,this);

    resource = xmlSchemaParse (schema);
    if ( resource == NULL ) {
        xmlSchemaFreeParserCtxt(schema);
        schema = NULL;
        return false;
    }
    if ( failed ) {
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        resource = NULL;
        schema = NULL;
        return false;
    }

#if 0
    FILE * check = fopen("..\Parser\XSD\Test.xsd","w");
    if ( check != NULL ) {
        xmlSchemaDump(check,resource);
        fclose(check);
    }
#endif

    ctxt2 = xmlSchemaNewValidCtxt(resource);
    if ( ctxt2 == NULL) {
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        resource = NULL;
        schema = NULL;
        return false;
    }

    attachedschema = xmlSchemaSAXPlug (ctxt2, &(junkparser->sax), ((void **)&user_data));
    if ( attachedschema == NULL) {
        xmlSchemaFreeValidCtxt(ctxt2);
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        ctxt2 = NULL;
        resource = NULL;
        schema = NULL;
        return false;
    }
    xmlSchemaSetValidStructuredErrors(ctxt2,ActorParserCore::handleParserErrors,this);
    xmlSchemaSetValidErrors(ctxt2,ActorParserCore::validationError,ActorParserCore::validationWarning,this);
    return true;
}

ssize_t ActorParserCore::parse (char *message, size_t messagelength, size_t count,std::string filename)
{
    (...)
    if ( remainingjunks == 0 ) {
        if ( current_message != NULL ) {
            current_message->unref();
            current_message = NULL;
            newmessage.pop();
        }
        if ( count == 0 ) {
            return 1;
        }
        failed = false;
        success = 0;
        current_message = new ActorParserImpl::Message(*this);
        if ( current_message == NULL ) {
            return 0;
        }
        newmessage.push(current_message);
        if ( newmessage.empty() ) {
            current_message->unref();
            current_message = NULL;
            return 0;
        }
        remainingjunks = count;
        xmlError * lasterr = xmlCtxtGetLastError(junkparser);
        if ( lasterr != NULL ) {
            xmlCtxtResetLastError(junkparser);
            xmlCtxtResetLastError(this);
        }
        lasterr = xmlCtxtGetLastError(this);
        if ( lasterr != NULL ) {
            xmlResetLastError();
        }
        if ( parsererror != NULL ) {
            parsererror->unref();
            parsererror = NULL;
        }
        xmlCtxtResetPush(junkparser, NULL, 0, filename.c_str(),NULL );
    }
    if ( -- remainingjunks == 0 ) {
        if (( xmlParseChunk(junkparser, message,(int) messagelength, 1 ) != XML_ERR_OK ) || failed ) {
            if ( current_message != NULL ) {
[xml] libxml2 SAX python interface: Bug with Python3
Dear all,

when using the libxml2 interface via xml.sax in Python3, I get an exception concerning str vs bytes. Please see the following minimal example:

Python 3.2.5 (default, Aug 26 2013, 21:33:16)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.sax
>>> from io import BytesIO
>>> saxparser = xml.sax.make_parser(["drv_libxml2"])
>>> source = xml.sax.xmlreader.InputSource()
>>> source.setByteStream(BytesIO(b''))
>>> saxparser.parse(source)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.2/site-packages/drv_libxml2.py", line 223, in parse
    eltName = _d(reader.Name())
  File "/usr/lib64/python3.2/site-packages/drv_libxml2.py", line 70, in _d
    return _decoder(s)[0]
  File "/usr/lib64/python3.2/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
TypeError: 'str' does not support the buffer interface

A quick look reveals that all(?) the reader.…()-calls already return an encoded string and hence this _d() is moot. But I don't know which side needs to be fixed.

Regards,
René

P.S.: Please put me on CC on replies, I'm not subscribed.
___
xml mailing list, project page http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
[xml] ParseChunk + SchemaSAXPlug
Hey,

I am trying to use the chunk parsing together with the schema validation. The parser works when I switch off the validation. The validation works when I use a normal parser. When I use both together I get an access violation at xmlParseChunk. Am I missing something or is this a bug?

Regards,
Christoph Tremmel
[xml] Patch to ignore xml charset
Hi.

I'd like to be able to ignore the XML charset. I suggest this patch: http://pastebin.com/BYxun3JY

What do you think about that?
[xml] incorrect RelaxNG error reporting
Hi everyone,

in our project we've recently started using a RelaxNG schema to validate our XML documents through the lxml python bindings of libxml2. However, sometimes the errors reported for invalid documents are very unhelpful, and even we as developers get confused and have to spend a few minutes looking for what's actually wrong.

To demonstrate, I simplified our schema and an invalid xml document, with a simple python script, all of which I've appended to this email. The script is not needed; running xmllint --relaxng schema.rng test.xml will produce the same results.

The error that libxml reports is:

test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces has extra content: eth

which is incorrect, since the actual error is that the eth element is missing a mandatory attribute.

What's also interesting is that if you completely remove the definition and use of the "define" element in the schema (the test.xml doesn't use it so it can stay the same), the error stack changes to:

test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_ATTRVALID: Element eth failed to validate attributes
test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces has extra content: eth

This is a reasonable error message, even though it would be a bit more user friendly if there was some kind of information about which attributes failed or are missing, but I can understand that...

There are a few more scenarios where similar problems occur. I can describe them if needed, but to keep this email shorter I will ignore them for now. I've also found a few bug reports that describe similar situations, but since they were last updated several years ago I wanted to write here first before reviving them.

So I've done some digging around and figured out that all of these imprecise error reports are related to and so rules that can easily cause non-determinism. If the non-determinism is handled with some kind of backtracking, this kind of problem could arise. The other way is to create a finite automaton, which can always be determinized, solving this problem. I looked through the libxml sources and found that a finite automaton is in fact created; however, I didn't find anything related to its determinization, so I'm assuming there isn't any. I apologize if I've missed something, but it's a fairly long source file...

I want to ask if this is a bug you would find worth fixing or if the current behaviour is intended (since the bugs in the bug tracker are 5+ years old). If not, I might consider fixing this myself, but I would like at least some comments on whether an implementation of the determinization could be integrated with how the validation is currently handled.

Thanks for your reply!

Best regards,
Ondrej Lichtner

-- schema.rng: --
http://relaxng.org/ns/structure/1.0"; datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes";>

-- test.xml: --

-- test.py: --
#!/usr/bin/python
from lxml import etree
from pprint import pprint

relaxng_doc = etree.parse("schema.rng")
schema = etree.RelaxNG(relaxng_doc)
doc = etree.parse("test.xml")
schema.validate(doc)
pprint(schema.error_log)
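For the determinization idea raised in the RelaxNG report above, here is a minimal sketch of the classic subset construction. It only illustrates the general technique on a toy automaton; it is not how libxml2 compiles or validates RelaxNG content models, and the state numbers and the symbols "eth", "id" and "name" are made up for the example.

```python
from collections import deque


def determinize(nfa, start, accepting):
    """Subset construction for an NFA without epsilon transitions.

    nfa maps (state, symbol) -> set of next states; accepting is a set.
    Returns (dfa, dfa_start, dfa_accepting); DFA states are frozensets
    of NFA states.
    """
    symbols = {sym for (_, sym) in nfa}
    dfa_start = frozenset([start])
    dfa, dfa_accepting = {}, set()
    seen, queue = {dfa_start}, deque([dfa_start])
    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for sym in symbols:
            # Union of everything reachable from any member state on sym.
            target = frozenset(
                t for s in current for t in nfa.get((s, sym), ())
            )
            if not target:
                continue
            dfa[(current, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, dfa_start, dfa_accepting


def accepts(dfa, dfa_start, dfa_accepting, word):
    """Run the DFA over a sequence of symbols, with no backtracking."""
    state = dfa_start
    for sym in word:
        state = dfa.get((state, sym))
        if state is None:
            return False
    return state in dfa_accepting


# Toy NFA: two alternatives that both begin with the symbol "eth".
nfa = {
    (0, "eth"): {1, 2},
    (1, "id"): {3},
    (2, "name"): {3},
}
dfa, dfa_start, dfa_accepting = determinize(nfa, 0, {3})
```

Because both alternatives are followed in parallel inside one DFA state, a mismatch is detected at the exact symbol where no alternative can continue, which is the property that would make more precise error reports possible.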
[xml] Problem with xmlAddChild
Hello,

I'm currently hacking on the lightspark project (a free Adobe Flash implementation), which uses libxml2 and libxml++. In ActionScript it is possible to add nodes in reverse order, e.g.:

var xml1:XML=new XML("");
var xml2:XML=new XML("");
var xml3:XML=new XML("");
xml1.appendChild(xml2);
xml2.appendChild(xml3);

Now the problem is that xmlAddChild(xmlNodePtr parent, xmlNodePtr cur) merges adjacent text nodes and frees cur (making it impossible to add something to cur later). It seems to work at first, but the memory gets corrupted (surprise!). With libxml++'s import_node there is no corruption, but the xml3 element doesn't get added at all.

Is there a possible workaround where the node structure doesn't get touched by appendChild, or do I have to re-implement that method (which would probably require accessing the underlying structures directly and hence be rather ugly)?

Thanks in advance for your help.

~Fabian
[xml] Cannot access parsed values returned by xmlSchemaValidatePredefinedType()
Hello all,

We are using libxml2 (version 2.9.1) to validate XML files according to an XML schema. The XML schema makes use of XML schema data types such as xs:float, and hence we need to parse strings that adhere to the XML schema data type specification.

I noticed that xmlschemastypes.c already implements this validation and makes it accessible via xmlSchemaValidatePredefinedType(). While this function returns an xmlSchemaValPtr pointer to the parsed data, it seems that the definition of the _xmlSchemaVal structure is hidden in the C file and cannot be accessed by clients of the library. Is this interface intended to be that way?

I propose to move the definition of _xmlSchemaVal and all of its dependent types (xmlSchemaValDecimal etc.) into the public header file schemasInternals.h.

Best regards,
Matthias
[xml] logging a bug
Hi,

the bug tracking system of libxml2 only lists versions up to 2.7.8, but I found a bug in 2.9.1. Where should I log it, and how?

The bug is that when xmlReadIO is the first call you make on the library, it crashes because xmlInitParser hasn't been called yet. If I interpret the documentation correctly, xmlInitParser isn't required when you don't process in multiple threads.

Adding xmlInitParser(); at the beginning of xmlReadIO in parser.c seems to fix the issue (I found other functions using a similar construction).

Regards,
Bram
Re: [xml] Posting a patch to this list
On Tue, Nov 19, 2013 at 12:42:43PM +0100, Bjoern Hoehrmann wrote:
> * Patrick Monnerat wrote:
> >As a first-time mailer to this list, I tried yesterday to post a big
> >(~250k) gzipped patch to introduce support for the OS/400 platform.
> >
> >I did create an account and I see this e-mail has been accepted by the
> >list server, but it has never been sent back nor appears in the
> >archives.
>
> That usually means it is awaiting moderation by a human moderator with
> limited resources.

indeed, just approved it as well as a few other relevant messages, some pending in the queue since September

> Large attachments should never be sent to mailing
> lists intended for discussions, in this case a better place would be to
> attach it to a bug in Bugzilla,
>
> http://bugzilla.gnome.org/buglist.cgi?product=libxml2
>
> And then posting a link to the mailing list with whatever information
> useful for discussion here.

the good point of bugzilla is that it automatically also sends a mail to my gmail address, raising the signal that there is something to watch :-)

Patrick, i will look, as promised, mail marked as unread so i won't be able to 'forget' it !

Daniel

--
Daniel Veillard | Open Source and Standards, Red Hat
veill...@redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | virtualization library http://libvirt.org/
Re: [xml] Posting a patch to this list
Daniel Veillard wrote:
> Patrick, i will look, as promised, mail marked as unread so i won't be able to 'forget' it !

Thanks Daniel: I did not know you were this list's moderator. There's no hurry for me: I just wanted to be sure the e-mail arrived and didn't get lost :-)

Please prefer the patch at https://bugzilla.gnome.org/show_bug.cgi?id=712670. I'll eventually upload updated patches there.

Thanks again.

Cheers,
Patrick