Re: [xml] ParseChunk + SchemaSAXPlug

2013-11-21 Thread Tremmel
I forgot to add some parts of the code from the constructor and the 
methods. Maybe you can spot a mistake there.


Regards, Christoph Tremmel




On 18.09.2013 at 11:29, Tremmel wrote:

Hey,

I am trying to use the chunk parsing together with the schema 
validation. The parser works when I switch off the validation. The 
validation works when I use a normal parser.


When I use both together, I get an access violation at xmlParseChunk. 
Am I missing something or is this a bug?


Regards, Christoph Tremmel


ActorParserCore::ActorParserCore ()
(...)
{
    //SAX Handler
    xmlSAXHandler sax;
    memset(&sax,0,sizeof(xmlSAXHandler));
    sax.initialized = XML_SAX2_MAGIC;
    sax.startElementNs = ActorParserCore::startElementNsSAX2Func;
    sax.endElementNs = ActorParserCore::endElementNsSAX2Func;
    sax.characters = ActorParserCore::charactersSAXFunc;
    sax.ignorableWhitespace = ActorParserCore::ignorableWhitespaceSAXFunc;

    //junkparser
    junkparser = xmlCreatePushParserCtxt(&sax, user_data,NULL,0,NULL);
    if ( junkparser == NULL ) {
        return;
    }
    (...)
}

bool ActorParserCore::setSchema(const char *schemafilepath) {
    if ( attachedschema != NULL) {
        xmlSchemaSAXUnplug(attachedschema);
        attachedschema = NULL;
    }
    if ( ctxt2 != NULL) {
        xmlSchemaFreeValidCtxt(ctxt2);
        ctxt2 = NULL;
    }
    if ( resource != NULL ) {
        xmlSchemaFree(resource);
        resource = NULL;
    }
    if ( schema != NULL ) {
        xmlSchemaFreeParserCtxt(schema);
        schema = NULL;
    }

    schema = xmlSchemaNewParserCtxt (schemafilepath);
    if ( schema == NULL ) {
        return false;
    }
    failed = false;

    xmlSchemaSetParserStructuredErrors(schema,ActorParserCore::handleParserErrors,this);
    xmlSchemaSetParserErrors(schema,ActorParserCore::validationError,ActorParserCore::validationWarning,this);
    resource = xmlSchemaParse (schema);
    if ( resource == NULL ) {
        xmlSchemaFreeParserCtxt(schema);
        schema = NULL;
        return false;
    }
    if ( failed ) {
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        resource = NULL;
        schema = NULL;
        return false;
    }

#if 0
    FILE * check = fopen("..\\Parser\\XSD\\Test.xsd","w");

    if ( check != NULL ) {
        xmlSchemaDump(check,resource);
        fclose(check);
    }
#endif
    ctxt2 = xmlSchemaNewValidCtxt(resource);
    if ( ctxt2 == NULL) {
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        resource = NULL;
        schema = NULL;
        return false;
    }

    attachedschema = xmlSchemaSAXPlug (ctxt2, &(junkparser->sax), ((void **)&user_data));
    if ( attachedschema == NULL) {
        xmlSchemaFreeValidCtxt(ctxt2);
        xmlSchemaFree(resource);
        xmlSchemaFreeParserCtxt(schema);
        ctxt2 = NULL;
        resource = NULL;
        schema = NULL;
        return false;
    }

    xmlSchemaSetValidStructuredErrors(ctxt2,ActorParserCore::handleParserErrors,this);
    xmlSchemaSetValidErrors(ctxt2,ActorParserCore::validationError,ActorParserCore::validationWarning,this);

    return true;
}

ssize_t ActorParserCore::parse (char *message, size_t messagelength, size_t count,std::string filename) {

    (...)

    if ( remainingjunks == 0 ) {
        if ( current_message != NULL ) {
            current_message->unref();
            current_message = NULL;
            newmessage.pop();
        }
        if ( count == 0 ) {
            return 1;
        }
        failed = false;
        success = 0;

        current_message = new ActorParserImpl::Message(*this);
        if ( current_message == NULL ) {
            return 0;
        }
        newmessage.push(current_message);
        if ( newmessage.empty() ) {
            current_message->unref();
            current_message = NULL;
            return 0;
        }
        remainingjunks = count;

        xmlError * lasterr = xmlCtxtGetLastError(junkparser);
        if ( lasterr != NULL ) {
            xmlCtxtResetLastError(junkparser);
            xmlCtxtResetLastError(this);
        }
        lasterr = xmlCtxtGetLastError(this);
        if ( lasterr != NULL ) {
            xmlResetLastError();
        }
        if ( parsererror != NULL ) {
            parsererror->unref();
            parsererror = NULL;
        }
        xmlCtxtResetPush(junkparser, NULL, 0, filename.c_str(),NULL );
    }

    if ( -- remainingjunks == 0 ) {
        if (( xmlParseChunk(junkparser, message,(int) messagelength, 1 ) != XML_ERR_OK )
            || failed ) {
            if ( current_message != NULL ) {

[xml] libxml2 SAX python interface: Bug with Python3

2013-11-21 Thread René Neumann
Dear all,

when using the libxml2 interface via xml.sax in Python3, I get an
exception concerning str vs bytes. Please see the following minimal example:

Python 3.2.5 (default, Aug 26 2013, 21:33:16)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.sax
>>> from io import BytesIO
>>> saxparser = xml.sax.make_parser(["drv_libxml2"])
>>> source = xml.sax.xmlreader.InputSource()
>>> source.setByteStream(BytesIO(b''))
>>> saxparser.parse(source)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib64/python3.2/site-packages/drv_libxml2.py", line 223, in parse
    eltName = _d(reader.Name())
  File "/usr/lib64/python3.2/site-packages/drv_libxml2.py", line 70, in _d
    return _decoder(s)[0]
  File "/usr/lib64/python3.2/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
TypeError: 'str' does not support the buffer interface


A quick look reveals that all(?) of the reader.…() calls already return a
decoded string (str rather than bytes), and hence this _d() is moot. But I
don't know which side needs to be fixed.
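
For illustration, one way a helper like _d() could tolerate both cases is to decode only when it actually receives bytes. This is a minimal sketch and not the code from drv_libxml2.py: the name to_text is hypothetical, and UTF-8 is assumed only because the traceback goes through the utf_8 codec.

```python
import codecs

# Assumption: a UTF-8 decoder, matching the utf_8 codec seen in the traceback.
_decoder = codecs.lookup("utf-8").decode


def to_text(value):
    """Return value as str, decoding only when it is still bytes.

    Hypothetical helper for illustration; not part of drv_libxml2.
    """
    if value is None:
        return None
    if isinstance(value, bytes):
        # Raw bytes: decode them; the codec returns (text, bytes_consumed).
        return _decoder(value)[0]
    # Already a str (the Python 3 case from the traceback): pass it through.
    return value


if __name__ == "__main__":
    print(to_text(b"caf\xc3\xa9"))
    print(to_text("caf\u00e9"))
```

Whether the driver or the binding should change is still the open question; the sketch only shows that a type check avoids the TypeError on either side.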

Regards,
René

P.S.: Please put me on CC on replies, I'm not subscribed.
___
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml


[xml] ParseChunk + SchemaSAXPlug

2013-11-21 Thread Tremmel

Hey,

I am trying to use the chunk parsing together with the schema 
validation. The parser works when I switch off the validation. The 
validation works when I use a normal parser.


When I use both together, I get an access violation at xmlParseChunk. Am 
I missing something or is this a bug?


Regards, Christoph Tremmel


[xml] Patch to ignore xml charset

2013-11-21 Thread Filipp Bakanov
Hi. I'd like to have the ability to ignore the XML charset. I suggest this
patch: http://pastebin.com/BYxun3JY What do you think about it?


[xml] incorrect RelaxNG error reporting

2013-11-21 Thread Ondrej Lichtner
Hi everyone,

in our project we've recently started using a RelaxNG schema to validate
our XML documents through the lxml python bindings of libxml2. However
sometimes the errors reported for invalid documents are very unhelpful
and even we as developers get confused and have to spend a few minutes
looking for what's actually wrong. To demonstrate I simplified our
schema and an invalid xml document with a simple python script that I've
appended to this email. The script is not needed; running
xmllint --relaxng schema.rng test.xml
will produce the same results.

The error that libxml reports is:
test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces has extra content: eth

which is incorrect since the actual error is that the eth element is
missing a mandatory attribute.

What's also interesting is that if you completely remove the definition
and use of the "define" element in the schema (test.xml doesn't use
it, so it can stay the same), the error stack changes to:
test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_ATTRVALID: Element eth failed to validate attributes
test.xml:3:0:ERROR:RELAXNGV:RELAXNG_ERR_EXTRACONTENT: Element interfaces has extra content: eth

Which is a reasonable error message, even though it would be a bit more
user friendly if there was some kind of information about which
attributes failed or are missing, but I can understand that...

There are a few more scenarios where similar problems occur, I can
describe them if needed, but to keep this email shorter I will ignore
them for now. I've also found a few bug reports that describe similar
situations, but since they were last updated several years ago, I
wanted to write here first before reviving them.

So I've done some digging around and figured out that all of these
imprecise error reports are related to   and
 so rules that can easily cause non-determinism. If the
non-determinism is handled with some kind of backtracking, these kinds of
problems could arise. The other way is to create a finite automaton, which
can always be determinized, solving this problem. I looked through the
libxml sources and found that a finite automaton is in fact created;
however, I didn't find anything related to its determinization, so I'm
assuming there isn't anything. I apologize if I've missed something, but
it's a fairly long source file...
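
To make "determinized" concrete, here is a generic subset-construction sketch. It is not libxml2's automaton code and says nothing about how the RelaxNG validator is structured; the function names, state numbers and symbols are all made up for illustration, and epsilon transitions are left out to keep it short.

```python
from collections import deque


def determinize(start, accepting, transitions):
    """Subset construction: turn an NFA (no epsilon moves) into a DFA.

    transitions maps (state, symbol) -> set of next NFA states.
    Each DFA state is a frozenset of NFA states.
    """
    alphabet = {symbol for (_, symbol) in transitions}
    dfa_start = frozenset([start])
    dfa_transitions = {}
    dfa_accepting = set()
    seen = {dfa_start}
    queue = deque([dfa_start])

    while queue:
        current = queue.popleft()
        if current & accepting:
            dfa_accepting.add(current)
        for symbol in alphabet:
            # Union of everything reachable from any NFA state in `current`.
            target = frozenset(
                nxt
                for state in current
                for nxt in transitions.get((state, symbol), ())
            )
            if not target:
                continue  # no move on this symbol: input is rejected here
            dfa_transitions[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa_start, dfa_accepting, dfa_transitions


def accepts(dfa, word):
    """Run the DFA over word; there is never more than one path to follow."""
    state, accepting, transitions = dfa
    for symbol in word:
        state = transitions.get((state, symbol))
        if state is None:
            return False
    return state in accepting


if __name__ == "__main__":
    # Hypothetical NFA: on "a", state 0 may go to 1 or 2 (the ambiguity).
    nfa = {(0, "a"): {1, 2}, (1, "b"): {3}, (2, "c"): {3}}
    dfa = determinize(0, {3}, nfa)
    print(accepts(dfa, ["a", "b"]), accepts(dfa, ["a", "c"]), accepts(dfa, ["a"]))
```

Because the resulting automaton has a single path per input, a failure can be reported at the exact point where no transition exists, instead of after a backtracking step has already discarded the more specific error.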

I want to ask if this is a bug you would find worth fixing or if the
current behaviour is intended (since the bugs in the bug tracker are 5+
years old).
If not, I might consider fixing this myself, but I would like at least
some comments on whether an implementation of the determinization could
be integrated with how the validation is currently handled.

Thanks for your reply!

Best regards,
Ondrej Lichtner

--
schema.rng:
--
[markup stripped by the list archive; only these fragments survive]
http://relaxng.org/ns/structure/1.0"
datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">
--
test.xml:
--
[markup stripped by the list archive]
--
test.py:
--
#!/usr/bin/python
from lxml import etree
from pprint import pprint

relaxng_doc = etree.parse("schema.rng")
schema = etree.RelaxNG(relaxng_doc)

doc = etree.parse("test.xml")
schema.validate(doc)
pprint(schema.error_log)


[xml] Problem with xmlAddChild

2013-11-21 Thread Fabian Ebner
Hello,

I'm currently hacking on the lightspark project (free Adobe Flash
implementation), which uses libxml2 and libxml++. In ActionScript it is
possible to add nodes in reverse order, e.g.:
var xml1:XML=new XML("");
var xml2:XML=new XML("");
var xml3:XML=new XML("");
xml1.appendChild(xml2);
xml2.appendChild(xml3);

Now the problem is that xmlAddChild(xmlNodePtr parent, xmlNodePtr cur)
merges adjacent text nodes and cur is freed (making it impossible to add
something to cur later). It seems to work at first, but the memory gets
corrupted (surprise!). With libxml++'s import_node there is no
corruption, but the xml3-element doesn't get added at all.

Is there a possible workaround where the node structure doesn't get
touched by appendChild, or do I have to re-implement that method (which
would probably require accessing the underlying structures directly and
hence be rather ugly)?

Thanks in advance for your help.

~Fabian


[xml] Cannot access parsed values returned by xmlSchemaValidatePredefinedType()

2013-11-21 Thread Matthias Lechner
Hello all,

 

We are using libxml2 (version 2.9.1) to validate XML files according to an XML 
schema. The XML schema makes use of XML schema data types such as xs:float and 
hence we need to parse strings that adhere to the XML schema data type 
specification. I noticed that xmlschemastypes.c already implements this 
validation and makes it accessible via xmlSchemaValidatePredefinedType(). While 
this function returns a xmlSchemaValPtr pointer to the parsed data, it seems 
that the definition of the _xmlSchemaVal structure is hidden in the C file and 
cannot be accessed by clients of the library. Is this interface intentional? 
I propose moving the definition of _xmlSchemaVal and all of the types it depends 
on (xmlSchemaValDecimal etc.) into the public header file schemasInternals.h.

 

Best regards,

Matthias


[xml] logging a bug

2013-11-21 Thread Tassyns, Bram



Hi, the bug tracking system for libxml2 only lists versions up to 2.7.8, but I found a bug in 2.9.1.
Where should I log it and how?


The bug is that when xmlReadIO is the first call made into the library, it crashes because xmlInitParser hasn't been called yet.
If I interpret the documentation correctly, xmlInitParser isn't required when you don't process in multiple threads.
Adding xmlInitParser(); at the beginning of xmlReadIO in parser.c seems to fix the issue (I found other functions using a similar construction).


Regards,
Bram




Re: [xml] Posting a patch to this list

2013-11-21 Thread Daniel Veillard
On Tue, Nov 19, 2013 at 12:42:43PM +0100, Bjoern Hoehrmann wrote:
> * Patrick Monnerat wrote:
> >As a first-time mailer to this list, I tried yesterday to post a big
> >(~250k) gzipped patch to introduce support for the OS/400 platform.
> > 
> >I did create an account and I see this e-mail has been accepted by the
> >list server, but it has never been sent back nor appears in the
> >archives.
> 
> That usually means it is awaiting moderation by a human moderator with
> limited resources.

 indeed, just approved it as well as a few other relevant messages, some
 pending in the queue since September 

> Large attachments should never be sent to mailing
> lists intended for discussions, in this case a better place would be to
> attach it to a bug in Bugzilla,
> 
>   http://bugzilla.gnome.org/buglist.cgi?product=libxml2
> 
> And then posting a link to the mailing list with whatever information
> useful for discussion here.

 the good point of bugzilla is that it automatically also sends a mail
to my gmail address, raising the signal that there is something to watch :-)

 Patrick, I will look, as promised; mail marked as unread so I won't be
able to 'forget' it!

Daniel

-- 
Daniel Veillard  | Open Source and Standards, Red Hat
veill...@redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/


Re: [xml] Posting a patch to this list

2013-11-21 Thread Patrick Monnerat
Daniel Veillard wrote:

> Patrick, i will look, as promised, mail marked as unread so i won't be
> able to 'forget' it !

Thanks Daniel: I did not know you were this list's moderator.
There's no hurry for me: I just wanted to be sure the e-mail arrived and
didn't get lost :-)

Please prefer the patch at
https://bugzilla.gnome.org/show_bug.cgi?id=712670. I'll eventually
upload updated patches there.

Thanks again.
Cheers,
Patrick