On Jun1, 2011, at 03:17 , Florian Pflug wrote:
> My nagging suspicion is that libxml reports errors like there via some 
> callback function, and only returns a non-zero result if there are structural 
> errors in the XML. But my experience with libxml is pretty limited, so maybe 
> someone with more experience in this area can shed some light on this...

As it turns out, this is actually the case.

libxml reports some errors (like invalid xmlns attributes) via the error 
handler set using xmlSetGenericErrorFunc() but still returns zero (indicating 
success) from xmlCtxtReadDoc() and xmlParseBalancedChunkMemory().

If I modify xml_parse() to complain not only if one of these functions return 
non-zero, but also if xml_err_buf has non-zero length, invalid xmlns attributes 
are reported correctly.

However, the error function set using xmlSetGenericErrorFunc() cannot 
distinguish between error and warnings, so doing this causes XMLPARSE() to also 
complain about things like non-absolute namespace URIs (which are allowed but 
deprecated as far as I understand).

To fix that, xmlSetGenericErrorFunc() would probably have to be replace by 
xmlSetStructuredErrorFunc(). Structured error functions receive a pointer to an 
xmlError structore which, amongst other things, contains an xmlErrorLevel 
(NONE, WARNING, ERROR, FATAL).

While digging through the code in src/backend/utils/adt/xml.c, I also noticed 
that we set a global error handler instead of a per-context one. I guess this 
is because xmlParseBalancedChunkMemory(), which we use to parse XML fragments, 
doesn't provide a way to pass in a context but rather creates it itself. Still, 
I wonder if there isn't some other API which we could use which does allow us 
to specify a context. Again, it'd be nice if someone more familiar with this 
code could explain the reasons behind the current design.

Anyway, I'll try to come up with a patch that replaces xmlSetGenericErrorFunc() 
with xmlSetStructuredErrorFunc().

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to