Susanne Oberhauser-Hirschoff <f...@suse.com> writes:

> Salut Daniel,
>
> What's the use cases that do thousands of xi:includes of tiny xml
> fragments, rendering the current tuning necessary?
>
> If that's real I could redo a patch with an option.
OK, below is a patch that adds an option, --fixup-all-base-uris
(XML_PARSE_ALLBASEFIX), which lets the libxml2 / xmllint user choose
whether she considers xml:base fixup within the same path 'clutter' or
a 'bug fix'.

I still believe there is a reason the XInclude test cases use the
'clutter' version, but if libxml2 is usually used in contexts where the
clutter is useless, this variant gives the other, more exotic users a
chance to get what they need, too.

Btw, in 2003 you wrote this...

https://www.sourceware.org/ml/docbook/2003-03/msg00101.html

> On Sun, Mar 09, 2003 at 02:15:55PM -0500, Elliotte Rusty Harold wrote:
> > At 2:02 PM -0500 2/12/03, Daniel Veillard wrote:
> >
> > >It's rather libxml2 now comply to the XInclude requirement of adding such
> > >an xml:base at the inclusion point (when the included resource is in
> > >a different path ...)
> >
> >
> > I'm looking at this for my XIncluder right now, and the requirement
> > seems a little stronger to me. They don't even have to be in a
> > different path. Suppose for example,
> > http://www.example.com/docs/parent.xml includes
> > http://www.example.com/docs/child.xml
> >
> > These two documents have different base URIs even though they have
> > the same "path". Thus an xml:base attribute must be added at the
> > inclusion point whenever parse="xml". The only possible exception
> > would be when both the includer and the included document has null or
> > empty base URIs, or perhaps when XPointers are involved and one
> > xinclude element is including a different part of the same document.
>
> I tried to minimize the addition of xml:base when it could be avoided
> in practice (i.e. if the absence of the xml:base would not generate
> erroneous URI-References to URI computations). This was a deployment
> trade-off that I will fix when XInclude and xml:base will get better
> acceptance.
>
> Daniel

Please let me know what you think,

thx,
S.
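For readers who want the gist before reading the diff: the decision the patch makes for an xi:include that carries no xml:base of its own can be sketched roughly as below. This is a standalone illustration, not libxml2 code. The names `sketch_needs_base_fixup`, `SKETCH_NOBASEFIX` and `SKETCH_ALLBASEFIX` are made up for this sketch (the values merely mirror the option bits in the patch), and the single `flags` argument stands in for the separate context and document flags that the real code checks.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical stand-ins for the parser option bits. They mirror the
 * values used in the patch but are NOT the libxml2 definitions.
 */
enum {
    SKETCH_NOBASEFIX  = 1 << 18,  /* never add xml:base */
    SKETCH_ALLBASEFIX = 1 << 23   /* add xml:base even for same-path includes */
};

/*
 * Decide whether an xml:base attribute would be added at the inclusion
 * point, assuming the xi:include element has no xml:base itself.
 *
 * rel_base is the URL of the included document expressed relative to
 * the including document's base, i.e. the kind of string that
 * xmlBuildRelativeURI() returns in the real code.
 *
 * Returns 1 if a fixup would happen, 0 otherwise.
 */
int sketch_needs_base_fixup(const char *rel_base, int flags)
{
    assert(rel_base != NULL);

    if (flags & SKETCH_NOBASEFIX)
        return 0;   /* NOBASEFIX overrides ALLBASEFIX */
    if (flags & SKETCH_ALLBASEFIX)
        return 1;   /* fix up every include, same path or not */

    /* Default: minimal fixup, only when the relative URI has a path part. */
    return strchr(rel_base, '/') != NULL;
}
```

With no flags set, `"child.xml"` yields 0 and `"sub/child.xml"` yields 1; with `SKETCH_ALLBASEFIX` set, `"child.xml"` yields 1 as well, which is the behaviour Elliotte asked for in the 2003 thread quoted above.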
commit 633f764813e8b552bf77e8b34c77d5642b028063
Author: Susanne Oberhauser <f...@suse.com>
Date:   Wed Apr 23 13:53:14 2014 +0000

    do xml:base fixup on all xi:include roots, even if the new base is in the same directory

diff --git a/doc/APIfiles.html b/doc/APIfiles.html
index 65e004b..4dd5d04 100644
--- a/doc/APIfiles.html
+++ b/doc/APIfiles.html
@@ -612,6 +612,7 @@ A:link, A:visited, A:active { text-decoration: underline }
 <a href="html/libxml-parser.html#XML_PARSER_START">XML_PARSER_START</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_START_TAG">XML_PARSER_START_TAG</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL">XML_PARSER_SYSTEM_LITERAL</a><br />
+<a href="html/libxml-parser.html#XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_COMPACT">XML_PARSE_COMPACT</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_DOM">XML_PARSE_DOM</a><br />
diff --git a/doc/APIsymbols.html b/doc/APIsymbols.html
index c2b82e7..f2e6d18 100644
--- a/doc/APIsymbols.html
+++ b/doc/APIsymbols.html
@@ -594,6 +594,7 @@ A:link, A:visited, A:active { text-decoration: underline }
 <a href="html/libxml-xmlreader.html#XML_PARSER_SUBST_ENTITIES">XML_PARSER_SUBST_ENTITIES</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL">XML_PARSER_SYSTEM_LITERAL</a><br />
 <a href="html/libxml-xmlreader.html#XML_PARSER_VALIDATE">XML_PARSER_VALIDATE</a><br />
+<a href="html/libxml-parser.html#XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_COMPACT">XML_PARSE_COMPACT</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_DOM">XML_PARSE_DOM</a><br />
diff --git a/doc/devhelp/libxml2-parser.html b/doc/devhelp/libxml2-parser.html
index 357c14a..6f114ab 100644
--- a/doc/devhelp/libxml2-parser.html
+++ b/doc/devhelp/libxml2-parser.html
@@ -311,6 +311,8 @@ void <a href="#xmlSetExternalEntityLoader">xmlSetExternalEntityLoader</a> (<a hr
     <a name="XML_PARSE_OLDSAX">XML_PARSE_OLDSAX</a> = 1048576 /* parse using SAX2 interface before 2.7.0 */
     <a name="XML_PARSE_IGNORE_ENC">XML_PARSE_IGNORE_ENC</a> = 2097152 /* ignore internal document encoding hint */
     <a name="XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a> = 4194304 /* Store big lines numbers in text PSVI field */
+    <a name="XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a> = 8388608 /* do xml:base fixup for _all_ XINCLUDEs */
+
 };
 </pre><p/>
 </div>
diff --git a/doc/devhelp/libxml2.devhelp b/doc/devhelp/libxml2.devhelp
index 282546a..cb85fb5 100644
--- a/doc/devhelp/libxml2.devhelp
+++ b/doc/devhelp/libxml2.devhelp
@@ -776,6 +776,7 @@
     <function name="XML_PARSER_SYSTEM_LITERAL" link="libxml2-parser.html#XML_PARSER_SYSTEM_LITERAL"/>
     <function name="XML_PARSER_VALIDATE" link="libxml2-xmlreader.html#XML_PARSER_VALIDATE"/>
     <function name="XML_PARSE_BIG_LINES" link="libxml2-parser.html#XML_PARSE_BIG_LINES"/>
+    <function name="XML_PARSE_ALLBASEFIX" link="libxml2-parser.html#XML_PARSE_ALLBASEFIX"/>
     <function name="XML_PARSE_COMPACT" link="libxml2-parser.html#XML_PARSE_COMPACT"/>
     <function name="XML_PARSE_DOM" link="libxml2-parser.html#XML_PARSE_DOM"/>
     <function name="XML_PARSE_DTDATTR" link="libxml2-parser.html#XML_PARSE_DTDATTR"/>
diff --git a/doc/html/libxml-parser.html b/doc/html/libxml-parser.html
index 98123f7..8f7ede9 100644
--- a/doc/html/libxml-parser.html
+++ b/doc/html/libxml-parser.html
@@ -290,6 +290,7 @@ void <a href="#xmlParserInputDeallocate">xmlParserInputDeallocate</a> (<a href="
     <a name="XML_PARSE_OLDSAX" id="XML_PARSE_OLDSAX">XML_PARSE_OLDSAX</a> = 1048576 : parse using SAX2 interface before 2.7.0
     <a name="XML_PARSE_IGNORE_ENC" id="XML_PARSE_IGNORE_ENC">XML_PARSE_IGNORE_ENC</a> = 2097152 : ignore internal document encoding hint
     <a name="XML_PARSE_BIG_LINES" id="XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a> = 4194304 : Store big lines numbers in text PSVI field
+    <a name="XML_PARSE_ALLBASEFIX" id="XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a> = 8388608 : fixup xml:base uris for same directory includes, too
 }
 </pre><h3><a name="xmlSAXHandlerV1" id="xmlSAXHandlerV1">Structure xmlSAXHandlerV1</a></h3><pre class="programlisting">Structure xmlSAXHandlerV1<br />struct _xmlSAXHandlerV1 {
     <a href="libxml-parser.html#internalSubsetSAXFunc">internalSubsetSAXFunc</a> internalSubset
diff --git a/doc/libxml2-api.xml b/doc/libxml2-api.xml
index 45bceb5..c8ba483 100644
--- a/doc/libxml2-api.xml
+++ b/doc/libxml2-api.xml
@@ -720,6 +720,7 @@
      <exports symbol='XML_WITH_OUTPUT' type='enum'/>
      <exports symbol='XML_PARSE_XINCLUDE' type='enum'/>
      <exports symbol='XML_PARSE_NOCDATA' type='enum'/>
+     <exports symbol='XML_PARSE_ALLBASEFIX' type='enum'/>
      <exports symbol='XML_PARSE_NOBASEFIX' type='enum'/>
      <exports symbol='XML_PARSE_BIG_LINES' type='enum'/>
      <exports symbol='XML_WITH_XINCLUDE' type='enum'/>
@@ -5137,6 +5138,7 @@ crash if you try to modify the tree)'/>
     <enum name='XML_PARSE_DTDVALID' file='parser' value='16' type='xmlParserOption' info='validate with the DTD'/>
     <enum name='XML_PARSE_HUGE' file='parser' value='524288' type='xmlParserOption' info='relax any hardcoded limit from the parser'/>
     <enum name='XML_PARSE_IGNORE_ENC' file='parser' value='2097152' type='xmlParserOption' info='ignore internal document encoding hint'/>
+    <enum name='XML_PARSE_ALLBASEFIX' file='parser' value='8388608' type='xmlParserOption' info='do xml:base fixup for _all_ XINCLUDEs'/>
     <enum name='XML_PARSE_NOBASEFIX' file='parser' value='262144' type='xmlParserOption' info='do not fixup XINCLUDE xml:base uris'/>
     <enum name='XML_PARSE_NOBLANKS' file='parser' value='256' type='xmlParserOption' info='remove blank nodes'/>
     <enum name='XML_PARSE_NOCDATA' file='parser' value='16384' type='xmlParserOption' info='merge CDATA as text nodes'/>
diff --git a/doc/libxml2-refs.xml b/doc/libxml2-refs.xml
index b33d103..9351da9 100644
--- a/doc/libxml2-refs.xml
+++ b/doc/libxml2-refs.xml
@@ -588,6 +588,7 @@
     <reference name='XML_PARSER_SUBST_ENTITIES' href='html/libxml-xmlreader.html#XML_PARSER_SUBST_ENTITIES'/>
     <reference name='XML_PARSER_SYSTEM_LITERAL' href='html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL'/>
     <reference name='XML_PARSER_VALIDATE' href='html/libxml-xmlreader.html#XML_PARSER_VALIDATE'/>
+    <reference name='XML_PARSE_ALLBASEFIX' href='html/libxml-parser.html#XML_PARSE_ALLBASEFIX'/>
     <reference name='XML_PARSE_BIG_LINES' href='html/libxml-parser.html#XML_PARSE_BIG_LINES'/>
     <reference name='XML_PARSE_COMPACT' href='html/libxml-parser.html#XML_PARSE_COMPACT'/>
     <reference name='XML_PARSE_DOM' href='html/libxml-parser.html#XML_PARSE_DOM'/>
@@ -4189,6 +4190,7 @@
       <ref name='XML_PARSER_SUBST_ENTITIES'/>
       <ref name='XML_PARSER_SYSTEM_LITERAL'/>
       <ref name='XML_PARSER_VALIDATE'/>
+      <ref name='XML_PARSE_ALLBASEFIX'/>
       <ref name='XML_PARSE_BIG_LINES'/>
       <ref name='XML_PARSE_COMPACT'/>
       <ref name='XML_PARSE_DOM'/>
@@ -11383,6 +11385,7 @@
       <ref name='XML_PARSER_START'/>
       <ref name='XML_PARSER_START_TAG'/>
       <ref name='XML_PARSER_SYSTEM_LITERAL'/>
+      <ref name='XML_PARSE_ALLBASEFIX'/>
       <ref name='XML_PARSE_BIG_LINES'/>
       <ref name='XML_PARSE_COMPACT'/>
       <ref name='XML_PARSE_DOM'/>
diff --git a/include/libxml/parser.h b/include/libxml/parser.h
index 3f5730d..e87280e 100644
--- a/include/libxml/parser.h
+++ b/include/libxml/parser.h
@@ -1111,7 +1111,8 @@ typedef enum {
     XML_PARSE_HUGE = 1<<19,/* relax any hardcoded limit from the parser */
     XML_PARSE_OLDSAX = 1<<20,/* parse using SAX2 interface before 2.7.0 */
     XML_PARSE_IGNORE_ENC= 1<<21,/* ignore internal document encoding hint */
-    XML_PARSE_BIG_LINES = 1<<22 /* Store big lines numbers in text PSVI field */
+    XML_PARSE_BIG_LINES = 1<<22,/* Store big lines numbers in text PSVI field */
+    XML_PARSE_ALLBASEFIX= 1<<23 /* do xml:base fixup for _all_ XINCLUDEs */
 } xmlParserOption;
 
 XMLPUBFUN void XMLCALL
diff --git a/parser.c b/parser.c
index ee429f3..1acc5a2 100644
--- a/parser.c
+++ b/parser.c
@@ -15111,6 +15111,14 @@ xmlCtxtUseOptionsInternal(xmlParserCtxtPtr ctxt, int options, const char *encodi
         ctxt->options |= XML_PARSE_NOBASEFIX;
         options -= XML_PARSE_NOBASEFIX;
     }
+    if (options & XML_PARSE_ALLBASEFIX) {
+        /*
+         * There is no check for NOBASEFIX vs ALLBASEFIX.
+         * NOBASEFIX will override ALLBASEFIX.
+         */
+        ctxt->options |= XML_PARSE_ALLBASEFIX;
+        options -= XML_PARSE_ALLBASEFIX;
+    }
     if (options & XML_PARSE_HUGE) {
         ctxt->options |= XML_PARSE_HUGE;
         options -= XML_PARSE_HUGE;
diff --git a/xinclude.c b/xinclude.c
index 107ac03..e90c4ab 100644
--- a/xinclude.c
+++ b/xinclude.c
@@ -1685,7 +1685,7 @@ loaded:
 #endif
 
     /*
-     * Do the xml:base fixup if needed
+     * Do the xml:base fixup as needed
      */
     if ((doc != NULL) && (URL != NULL) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
@@ -1695,27 +1695,41 @@ loaded:
         xmlChar *curBase;
 
         /*
-         * The base is only adjusted if "necessary", i.e. if the xinclude node
-         * has a base specified, or the URL is relative
+         * The xml:base is adjusted as necessary. Possibly the
+         * xinclude node has a base specified?
          */
         base = xmlGetNsProp(ctxt->incTab[nr]->ref, BAD_CAST "base",
                         XML_XML_NAMESPACE);
         if (base == NULL) {
             /*
-             * No xml:base on the xinclude node, so we check whether the
-             * URI base is different than (relative to) the context base
+             * No xml:base on the xinclude node. Compute the base
+             * from the URL of the included document, if possible
+             * relative to the context base. See
+             * uri.c:xmlBuildRelativeURI for the relative/absolute
+             * magic.
              */
             curBase = xmlBuildRelativeURI(URL, ctxt->base);
             if (curBase == NULL) { /* Error return */
                 xmlXIncludeErr(ctxt, ctxt->incTab[nr]->ref,
                        XML_XINCLUDE_HREF_URI,
                        "trying to build relative URI from %s\n", URL);
+            } else if (((ctxt->parseFlags & XML_PARSE_ALLBASEFIX)) ||
+                       ((doc->parseFlags & XML_PARSE_ALLBASEFIX)) ||
+                       xmlStrchr(curBase, (xmlChar) '/')) {
+                base = curBase;
             } else {
-                /* If the URI doesn't contain a slash, it's not relative */
-                if (!xmlStrchr(curBase, (xmlChar) '/'))
-                    xmlFree(curBase);
-                else
-                    base = curBase;
+                /*
+                 * The XML_PARSE_ALLBASEFIX flag is unset, so we do
+                 * minimal fixup, and don't modify xml:base if new
+                 * base shares the path with the parent. In that
+                 * case, all URIs references within the included will
+                 * lead to the same place, whether we fixup the
+                 * xml:base or not. However we drop file changes in
+                 * the same path. If you also need xml:base fixup for
+                 * same path document changes, use
+                 * XML_PARSE_ALLBASEFIX.
+                 */
+                xmlFree(curBase);
             }
         }
         if (base != NULL) { /* Adjustment may be needed */
diff --git a/xmllint.c b/xmllint.c
index 26d8db1..ba7f5b1 100644
--- a/xmllint.c
+++ b/xmllint.c
@@ -3053,6 +3053,7 @@ static void usage(const char *name) {
     printf("\t--xinclude : do XInclude processing\n");
     printf("\t--noxincludenode : same but do not generate XInclude nodes\n");
     printf("\t--nofixup-base-uris : do not fixup xml:base uris\n");
+    printf("\t--fixup-all-base-uris : fixup xml:base for same path, new document XInclude, too\n");
 #endif
     printf("\t--loaddtd : fetch external DTD\n");
     printf("\t--dtdattr : loaddtd + populate the tree with inherited attributes \n");
@@ -3280,6 +3281,13 @@ main(int argc, char **argv) {
             options |= XML_PARSE_XINCLUDE;
             options |= XML_PARSE_NOBASEFIX;
         }
+        else if ((!strcmp(argv[i], "-fixup-all-base-uris")) ||
+                 (!strcmp(argv[i], "--fixup-all-base-uris"))) {
+            xinclude++;
+            options |= XML_PARSE_XINCLUDE;
+            options |= XML_PARSE_ALLBASEFIX;
+            options ^= XML_PARSE_NOBASEFIX & options;
+        }
 #endif
 #ifdef LIBXML_OUTPUT_ENABLED
 #ifdef HAVE_ZLIB_H

commit 44bd5c1a52b632502d2d9cd42255c19563d2a459
Author: Susanne Oberhauser <f...@suse.com>
Date:   Wed Apr 23 13:17:12 2014 +0000

    Remove premature check on URI being relative (gives false negatives).

    This is Alexey Neumann's first fix to xml:base handling

diff --git a/xinclude.c b/xinclude.c
index ace005b..107ac03 100644
--- a/xinclude.c
+++ b/xinclude.c
@@ -1687,7 +1687,7 @@ loaded:
     /*
      * Do the xml:base fixup if needed
      */
-    if ((doc != NULL) && (URL != NULL) && (xmlStrchr(URL, (xmlChar) '/')) &&
+    if ((doc != NULL) && (URL != NULL) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
         (!(doc->parseFlags & XML_PARSE_NOBASEFIX))) {
         xmlNodePtr node;

-- 
Susanne Oberhauser                     SUSE LINUX Products GmbH
+49-911-74053-574                      Maxfeldstraße 5
Processes and Infrastructure           90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)