Susanne Oberhauser-Hirschoff <f...@suse.com> writes:

> Salut Daniel,
> 
> What are the use cases that do thousands of xi:includes of tiny XML
> fragments, making the current tuning necessary?
>
> If that is a real use case, I could redo the patch with an option.

OK, below is a patch that adds an option, --fixup-all-base-uris
(XML_PARSE_ALLBASEFIX), which lets the libxml2 / xmllint user choose
whether she considers xml:base fixup within the same path 'clutter' or
a 'bug fix'.
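
To make the choice concrete, here is a minimal, self-contained sketch of
the decision the patch makes for an xi:include that carries no xml:base
of its own.  It does not call libxml2: the SKETCH_* macros and
sketch_wants_base_fixup() are hypothetical names used only for
illustration, and the flag values are simply copied from the patch
further down.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the parser options; the values mirror the
 * patch (XML_PARSE_NOBASEFIX = 1<<18, XML_PARSE_ALLBASEFIX = 1<<23). */
#define SKETCH_NOBASEFIX  (1 << 18)
#define SKETCH_ALLBASEFIX (1 << 23)

/*
 * rel_base: the included document's URL made relative to the including
 *           context's base (what xmlBuildRelativeURI would return), or
 *           NULL if that computation failed.
 * Returns 1 if an xml:base should be set on the included root, else 0.
 */
static int sketch_wants_base_fixup(const char *rel_base, int flags)
{
    if (flags & SKETCH_NOBASEFIX)
        return 0;               /* NOBASEFIX overrides everything */
    if (rel_base == NULL)
        return 0;               /* error case: nothing to set */
    if (flags & SKETCH_ALLBASEFIX)
        return 1;               /* fix up even same-directory includes */
    /* Minimal fixup: only when the include lives in another path. */
    return strchr(rel_base, '/') != NULL;
}
```

With no flags, "child.xml" yields 0 and "sub/child.xml" yields 1; with
SKETCH_ALLBASEFIX set, both yield 1.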


I still believe there is a reason the XInclude test cases use the clutter
version, but if libxml2 is usually used in contexts where the clutter is
useless, this variant gives the more exotic users a chance to get what
they need, too.


Btw, in 2003 you wrote this...

https://www.sourceware.org/ml/docbook/2003-03/msg00101.html

> On Sun, Mar 09, 2003 at 02:15:55PM -0500, Elliotte Rusty Harold wrote:
> > At 2:02 PM -0500 2/12/03, Daniel Veillard wrote:
> > 
> > >It's rather libxml2 now comply to the XInclude requirement of adding such
> > >an xml:base at the inclusion point (when the included resource is in
> > >a different path ...)
> > 
> > 
> > I'm looking at this for my XIncluder right now, and the requirement 
> > seems a little stronger to me. They don't even have to be in a 
> > different path. Suppose for example, 
> > http://www.example.com/docs/parent.xml includes 
> > http://www.example.com/docs/child.xml
> > 
> > These two documents have different base URIs even though they have 
> > the same "path". Thus an xml:base attribute must be added at the 
> > inclusion point whenever parse="xml". The only possible exception 
> > would be when both the includer and the included document has null or 
> > empty base URIs, or perhaps when XPointers are involved and one 
> > xinclude element is including a different part of the same document.
> 
>   I tried to minimize the addition of xml:base when it could be avoided
> in practice (i.e. if the absence of the xml:base would not generate
> erroneous URI-References to URI computations). This was a deployment
> trade-off that I will fix when XInclude and xml:base will get better
> acceptance.
> 
> Daniel
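
Both points in that exchange can be seen in a small sketch.  The
resolver below is a deliberately simplified assumption of mine (it only
handles plain relative references such as "img.png", with no "..", no
scheme and no query, so it is not a full RFC 3986 implementation), and
resolve_simple() is a hypothetical helper, not a libxml2 function.

```c
#include <stddef.h>
#include <string.h>

/*
 * Resolve a plain relative reference against a base URI by replacing
 * the last path segment of the base.  Returns 0 on success, -1 if the
 * output buffer is too small.
 */
static int resolve_simple(const char *base, const char *ref,
                          char *out, size_t outlen)
{
    const char *slash = strrchr(base, '/');
    size_t dirlen = (slash != NULL) ? (size_t)(slash - base) + 1 : 0;

    if (dirlen + strlen(ref) + 1 > outlen)
        return -1;
    memcpy(out, base, dirlen);      /* keep everything up to the last '/' */
    strcpy(out + dirlen, ref);      /* append the relative reference */
    return 0;
}
```

Resolving "img.png" against http://www.example.com/docs/parent.xml and
against http://www.example.com/docs/child.xml gives the same result in
both cases, which is why the minimal fixup is harmless for URI
resolution.  The two base URIs themselves still differ, and that
difference is what is lost without an xml:base at the inclusion point.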


Please let me know what you think,

thx,


S.

commit 633f764813e8b552bf77e8b34c77d5642b028063
Author: Susanne Oberhauser <f...@suse.com>
Date:   Wed Apr 23 13:53:14 2014 +0000

    do xml:base fixup on all xi:include roots, even if the new base is in the same directory

diff --git a/doc/APIfiles.html b/doc/APIfiles.html
index 65e004b..4dd5d04 100644
--- a/doc/APIfiles.html
+++ b/doc/APIfiles.html
@@ -612,6 +612,7 @@ A:link, A:visited, A:active { text-decoration: underline }
 <a href="html/libxml-parser.html#XML_PARSER_START">XML_PARSER_START</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_START_TAG">XML_PARSER_START_TAG</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL">XML_PARSER_SYSTEM_LITERAL</a><br />
+<a href="html/libxml-parser.html#XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_COMPACT">XML_PARSE_COMPACT</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_DOM">XML_PARSE_DOM</a><br />
diff --git a/doc/APIsymbols.html b/doc/APIsymbols.html
index c2b82e7..f2e6d18 100644
--- a/doc/APIsymbols.html
+++ b/doc/APIsymbols.html
@@ -594,6 +594,7 @@ A:link, A:visited, A:active { text-decoration: underline }
 <a href="html/libxml-xmlreader.html#XML_PARSER_SUBST_ENTITIES">XML_PARSER_SUBST_ENTITIES</a><br />
 <a href="html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL">XML_PARSER_SYSTEM_LITERAL</a><br />
 <a href="html/libxml-xmlreader.html#XML_PARSER_VALIDATE">XML_PARSER_VALIDATE</a><br />
+<a href="html/libxml-parser.html#XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_COMPACT">XML_PARSE_COMPACT</a><br />
 <a href="html/libxml-parser.html#XML_PARSE_DOM">XML_PARSE_DOM</a><br />
diff --git a/doc/devhelp/libxml2-parser.html b/doc/devhelp/libxml2-parser.html
index 357c14a..6f114ab 100644
--- a/doc/devhelp/libxml2-parser.html
+++ b/doc/devhelp/libxml2-parser.html
@@ -311,6 +311,8 @@ void        <a href="#xmlSetExternalEntityLoader">xmlSetExternalEntityLoader</a>    (<a hr
     <a name="XML_PARSE_OLDSAX">XML_PARSE_OLDSAX</a> = 1048576 /* parse using SAX2 interface before 2.7.0 */
     <a name="XML_PARSE_IGNORE_ENC">XML_PARSE_IGNORE_ENC</a> = 2097152 /* ignore internal document encoding hint */
     <a name="XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a> = 4194304 /*  Store big lines numbers in text PSVI field */
+    <a name="XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a> = 8388608 /* do xml:base fixup for _all_ XINCLUDEs */
+
 };
 </pre><p/>
 </div>
diff --git a/doc/devhelp/libxml2.devhelp b/doc/devhelp/libxml2.devhelp
index 282546a..cb85fb5 100644
--- a/doc/devhelp/libxml2.devhelp
+++ b/doc/devhelp/libxml2.devhelp
@@ -776,6 +776,7 @@
     <function name="XML_PARSER_SYSTEM_LITERAL" link="libxml2-parser.html#XML_PARSER_SYSTEM_LITERAL"/>
     <function name="XML_PARSER_VALIDATE" link="libxml2-xmlreader.html#XML_PARSER_VALIDATE"/>
     <function name="XML_PARSE_BIG_LINES" link="libxml2-parser.html#XML_PARSE_BIG_LINES"/>
+    <function name="XML_PARSE_ALLBASEFIX" link="libxml2-parser.html#XML_PARSE_ALLBASEFIX"/>
     <function name="XML_PARSE_COMPACT" link="libxml2-parser.html#XML_PARSE_COMPACT"/>
     <function name="XML_PARSE_DOM" link="libxml2-parser.html#XML_PARSE_DOM"/>
     <function name="XML_PARSE_DTDATTR" link="libxml2-parser.html#XML_PARSE_DTDATTR"/>
diff --git a/doc/html/libxml-parser.html b/doc/html/libxml-parser.html
index 98123f7..8f7ede9 100644
--- a/doc/html/libxml-parser.html
+++ b/doc/html/libxml-parser.html
@@ -290,6 +290,7 @@ void        <a href="#xmlParserInputDeallocate">xmlParserInputDeallocate</a>        (<a href="
     <a name="XML_PARSE_OLDSAX" id="XML_PARSE_OLDSAX">XML_PARSE_OLDSAX</a> = 1048576 : parse using SAX2 interface before 2.7.0
     <a name="XML_PARSE_IGNORE_ENC" id="XML_PARSE_IGNORE_ENC">XML_PARSE_IGNORE_ENC</a> = 2097152 : ignore internal document encoding hint
     <a name="XML_PARSE_BIG_LINES" id="XML_PARSE_BIG_LINES">XML_PARSE_BIG_LINES</a> = 4194304 : Store big lines numbers in text PSVI field
+    <a name="XML_PARSE_ALLBASEFIX" id="XML_PARSE_ALLBASEFIX">XML_PARSE_ALLBASEFIX</a> = 8388608 : fixup xml:base uris for same directory includes, too
 }
 </pre><h3><a name="xmlSAXHandlerV1" id="xmlSAXHandlerV1">Structure xmlSAXHandlerV1</a></h3><pre class="programlisting">Structure xmlSAXHandlerV1<br />struct _xmlSAXHandlerV1 {
     <a href="libxml-parser.html#internalSubsetSAXFunc">internalSubsetSAXFunc</a>       internalSubset
diff --git a/doc/libxml2-api.xml b/doc/libxml2-api.xml
index 45bceb5..c8ba483 100644
--- a/doc/libxml2-api.xml
+++ b/doc/libxml2-api.xml
@@ -720,6 +720,7 @@
      <exports symbol='XML_WITH_OUTPUT' type='enum'/>
      <exports symbol='XML_PARSE_XINCLUDE' type='enum'/>
      <exports symbol='XML_PARSE_NOCDATA' type='enum'/>
+     <exports symbol='XML_PARSE_ALLBASEFIX' type='enum'/>
      <exports symbol='XML_PARSE_NOBASEFIX' type='enum'/>
      <exports symbol='XML_PARSE_BIG_LINES' type='enum'/>
      <exports symbol='XML_WITH_XINCLUDE' type='enum'/>
@@ -5137,6 +5138,7 @@ crash if you try to modify the tree)'/>
     <enum name='XML_PARSE_DTDVALID' file='parser' value='16' type='xmlParserOption' info='validate with the DTD'/>
     <enum name='XML_PARSE_HUGE' file='parser' value='524288' type='xmlParserOption' info='relax any hardcoded limit from the parser'/>
     <enum name='XML_PARSE_IGNORE_ENC' file='parser' value='2097152' type='xmlParserOption' info='ignore internal document encoding hint'/>
+    <enum name='XML_PARSE_ALLBASEFIX' file='parser' value='8388608' type='xmlParserOption' info='do xml:base fixup for _all_ XINCLUDEs'/>
     <enum name='XML_PARSE_NOBASEFIX' file='parser' value='262144' type='xmlParserOption' info='do not fixup XINCLUDE xml:base uris'/>
     <enum name='XML_PARSE_NOBLANKS' file='parser' value='256' type='xmlParserOption' info='remove blank nodes'/>
     <enum name='XML_PARSE_NOCDATA' file='parser' value='16384' type='xmlParserOption' info='merge CDATA as text nodes'/>
diff --git a/doc/libxml2-refs.xml b/doc/libxml2-refs.xml
index b33d103..9351da9 100644
--- a/doc/libxml2-refs.xml
+++ b/doc/libxml2-refs.xml
@@ -588,6 +588,7 @@
     <reference name='XML_PARSER_SUBST_ENTITIES' href='html/libxml-xmlreader.html#XML_PARSER_SUBST_ENTITIES'/>
     <reference name='XML_PARSER_SYSTEM_LITERAL' href='html/libxml-parser.html#XML_PARSER_SYSTEM_LITERAL'/>
     <reference name='XML_PARSER_VALIDATE' href='html/libxml-xmlreader.html#XML_PARSER_VALIDATE'/>
+    <reference name='XML_PARSE_ALLBASEFIX' href='html/libxml-parser.html#XML_PARSE_ALLBASEFIX'/>
     <reference name='XML_PARSE_BIG_LINES' href='html/libxml-parser.html#XML_PARSE_BIG_LINES'/>
     <reference name='XML_PARSE_COMPACT' href='html/libxml-parser.html#XML_PARSE_COMPACT'/>
     <reference name='XML_PARSE_DOM' href='html/libxml-parser.html#XML_PARSE_DOM'/>
@@ -4189,6 +4190,7 @@
       <ref name='XML_PARSER_SUBST_ENTITIES'/>
       <ref name='XML_PARSER_SYSTEM_LITERAL'/>
       <ref name='XML_PARSER_VALIDATE'/>
+      <ref name='XML_PARSE_ALLBASEFIX'/>
       <ref name='XML_PARSE_BIG_LINES'/>
       <ref name='XML_PARSE_COMPACT'/>
       <ref name='XML_PARSE_DOM'/>
@@ -11383,6 +11385,7 @@
       <ref name='XML_PARSER_START'/>
       <ref name='XML_PARSER_START_TAG'/>
       <ref name='XML_PARSER_SYSTEM_LITERAL'/>
+      <ref name='XML_PARSE_ALLBASEFIX'/>
       <ref name='XML_PARSE_BIG_LINES'/>
       <ref name='XML_PARSE_COMPACT'/>
       <ref name='XML_PARSE_DOM'/>
diff --git a/include/libxml/parser.h b/include/libxml/parser.h
index 3f5730d..e87280e 100644
--- a/include/libxml/parser.h
+++ b/include/libxml/parser.h
@@ -1111,7 +1111,8 @@ typedef enum {
     XML_PARSE_HUGE      = 1<<19,/* relax any hardcoded limit from the parser */
     XML_PARSE_OLDSAX    = 1<<20,/* parse using SAX2 interface before 2.7.0 */
     XML_PARSE_IGNORE_ENC= 1<<21,/* ignore internal document encoding hint */
-    XML_PARSE_BIG_LINES = 1<<22 /* Store big lines numbers in text PSVI field */
+    XML_PARSE_BIG_LINES = 1<<22,/* Store big lines numbers in text PSVI field */
+    XML_PARSE_ALLBASEFIX= 1<<23 /* do xml:base fixup for _all_ XINCLUDEs */
 } xmlParserOption;
 
 XMLPUBFUN void XMLCALL
diff --git a/parser.c b/parser.c
index ee429f3..1acc5a2 100644
--- a/parser.c
+++ b/parser.c
@@ -15111,6 +15111,14 @@ xmlCtxtUseOptionsInternal(xmlParserCtxtPtr ctxt, int options, const char *encodi
        ctxt->options |= XML_PARSE_NOBASEFIX;
         options -= XML_PARSE_NOBASEFIX;
     }
+    if (options & XML_PARSE_ALLBASEFIX) {
+       /*
+        * NOBASEFIX and ALLBASEFIX are not checked against each other
+        * here.  If both are set, NOBASEFIX overrides ALLBASEFIX.
+        */
+       ctxt->options |= XML_PARSE_ALLBASEFIX;
+        options -= XML_PARSE_ALLBASEFIX;
+    }
     if (options & XML_PARSE_HUGE) {
        ctxt->options |= XML_PARSE_HUGE;
         options -= XML_PARSE_HUGE;
diff --git a/xinclude.c b/xinclude.c
index 107ac03..e90c4ab 100644
--- a/xinclude.c
+++ b/xinclude.c
@@ -1685,7 +1685,7 @@ loaded:
 #endif
 
     /*
-     * Do the xml:base fixup if needed
+     * Do the xml:base fixup as needed
      */
     if ((doc != NULL) && (URL != NULL) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
@@ -1695,27 +1695,41 @@ loaded:
        xmlChar *curBase;
 
        /*
-        * The base is only adjusted if "necessary", i.e. if the xinclude node
-        * has a base specified, or the URL is relative
+        * The xml:base is adjusted as necessary.  First check whether
+        * the xinclude node itself has an xml:base specified.
         */
        base = xmlGetNsProp(ctxt->incTab[nr]->ref, BAD_CAST "base",
                        XML_XML_NAMESPACE);
        if (base == NULL) {
            /*
-            * No xml:base on the xinclude node, so we check whether the
-            * URI base is different than (relative to) the context base
+            * No xml:base on the xinclude node.  Compute the base
+            * from the URL of the included document, if possible
+            * relative to the context base.  See
+            * uri.c:xmlBuildRelativeURI for the relative/absolute
+            * magic.
             */
            curBase = xmlBuildRelativeURI(URL, ctxt->base);
            if (curBase == NULL) {      /* Error return */
                xmlXIncludeErr(ctxt, ctxt->incTab[nr]->ref,
                       XML_XINCLUDE_HREF_URI,
                       "trying to build relative URI from %s\n", URL);
+           } else if (((ctxt->parseFlags & XML_PARSE_ALLBASEFIX)) ||
+                      ((doc->parseFlags & XML_PARSE_ALLBASEFIX)) ||
+                      xmlStrchr(curBase, (xmlChar) '/')) {
+               base = curBase;
            } else {
-               /* If the URI doesn't contain a slash, it's not relative */
-               if (!xmlStrchr(curBase, (xmlChar) '/'))
-                   xmlFree(curBase);
-               else
-                   base = curBase;
+               /*
+                * The XML_PARSE_ALLBASEFIX flag is unset, so we do
+                * minimal fixup and don't set xml:base if the new
+                * base shares the path with the parent.  In that
+                * case, all URI references within the included
+                * document resolve to the same place, whether we fix
+                * up the xml:base or not.  However, the fact that the
+                * content came from a different file in the same path
+                * is lost.  If you also need xml:base fixup for
+                * same-path includes, use XML_PARSE_ALLBASEFIX.
+                */
+               xmlFree(curBase);
            }
        }
        if (base != NULL) {     /* Adjustment may be needed */
diff --git a/xmllint.c b/xmllint.c
index 26d8db1..ba7f5b1 100644
--- a/xmllint.c
+++ b/xmllint.c
@@ -3053,6 +3053,7 @@ static void usage(const char *name) {
     printf("\t--xinclude : do XInclude processing\n");
     printf("\t--noxincludenode : same but do not generate XInclude nodes\n");
     printf("\t--nofixup-base-uris : do not fixup xml:base uris\n");
+    printf("\t--fixup-all-base-uris : fixup xml:base for same path, new document XInclude, too\n");
 #endif
     printf("\t--loaddtd : fetch external DTD\n");
     printf("\t--dtdattr : loaddtd + populate the tree with inherited attributes \n");
@@ -3280,6 +3281,13 @@ main(int argc, char **argv) {
            options |= XML_PARSE_XINCLUDE;
            options |= XML_PARSE_NOBASEFIX;
        }
+       else if ((!strcmp(argv[i], "-fixup-all-base-uris")) ||
+                (!strcmp(argv[i], "--fixup-all-base-uris"))) {
+           xinclude++;
+           options |= XML_PARSE_XINCLUDE;
+           options |= XML_PARSE_ALLBASEFIX;
+           options &= ~XML_PARSE_NOBASEFIX;
+       }
 #endif
 #ifdef LIBXML_OUTPUT_ENABLED
 #ifdef HAVE_ZLIB_H

commit 44bd5c1a52b632502d2d9cd42255c19563d2a459
Author: Susanne Oberhauser <f...@suse.com>
Date:   Wed Apr 23 13:17:12 2014 +0000

        Remove premature check on URI being relative (gives false negatives).
        This is Alexey Neumann's first fix to xml:base handling

diff --git a/xinclude.c b/xinclude.c
index ace005b..107ac03 100644
--- a/xinclude.c
+++ b/xinclude.c
@@ -1687,7 +1687,7 @@ loaded:
     /*
      * Do the xml:base fixup if needed
      */
-    if ((doc != NULL) && (URL != NULL) && (xmlStrchr(URL, (xmlChar) '/')) &&
+    if ((doc != NULL) && (URL != NULL) &&
         (!(ctxt->parseFlags & XML_PARSE_NOBASEFIX)) &&
        (!(doc->parseFlags & XML_PARSE_NOBASEFIX))) {
        xmlNodePtr node;

-- 
Susanne Oberhauser                     SUSE LINUX Products GmbH
+49-911-74053-574                      Maxfeldstraße 5
Processes and Infrastructure           90409 Nürnberg
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml@gnome.org
https://mail.gnome.org/mailman/listinfo/xml
