On 17.2.2005 at 12:26, Joe Orton wrote:
libxml2's charset encoding auto-detection mode is broken with the push
parser in current versions of libxml2, as I found recently:

http://bugzilla.gnome.org/show_bug.cgi?id=162613

but trying to force it can trigger infinite loops in libxml2, which is
what happens in http://bugs.php.net/?id=32001

So I think it's best to not force this mode.  Future versions of libxml2
will set parser->charset to XML_CHAR_ENCODING_NONE by default with the
push parser and will hence work as desired with no explicit setting of
parser->charset required.

Would that cause any BC breaks? Do I now have to know the encoding of the XML document before I can use the push parser? From reading your bug report at gnome.org, I assume it just defaults to UTF-8, right?
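To make the question concrete, here is a minimal sketch of the kind of detection being discussed, based on the byte-order-mark cases in XML 1.0 Appendix F. It is not libxml2's implementation (libxml2 has its own `xmlDetectCharEncoding`), and the helper name `sketch_detect_encoding` is hypothetical. It is also deliberately simplified: the UCS-4 signatures and BOM-less UTF-16 are left out, and an encoding declaration in the XML prolog is not read.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of libxml2 or PHP: guess the encoding of an
 * XML document from its leading bytes. Only the common byte-order-mark cases
 * are handled; UCS-4 and BOM-less UTF-16 are intentionally omitted. */
static const char *sketch_detect_encoding(const unsigned char *buf, size_t len)
{
    if (len >= 3 && memcmp(buf, "\xEF\xBB\xBF", 3) == 0)
        return "UTF-8";     /* UTF-8 byte order mark */
    if (len >= 2 && memcmp(buf, "\xFE\xFF", 2) == 0)
        return "UTF-16BE";  /* big-endian byte order mark */
    if (len >= 2 && memcmp(buf, "\xFF\xFE", 2) == 0)
        return "UTF-16LE";  /* little-endian byte order mark */

    /* No byte order mark: the XML specification has the processor assume
     * UTF-8 unless an encoding declaration says otherwise (not parsed here). */
    return "UTF-8";
}
```

Under these assumptions, a document such as the `<para><note>simple note</note></para>` string from the test case has no byte order mark and no encoding declaration, so the sketch reports UTF-8 for it.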


chregu



Is this patch OK?

http://www.apache.org/~jorton/php_xmlenc.diff

Index: ext/xml/compat.c
===================================================================
RCS file: /repository/php-src/ext/xml/compat.c,v
retrieving revision 1.32.2.7
diff -u -r1.32.2.7 compat.c
--- ext/xml/compat.c    17 Dec 2004 12:21:34 -0000      1.32.2.7
+++ ext/xml/compat.c    17 Feb 2005 11:12:08 -0000
@@ -379,8 +379,6 @@
        }
        if (encoding != NULL) {
                parser->parser->encoding = xmlStrdup(encoding);
-       } else {
-               parser->parser->charset = XML_CHAR_ENCODING_NONE;
        }
        parser->parser->replaceEntities = 1;
        parser->parser->wellFormed = 0;
Index: ext/xml/tests/bug32001.phpt
===================================================================
RCS file: ext/xml/tests/bug32001.phpt
diff -N ext/xml/tests/bug32001.phpt
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ ext/xml/tests/bug32001.phpt 17 Feb 2005 11:12:08 -0000
@@ -0,0 +1,40 @@
+--TEST--
+Bug #32001 (infinite loop in libxml character encoding detection)
+--FILE--
+<?php
+$myparser = xml_parser_create('');
+$simple = "<para><note>simple note</note></para>";
+xml_parse_into_struct($myparser, $simple, $myvals, $mytags);
+var_dump($myvals);
+--EXPECT--
+array(3) {
+  [0]=>
+  array(3) {
+    ["tag"]=>
+    string(4) "PARA"
+    ["type"]=>
+    string(4) "open"
+    ["level"]=>
+    int(1)
+  }
+  [1]=>
+  array(4) {
+    ["tag"]=>
+    string(4) "NOTE"
+    ["type"]=>
+    string(8) "complete"
+    ["level"]=>
+    int(2)
+    ["value"]=>
+    string(11) "simple note"
+  }
+  [2]=>
+  array(3) {
+    ["tag"]=>
+    string(4) "PARA"
+    ["type"]=>
+    string(5) "close"
+    ["level"]=>
+    int(1)
+  }
+}


--
christian stocker | Bitflux GmbH | schoeneggstrasse 5 | ch-8004 zurich
phone +41 1 240 56 70 | mobile +41 76 561 88 60 | fax +41 1 240 56 71
http://www.bitflux.ch | [EMAIL PROTECTED] | gnupg-keyid 0x5CE1DECB

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php


