What about this patch? Really hacky as the charset is checked before the
initial parse and it basically duplicates the libxml code with the
correct fix, but seems to work OK. Haven't tried it with any large
datasets yet which require multiple parse calls, but it should work.
Rob
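
For readers without the patch at hand, checking the charset before the initial parse could look roughly like the sketch below. It is not Rob's patch and not libxml2 code: sniff_xml_encoding() and starts_with() are hypothetical names, and the byte signatures are the ones listed in the encoding autodetection appendix of the XML 1.0 specification. It assumes the first few bytes of the document are available before anything is pushed to the parser.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does buf begin with the given byte signature? */
static int starts_with(const unsigned char *buf, size_t len,
                       const char *sig, size_t siglen)
{
    return len >= siglen && memcmp(buf, sig, siglen) == 0;
}

/* Hypothetical helper, not part of libxml2 or PHP: guess the encoding of an
 * XML document from its leading bytes. Returns a static encoding name. */
static const char *sniff_xml_encoding(const unsigned char *buf, size_t len)
{
    /* 4-byte BOMs first, since the UTF-32LE BOM begins with the UTF-16LE BOM. */
    if (starts_with(buf, len, "\x00\x00\xFE\xFF", 4)) return "UTF-32BE";
    if (starts_with(buf, len, "\xFF\xFE\x00\x00", 4)) return "UTF-32LE";

    /* Shorter BOMs. */
    if (starts_with(buf, len, "\xEF\xBB\xBF", 3)) return "UTF-8";
    if (starts_with(buf, len, "\xFE\xFF", 2))     return "UTF-16BE";
    if (starts_with(buf, len, "\xFF\xFE", 2))     return "UTF-16LE";

    /* No BOM: see how the leading "<?" is laid out in memory. */
    if (starts_with(buf, len, "\x00\x3C\x00\x3F", 4)) return "UTF-16BE";
    if (starts_with(buf, len, "\x3C\x00\x3F\x00", 4)) return "UTF-16LE";

    /* ASCII-compatible start: without an explicit encoding in the
     * declaration, XML defaults to UTF-8. */
    return "UTF-8";
}
```

The returned name would then be handed to the parser up front instead of relying on autodetection; reading an explicit encoding out of the XML declaration is left out of this sketch.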
Joe Orton wrote:
Right, as long as you explicitly include the encoding.
So with the patch if you include a BOM, it won't parse (previous behavior
worked fine) and if you include a prolog without explicit encoding it
will always use UTF-8 (previous behavior was to autodetect encoding
based on the charset used for f
On 2005/02/17, at 22:28, Joe Orton wrote:
So it is a bit of a tricky trade-off...
How about #ifdef'ifying it? It's lame though...
Moriyoshi
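
A rough sketch of what the #ifdef approach might look like follows. LIBXML_VERSION is the integer version macro that libxml2 defines in <libxml/xmlversion.h>; everything else here is an assumption. The thread does not say which release carries the fix, so XML_PUSH_AUTODETECT_FIXED is a placeholder value, and must_force_encoding() is a hypothetical name. The fallback definition of LIBXML_VERSION exists only so the sketch compiles without the libxml2 headers.

```c
/* In a real build this macro comes from <libxml/xmlversion.h>.
 * The fallback value is a stand-in so the sketch compiles on its own. */
#ifndef LIBXML_VERSION
#define LIBXML_VERSION 20616
#endif

/* Placeholder: first libxml2 version assumed to have working push-parser
 * autodetection. The thread does not name one. */
#define XML_PUSH_AUTODETECT_FIXED 20617

/* Hypothetical helper: returns 1 when the workaround (forcing the encoding
 * before the first parse call) is compiled in, 0 when libxml2's own
 * autodetection is trusted. */
static int must_force_encoding(void)
{
#if LIBXML_VERSION >= XML_PUSH_AUTODETECT_FIXED
    return 0;
#else
    return 1;
#endif
}
```

Note that this decides at compile time, so a binary built against an old libxml2 would keep the workaround even when run against a newer shared library.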
--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
On Thu, Feb 17, 2005 at 07:19:46AM -0500, Rob Richards wrote:
It looks like there would be BC breaks unless libxml with the bug fix is
used as the encoding is detected properly and no infinite loop if an XML
declaration or BOM is used in the XML. So basically with the patch there
is no more autodetecting if used with any other libxml versions (though
no m
On 17.2.2005 at 12:26, Joe Orton wrote:
libxml2's charset encoding auto-detection mode is broken with the push
parser in current versions of libxml2, I found that recently:
http://bugzilla.gnome.org/show_bug.cgi?id=162613
but trying to force it can trigger infinite loops in libxml2, which is
what happens in http://bugs.php.net/?id=3200