compat.c fix for #32001

Rob Richards Thu, 17 Feb 2005 06:44:15 -0800

Right, as long as you explicitly include the encoding. So with the patch, if you include a BOM it won't parse (the previous behavior worked fine), and if you include a prolog without an explicit encoding it will always use UTF-8 (the previous behavior was to autodetect the encoding from the first few characters of the prolog).

Would it be possible to ifdef it, then, and for older libxml (only needed when trying to use autodetection) check whether xmlDetectCharEncoding can be used before handing the data off to the parser, and if it returns no encoding, set the charset to UTF-8 to avoid the infinite loop? This would preserve BC for anyone using a prolog with no explicit encoding, or a BOM.
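
To make the idea concrete, here is a rough sketch of the "detect first, otherwise fall back to UTF-8" step. It is not the patch and does not call libxml: sniff_encoding() is a hypothetical stand-in for the detection call, choose_encoding() is likewise a made-up name, and only a few byte patterns are checked (UTF-32 is left out for brevity).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the library's detection step: looks at up to
 * the first four bytes and returns an encoding name, or NULL when nothing
 * is recognised (e.g. a plain ASCII "<?xml" prolog). */
static const char *sniff_encoding(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return "UTF-8";     /* UTF-8 BOM */
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return "UTF-16BE";  /* UTF-16 big-endian BOM */
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return "UTF-16LE";  /* UTF-16 little-endian BOM */
    if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x3C &&
        buf[2] == 0x00 && buf[3] == 0x3F)
        return "UTF-16BE";  /* "<?" in UTF-16BE, no BOM */
    if (len >= 4 && buf[0] == 0x3C && buf[1] == 0x00 &&
        buf[2] == 0x3F && buf[3] == 0x00)
        return "UTF-16LE";  /* "<?" in UTF-16LE, no BOM */
    return NULL;
}

/* Hypothetical helper: use the detected encoding if there is one,
 * otherwise default to UTF-8 so the parser is never handed an
 * unknown encoding. */
static const char *choose_encoding(const unsigned char *buf, size_t len)
{
    const char *enc = sniff_encoding(buf, len);
    return enc != NULL ? enc : "UTF-8";
}
```

The result of choose_encoding() would be what gets passed on as the charset before parsing; a document with a BOM keeps its detected encoding, and a prolog with no explicit encoding ends up as UTF-8.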

Rob

Joe Orton wrote:

On Thu, Feb 17, 2005 at 07:19:46AM -0500, Rob Richards wrote:

It looks like there would be BC breaks unless a libxml with the bug fix is used, since there the encoding is detected properly and there is no infinite loop if an XML declaration or BOM is used in the XML. So basically, with the patch there is no more autodetection when used with any other libxml version (though no more possibility of infinite loops).
That's not quite right: detection based on an ASCII <?xml prolog with an
explicit encoding= still works fine with the patch applied (e.g. for
encoding=ISO-8859-1 documents).  It's *only* documents which have a BOM
which will then fail to parse.
So it is a bit of a tricky trade-off...


--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php

Re: [PHP-DEV] [PATCH] ext/xml/compat.c fix for #32001
