So with the patch, if you include a BOM it won't parse (the previous behavior worked fine), and if you include a prolog without an explicit encoding it will always use UTF-8 (the previous behavior was to autodetect the encoding from the first few characters of the prolog).
Would it be possible to ifdef it, then? For older libxml (only needed when relying on autodetection), could we call xmlDetectCharEncoding before handing the data to the parser and, if it returns no encoding, set the charset to UTF-8 to avoid the infinite loop? This would preserve BC for anyone using a BOM or a prolog with no explicit encoding.
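A minimal sketch of that fallback logic, for illustration only. It does not call libxml: sniff_encoding() is a hypothetical, hand-rolled stand-in for the detection step (it only looks at BOMs and a UTF-16 "<?" start, and ignores UCS-4 and EBCDIC), and choose_encoding() is likewise a made-up name. In the real code the detection would come from libxml, guarded by a version check.

```c
#include <stddef.h>

/* Hypothetical stand-in for the detection step; NOT libxml's code.
 * Looks only at the first bytes of the buffer and returns NULL when
 * nothing is recognised. UCS-4 and EBCDIC are deliberately ignored. */
const char *sniff_encoding(const unsigned char *in, size_t len)
{
    if (len >= 3 && in[0] == 0xEF && in[1] == 0xBB && in[2] == 0xBF)
        return "UTF-8";     /* UTF-8 BOM */
    if (len >= 2 && in[0] == 0xFE && in[1] == 0xFF)
        return "UTF-16BE";  /* UTF-16 big-endian BOM */
    if (len >= 2 && in[0] == 0xFF && in[1] == 0xFE)
        return "UTF-16LE";  /* UTF-16 little-endian BOM */
    if (len >= 4 && in[0] == 0x00 && in[1] == 0x3C &&
        in[2] == 0x00 && in[3] == 0x3F)
        return "UTF-16BE";  /* "<?" in UTF-16BE, no BOM */
    if (len >= 4 && in[0] == 0x3C && in[1] == 0x00 &&
        in[2] == 0x3F && in[3] == 0x00)
        return "UTF-16LE";  /* "<?" in UTF-16LE, no BOM */
    return NULL;            /* nothing detected */
}

/* Proposed fallback: use whatever was detected, otherwise force UTF-8
 * so the parser is never left without an encoding. */
const char *choose_encoding(const unsigned char *in, size_t len)
{
    const char *enc = sniff_encoding(in, len);
    return enc ? enc : "UTF-8";
}
```

With this, a buffer starting with a UTF-16LE BOM keeps its detected encoding, while a plain ASCII prolog with no encoding= falls through to UTF-8.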
Rob
Joe Orton wrote:
On Thu, Feb 17, 2005 at 07:19:46AM -0500, Rob Richards wrote:
It looks like there would be BC breaks unless a libxml with the bug fix is used, since that version detects the encoding properly and does not loop infinitely when an XML declaration or BOM is present. So basically, with the patch there is no more autodetection when used with any other libxml version (though no more possibility of infinite loops either).
That's not quite right: detection based on an ASCII <?xml prolog with an explicit encoding= still works fine with the patch applied (e.g. for encoding=ISO-8859-1 documents). It's *only* documents which have a BOM which will then fail to parse.
So it is a bit of a tricky trade-off...
-- PHP Internals - PHP Runtime Development Mailing List To unsubscribe, visit: http://www.php.net/unsub.php