On 08/25/2012 12:59 PM, Ángel González wrote:
> I see. Thank you very much.
> Even worse, HTML5 doesn't seem to have any provision for that, as it works
> with characters. A user agent would have to protect himself from this by
> making those kind of utf-8 characters a hard error instead of trying to
> recover from it.

We essentially treat it as a hard error, because these functions return an empty string if they see any invalid characters. They won't try to fix them in any way.

This is what people are complaining about, by the way. In most cases they are actually sending output in UTF-8, but they were relying on the html* functions passing everything through. So while they see it as a BC break, it is actually fixing a security problem in their applications.

Now, if they really are using ISO-8859-1 as their input and output encoding, then yes, we have broken things for them and they will need to specify their charset. This is the case where I was wondering whether we could make their lives easier by adding a default_input_encoding setting that these functions would use.

-Rasmus

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
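
For illustration, here is a minimal sketch of the behavior described above, using htmlspecialchars() as one of the html* functions. The input string is a made-up example, and the flags and charset are passed explicitly so the result does not depend on the defaults of any particular PHP version.

```php
<?php
// Hypothetical input: "café <b>" encoded as ISO-8859-1,
// so "é" is the single byte 0xE9.
$latin1 = "caf\xE9 <b>";

// Interpreted as UTF-8, the lone 0xE9 byte is an invalid sequence.
// Without ENT_SUBSTITUTE or ENT_IGNORE the function does not try to
// repair it and returns an empty string.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Naming the real input charset makes the same call succeed:
// the 0xE9 byte passes through and only the markup is escaped.
$asLatin1 = htmlspecialchars($latin1, ENT_QUOTES, 'ISO-8859-1');

var_dump($asUtf8 === '');
var_dump($asLatin1 === "caf\xE9 &lt;b&gt;");
```

An application that really does use ISO-8859-1 throughout would need the third argument on every such call, which is the repetition a default_input_encoding setting would be meant to remove.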