Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-29 Thread Yasuo Ohgaki
Hi, I've created an RFC page so that this discussion will not be forgotten. https://wiki.php.net/rfc/default_encoding Please edit the RFC page if needed. Regards, -- Yasuo Ohgaki yohg...@ohgaki.net

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-26 Thread Yasuo Ohgaki
Hi, 2012/8/27 Stas Malyshev : > Hi! > >> In PHP 6 we tried to introduce separate input, script and output >> encoding settings. Currently in 5.4 we don't have that, but we have >> those 3 separately for mbstring and for iconv: >> >> iconv.input_encoding >> iconv.internal_encoding >> iconv.output_e

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-26 Thread Rasmus Lerdorf
On 08/26/2012 02:57 PM, Yasuo Ohgaki wrote: > Hi, > > 2012/8/27 Stas Malyshev : >> Hi! >> >>> In PHP 6 we tried to introduce separate input, script and output >>> encoding settings. Currently in 5.4 we don't have that, but we have >>> those 3 separately for mbstring and for iconv: >>> >>> iconv.in

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-26 Thread Stas Malyshev
Hi! > In PHP 6 we tried to introduce separate input, script and output > encoding settings. Currently in 5.4 we don't have that, but we have > those 3 separately for mbstring and for iconv: > > iconv.input_encoding > iconv.internal_encoding > iconv.output_encoding > mbstring.http_input > mbstring

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-25 Thread Rasmus Lerdorf
On 08/25/2012 12:59 PM, Ángel González wrote: > I see. Thank you very much. > Even worse, HTML5 doesn't seem to have any provision for that, as it works > with characters. A user agent would have to protect himself from this by > making > those kind of utf-8 characters a hard error instead of tryin

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-25 Thread Yasuo Ohgaki
Hi, 2012/8/26 Ángel González : > Even worse, HTML5 doesn't seem to have any provision for that, as it works > with characters. A user agent would have to protect himself from this by > making > those kind of utf-8 characters a hard error instead of trying to recover > from it. Right. I would like

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-25 Thread Ángel González
On 25/08/12 00:50, Rasmus Lerdorf wrote: > In 8859-1 no chars are invalid so anything that doesn't get encoded will > get passed through as-is. For example the byte 0xE0 is a perfectly valid > 8859-1 character (à), but if the page is actually UTF-8 then this > becomes the first byte of a 3-byte UTF
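
The byte-level point in the quoted text can be checked with a short sketch. This is an illustration only, written in Python rather than PHP, and the helper name utf8_sequence_length is hypothetical (it is not part of PHP or of the thread). It shows that 0xE0 decodes cleanly as ISO-8859-1, while in UTF-8 the same byte only announces a 3-byte sequence and is rejected by a strict decoder when it stands alone.

```python
# Illustration only: how the single byte 0xE0 is read under two encodings.
raw = b"\xe0"

# In ISO-8859-1 every byte value maps to a character, so decoding cannot fail.
latin1_text = raw.decode("iso-8859-1")


def utf8_sequence_length(lead: int) -> int:
    """Return the sequence length announced by a UTF-8 lead byte.

    Hypothetical helper for this sketch; returns 0 for bytes that cannot
    start a well-formed UTF-8 sequence (continuation or invalid bytes).
    """
    if lead < 0x80:
        return 1
    if 0xC2 <= lead <= 0xDF:
        return 2
    if 0xE0 <= lead <= 0xEF:
        return 3
    if 0xF0 <= lead <= 0xF4:
        return 4
    return 0


# A strict UTF-8 decoder rejects a lone lead byte instead of passing it on.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(repr(latin1_text), utf8_sequence_length(raw[0]), utf8_ok)
```

A decoder that tries to recover instead of failing is what lets such a byte interact with the characters that follow it, which is the concern raised in the quoted message.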

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-25 Thread Yasuo Ohgaki
Hi, I'm +1 for having internal/input/output/script encoding settings at the PHP or Zend level. If the default is the problem, we should set the default_charset default to UTF-8 and use it as the default for the internal/input/output/script encodings and for functions that are affected by encoding. When XSS adviso

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-24 Thread Rasmus Lerdorf
On 08/24/2012 02:23 PM, Ángel González wrote: > On 23/08/12 18:06, Rasmus Lerdorf wrote: >> htmlspecialchars(), htmlentities(), html_entity_decode() and >> get_html_translation_table() all take an encoding parameter that used to >> default to iso-8859-1. We changed the default in PHP 5.4 to UTF-

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-24 Thread Ángel González
On 23/08/12 18:06, Rasmus Lerdorf wrote: > htmlspecialchars(), htmlentities(), html_entity_decode() and > get_html_translation_table() all take an encoding parameter that used to > default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This > is a much more sensible default and in th

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-23 Thread Andrew Faulds
On 23/08/12 17:15, Rasmus Lerdorf wrote: On 08/23/2012 09:09 AM, Andrew Faulds wrote: Personally, I think you should have just two encodings: page_encoding and internal_encoding. The former is for form input and page output (could be latin-1, for instance), and internal_encoding is the internal

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-23 Thread Adam Jon Richardson
On Thu, Aug 23, 2012 at 12:06 PM, Rasmus Lerdorf wrote: > So do we create a new default_input_encoding ini directive mid-stream in > 5.4 for this? Of course with the longer-term in mind that this will be > part of a unified set of encoding settings in 5.5 and beyond. Yes! This is a fantastic idea

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-23 Thread Rasmus Lerdorf
On 08/23/2012 09:09 AM, Andrew Faulds wrote: > Personally, I think you should have just two encodings: page_encoding > and internal_encoding. The former is for form input and page output > (could be latin-1, for instance), and internal_encoding is the internal > representation (default to utf-8 - y

Re: [PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-23 Thread Andrew Faulds
On 23/08/12 17:06, Rasmus Lerdorf wrote: htmlspecialchars(), htmlentities(), html_entity_decode() and get_html_translation_table() all take an encoding parameter that used to default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This is a much more sensible default and in the case of

[PHP-DEV] Default input encoding for htmlspecialchars/htmlentities

2012-08-23 Thread Rasmus Lerdorf
htmlspecialchars(), htmlentities(), html_entity_decode() and get_html_translation_table() all take an encoding parameter that used to default to iso-8859-1. We changed the default in PHP 5.4 to UTF-8. This is a much more sensible default and in the case of the encoding functions more secure as it p