Re: [PHP-DEV] ENT_ALL or similar option for htmlspecialchars[_decode]?

Gustavo Lopes Thu, 27 Jun 2013 22:22:45 -0700

On 2013-06-28 4:10, Kris Craig wrote:

On Thu, Jun 27, 2013 at 6:43 PM, Yasuo Ohgaki <yohg...@ohgaki.net> wrote:
2013/6/27 Kris Craig <kris.cr...@gmail.com>
Yeah I tried html_entity_decode already, but it just returned NULL. On the same input string, htmlspecialchars_decode returned the input string but with *some* special characters decoded; 10 and 13 ("\r\n", I think) were left in their encoded state. I'm not sure why there wouldn't be an option to decode all html special characters.

You are missing the design purpose of htmlspecialchars_decode and html_entity_decode. Truth is, they are not as useful as they might seem. Their purpose is not to decode all the entities, like a browser would do. We do not implement anything approaching the sort of parsing a browser would do; for instance, HTML 5 says you should accept certain entities not terminated with ; and parse the stream in a certain way, and we don't do it at all. The purpose of those two functions is just to provide something approaching an inverse function for htmlspecialchars() and htmlentities(). html_entity_decode() has somewhat deviated from this (for instance, it decodes all numeric entities), but I think this should nevertheless be the proper way one should think about those two functions.
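
To make the distinction concrete, here is a minimal sketch of how the two decoders behave. The input strings are made up for illustration and are not the string from the original report; the flags and charset are passed explicitly because the defaults differ between PHP versions.

```php
<?php
// Round trip: htmlspecialchars_decode() is roughly the inverse of
// htmlspecialchars() when both are given the same flags.
$original  = '<a href="x">Tom & Jerry\'s</a>';
$encoded   = htmlspecialchars($original, ENT_QUOTES, 'UTF-8');
$roundTrip = htmlspecialchars_decode($encoded, ENT_QUOTES);
var_dump($roundTrip === $original); // bool(true)

// Illustrative input containing numeric entities for CR and LF.
$input = 'line1&#13;&#10;line2 &lt;b&gt;';

// Only the entities that htmlspecialchars() produces are decoded;
// &#13; and &#10; are left as they are.
$specialOnly = htmlspecialchars_decode($input, ENT_QUOTES);
var_dump($specialOnly === 'line1&#13;&#10;line2 <b>'); // bool(true)

// html_entity_decode() also decodes the numeric entities.
$allDecoded = html_entity_decode($input, ENT_QUOTES, 'UTF-8');
var_dump($allDecoded === "line1\r\nline2 <b>"); // bool(true)

// Unlike an HTML 5 parser, an entity without the terminating ';'
// is not decoded, even with ENT_HTML5.
$unterminated = html_entity_decode('1 &lt 2', ENT_QUOTES | ENT_HTML5, 'UTF-8');
var_dump($unterminated === '1 &lt 2'); // bool(true)
```

So for the "\r\n" case above, html_entity_decode() with explicit flags and charset is the closer fit, with the caveat that neither function behaves like a browser's parser.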

Not only HTML entities, we really need to add several decoders/encoders to core.
For instance, Javascript \uXXXX, HTML &#XX/&#XXXX, etc.
I hope someone is working on it :)


Would you be interested in co-authoring an RFC with me for this?

See http://php.net/manual/en/transliterator.transliterate.php
For HTML entities, out of the box, only a transliterator for numeric entities is provided (hex-any/XML10), but you can easily build your own ruleset for the named entities. The performance will be below that of a dedicated algorithm, though. And it only supports UTF-8.
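
A rough sketch of the transliterator approach follows, assuming the intl extension is loaded and the input is UTF-8. The function names decode_decimal_entities() and decode_named_entities() are hypothetical helpers invented for this example, and the two-entry rule set only illustrates the idea; a real one would need a rule for every named entity you care about.

```php
<?php
// Requires ext/intl (ICU). Input and output are UTF-8.

// Hypothetical helper: decode decimal numeric entities such as &#233;
// using ICU's built-in Hex-Any/XML10 transform.
function decode_decimal_entities($text)
{
    return transliterator_transliterate('Hex-Any/XML10', $text);
}

// Hypothetical helper: decode a couple of named entities with a
// hand-written rule set. '&' and ';' have special meaning in ICU rule
// syntax, so each entity is wrapped in single quotes.
function decode_named_entities($text)
{
    $rules = "'&eacute;' > é ; '&copy;' > © ;";
    $transliterator = Transliterator::createFromRules($rules);
    if ($transliterator === null) {
        return false; // the rule set could not be compiled
    }
    return $transliterator->transliterate($text);
}

if (extension_loaded('intl')) {
    var_dump(decode_decimal_entities('a&#13;&#10;b') === "a\r\nb");
    var_dump(decode_named_entities('caf&eacute; &copy; 2013'));
}
```

Text that matches no rule passes through unchanged, so the two helpers can be applied one after the other.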


--
Gustavo Lopes

--
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
