 ID:               25707
 User updated by:  Bjorn dot Victor at it dot uu dot se
-Summary:          html_entity_decode over-decodes &amp;quot;
 Reported By:      Bjorn dot Victor at it dot uu dot se
 Status:           Bogus
 Bug Type:         Strings related
 Operating System: Solaris 8
 PHP Version:      4.3.3
 New Comment:

Sorry, this is not an RTFM error, and has nothing to do with the optional parameters of the function. I have changed the summary to refer to "lt", to avoid confusion with ENT_QUOTES etc. Believe me, I tried this before looking at the source and figuring out what the error really was.

The current code works like this: it iterates over the 6 "basic_entities" and replaces each entity with its character in the string. "&amp;" is the first item in basic_entities, which is good when you're doing htmlentities (the reverse operation). Given the string "&amp;lt;", it first becomes "&lt;", and then (because "&lt;" is handled after "&amp;") "<".

Consider doing "&amp;" last, e.g. by traversing basic_entities backwards: "&amp;lt;" then becomes "&lt;", which is the expected result.


Previous Comments:
------------------------------------------------------------------------

[2003-09-30 15:00:59] [EMAIL PROTECTED]

RTFM:
http://www.php.net/html_entity_decode (the 2nd optional parameter..)

------------------------------------------------------------------------

[2003-09-30 14:52:20] Bjorn dot Victor at it dot uu dot se

Description:
------------
Symptom: html_entity_decode("&amp;quot;") returns '"', while the expected value would be "&quot;". The same (wrong) behaviour occurs for "&amp;" followed by "lt;", "gt;" etc.

Another example is html_entity_decode(htmlentities("&lt;")), which returns "<" rather than "&lt;" as expected. As a result, html_entity_decode cannot be used as the inverse of htmlentities.

Diagnosis: The function (php_unescape_html_entities in ext/standard/html.c) replaces each entity in basic_entities with its corresponding character, but starts by replacing "&amp;" with "&", so the intermediate string is "&quot;", which is then replaced by '"'.

Solution: php_unescape_html_entities in ext/standard/html.c traverses basic_entities from the wrong end; it must replace "&amp;" *last*, not *first*.

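To illustrate the diagnosis and the proposed fix, here is a minimal C sketch of one-pass-per-entity decoding. It is not the code from ext/standard/html.c: the four-entry table, the helper replace_all, and the functions decode_forward and decode_backward are hypothetical names chosen for this example. The only property taken from the report is that the "&amp;" entry comes first in the table.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real table; "&amp;" is listed first,
 * as the report describes. */
struct entity { const char *name; char ch; };

static const struct entity basic_entities[] = {
    { "&amp;",  '&'  },
    { "&quot;", '"'  },
    { "&lt;",   '<'  },
    { "&gt;",   '>'  },
};
#define N_ENTITIES (sizeof basic_entities / sizeof basic_entities[0])

/* Replace every occurrence of e->name in buf with e->ch, in place.
 * The string only ever shrinks, so no extra buffer is needed. */
static void replace_all(char *buf, const struct entity *e)
{
    size_t len = strlen(e->name);
    char *p = buf;

    while ((p = strstr(p, e->name)) != NULL) {
        *p = e->ch;
        /* Shift the tail (including the terminating NUL) left over
         * the remainder of the entity name. */
        memmove(p + 1, p + len, strlen(p + len) + 1);
        p++; /* resume scanning after the character just written */
    }
}

/* First to last: "&amp;" is decoded first, so the "&" it produces can
 * combine with following text and be decoded again by a later pass. */
static void decode_forward(char *buf)
{
    for (size_t i = 0; i < N_ENTITIES; i++)
        replace_all(buf, &basic_entities[i]);
}

/* Last to first: "&amp;" is decoded last, so nothing produced by that
 * pass is ever re-examined. */
static void decode_backward(char *buf)
{
    for (size_t i = N_ENTITIES; i-- > 0; )
        replace_all(buf, &basic_entities[i]);
}
```

With a writable buffer holding "&amp;lt;", decode_forward leaves "<" (the over-decoding described above), while decode_backward leaves "&lt;". Reversing the traversal is only one way to get this result; a decoder that scans the input once and replaces each entity as it is met would avoid the ordering problem as well.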
Reproduce code:
---------------
print html_entity_decode("&amp;quot;&amp;lt;&amp;gt;");

Expected result:
----------------
&quot;&lt;&gt;

Actual result:
--------------
"<>

------------------------------------------------------------------------

-- 
Edit this bug report at http://bugs.php.net/?id=25707&edit=1