On Apr 6, 2010, at 10:54 AM, Rasmus Lerdorf wrote:

> On 04/06/2010 10:47 AM, Scott MacVicar wrote:
>> On Apr 6, 2010, at 10:34 AM, Rasmus Lerdorf wrote:
>>
>>> On 04/06/2010 10:08 AM, Justin Dearing wrote:
>>>> So pending review and acceptance by Dmitry, I've written my first patch for
>>>> PHP. While there is a good chance I will need to make further revisions to
>>>> the test or code, I don't know what that is.
>>>>
>>>> However, I've got some free time at the moment, and I'd like to make use of
>>>> some of the sunk costs of figuring out how to hack PHP. So I know that in
>>>> general there is a lot of work to be done. I also know that there are
>>>> plenty of open bugs, tests to be written, etc etc. What I am looking for is
>>>> someone to say "here are the next 10 bugs I will work on, can you write me
>>>> tests" or "I wrote this patch on Linux, I need someone to make it work on
>>>> Windows too" or, "Party X complains of this but refuses to fill out a proper
>>>> bug report."
>>>
>>> Here is a straightforward (but not easy) one:
>>>
>>> http://bugs.php.net/bug.php?id=47435
>>>
>>> The php_filter_validate_ip() function in ext/filter/logical_filters.c
>>> needs those reserved IPv6 ranges added to the FORMAT_IPV6 case in the
>>> switch statement there when FILTER_FLAG_NO_RES_RANGE is set. I say it
>>> isn't super easy because we don't have much in the way of IPv6 parsing
>>> in PHP yet, so it will probably involve finding some decent code that
>>> can expand an IPv6 notation into something we can logically separate.
>>> That might also mean a rewrite of the _php_filter_validate_ipv6()
>>> function in the same file.
>>>
>>> Another one, if you are interested in encoding issues:
>>>
>>> http://bugs.php.net/bug.php?id=49687
>>>
>>> I don't necessarily agree with Scott that it is wrong to expect
>>> addslashes() to validate the input string. It could call
>>> get_next_char() the same way php_escape_html_entities_ex() in
>>> ext/standard/html.c does. And we need that utf8_decode() fix mentioned
>>> in the report reviewed/committed if it hasn't been already.
>>>
>>
>> I fixed utf8_decode and I had a patch for adding utf8_validate which is
>> probably suitable for 5.4.
>>
>> http://whisky.macvicar.net/patches/utf8-string.diff.txt
>>
>> It's not quite done, I had intentions of adding support for using truncate,
>> simple true / false for valid or the unicode replacement character.
>
> My only issue with this is that it essentially duplicates the utf8 part
> of get_next_char() from html.c. I'd like to see cs parsing in one place
> instead of spread out all over the code tree. The get_next_char()
> function also supports other charsets, so we could have a more generic
> cs_validate() function along with utf8_validate().
>

I missed this function last year; abstracting that and making it PHPAPI would be awesome.

Scott

-- 
PHP Internals - PHP Runtime Development Mailing List
To unsubscribe, visit: http://www.php.net/unsub.php
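
For readers following the thread, here is a rough sketch in C of the kind of check a standalone UTF-8 validator would perform. It is only an illustration under stated assumptions: the name utf8_is_valid() is hypothetical, and the code is not taken from get_next_char() in html.c or from the patch linked above. It implements only the simple true/false mode mentioned in the thread, not truncation or replacement-character handling.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of PHP: returns 1 if buf[0..len) is
 * well-formed UTF-8 and 0 otherwise. It rejects overlong encodings,
 * UTF-16 surrogates (U+D800..U+DFFF) and code points above U+10FFFF.
 */
static int utf8_is_valid(const unsigned char *buf, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = buf[i];
        unsigned long cp;   /* code point decoded so far */
        unsigned long min;  /* smallest code point legal for this length */
        size_t need;        /* continuation bytes expected after the lead */
        size_t k;

        if (c < 0x80) {                     /* single-byte ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {    /* 2-byte sequence */
            need = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {    /* 3-byte sequence */
            need = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {    /* 4-byte sequence */
            need = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need) {
            return 0;   /* input ends in the middle of a sequence */
        }

        for (k = 1; k <= need; k++) {
            if ((buf[i + k] & 0xC0) != 0x80) {
                return 0;   /* not a 10xxxxxx continuation byte */
            }
            cp = (cp << 6) | (buf[i + k] & 0x3F);
        }

        if (cp < min) return 0;                       /* overlong form */
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;   /* surrogate */
        if (cp > 0x10FFFF) return 0;                  /* beyond Unicode */

        i += need + 1;
    }

    return 1;
}
```

A caller such as addslashes() could, in principle, run a check like this before escaping, though whether it should is exactly what is being debated in the thread. A shared routine would more likely wrap the existing multi-charset decoder than duplicate the UTF-8 logic as this sketch does.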