Hi,

On Sep 19, 2014 4:03 PM, "Chris Wright" <c...@daverandom.com> wrote:
>
> Kévin
>
> On 18 September 2014 21:26, Kévin Dunglas <dung...@gmail.com> wrote:
> > Hello,
> >
> > I'm working on enhancing the FILTER_VALIDATE_URL filter (
> > https://github.com/php/php-src/pull/826).
> > The current implementation does not support validation of internationalized
> > domain names (i.e: http://www.académie-française.fr/
> > <http://www.xn--acadmie-franaise-npb1a.fr/>).
> >
> > Support of IDN validation can be easily added using ICU's uidna_toASCII()
> > function.
> >
> > Is it acceptable to add a dependency to ICU for ext/filter?
> > Another option is to add a HAVE_ICU constant in main/php_config.h and to
> > validate IDN only if ICU is present.
> >
> > What strategy is preferred?
>
> I've done some work around this area previously, and all I will say
> is: be careful with what you do with this from a userland PoV.
>
> PHP does not natively support IDN in stream open routines or SSL
> verification routines. It will never support these things without at
> least one of:
> - a core dependency on ICU, libidn or similar
> - moving streams into an extension so a dependency can be introduced
> there (probably not sanely possible)
> - an in-house NAMEPREP implementation (this is the hard part of IDN,
> punycode itself is pretty trivial to implement once you have a
> canonical set of codepoints)
>
> These things can be implemented with *a lot* of boilerplate in
> userland when you have ext/intl, but it's not pretty. libcurl *can*
> support IDN if it was built against libidn, I'm not sure if this is
> currently the case in common distributions or not. Since one almost
> never just validates a URL string, it's usually a precursor to
> attempting to open it, this could lead to some pretty hefty wtfs.
>
> In short, while I'm generally for ext/filter being able to handle IDN,
> I *do not* believe it should do it implicitly, it should require an
> explicit flag, because it will break *a lot* of code if IDN is
> suddenly treated as valid where it previously wasn't.

I am really not sure about that, especially the enabling-by-default part.
The doc is pretty clear about what this filter supports, and allowing IDN
may break a lot of code out there.

From an implementation point of view, we may not need ICU to support IDN.
Windows does not use it, and there are license-friendly decoder
implementations too.

Cheers,
Pierre
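
To make the opt-in behaviour discussed in this thread concrete, here is a minimal sketch. It is written in Python rather than PHP or C, and it is not the code from the pull request. The names validate_url, host_to_ascii and allow_idn are hypothetical. Python's built-in "idna" codec (nameprep plus punycode, IDNA 2003) stands in for ICU or libidn, and the URL check is deliberately simplified to the scheme and host.

```python
import re
from urllib.parse import urlsplit

# Hypothetical helpers: they only illustrate the "explicit flag" idea from
# this thread and are not part of PHP, ext/filter, or the pull request.

# One ASCII host label: letters, digits and hyphens, no leading or trailing
# hyphen, at most 63 characters.
_ASCII_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")


def host_to_ascii(host, allow_idn=False):
    """Return the ASCII form of host if it validates, else None."""
    if not host:
        return None
    if not host.isascii():
        if not allow_idn:
            # The default stays strict, so existing callers see no change.
            return None
        try:
            # The built-in "idna" codec applies nameprep and punycode.
            # It is an assumption standing in for ICU's uidna_toASCII().
            host = host.encode("idna").decode("ascii")
        except UnicodeError:
            return None
    labels = host.rstrip(".").split(".")
    if all(_ASCII_LABEL.fullmatch(label) for label in labels):
        return host
    return None


def validate_url(url, allow_idn=False):
    """Simplified check: http(s) scheme plus a valid host.

    Returns the ASCII host on success and None on failure.
    """
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return None
    return host_to_ascii(parts.hostname, allow_idn)
```

With this shape, validate_url("http://www.académie-française.fr/") keeps returning None, and only a caller that passes allow_idn=True gets the ASCII ("xn--") form of the host back. Returning the converted host rather than a boolean is a design choice for the sketch: as noted above, validation is usually a precursor to opening the URL, and the stream layer needs the ASCII name.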