On 04.10.2016 at 20:14, David Walker wrote:
> A couple of weeks back I took a look at bug 72811[1]. The bug was that
> parse_url() didn't accept IPv6 addresses without a scheme, as it did for
> IPv4 addresses. I attempted to patch the specific bug within the scope of
> how parse_url() was processing URIs. After I opened a PR for the
> resolution, Yasuo and Christoph both chimed in that perhaps replacing the
> implementation with an re2c-based parser would be better. We found a
> parser[2] that did almost everything necessary. I took it and made it
> adhere more strictly to RFC3986[3].
>
> I have updated my original PR[4] and created an RFC[5] that aims to replace
> the parsing in parse_url() so that it adheres more strictly to RFC3986.
> This will introduce a BC break, as explained in the RFC, which at the very
> least warrants some discussion. We had kicked around the idea on the PR of
> deprecating parse_url() and creating a new function with the
> more-compliant parser, but opted against it.
>
> I'm looking for discussion on whether a total replacement is the preferred
> way to go about this, and whether we should be making parse_url() stricter
> about the standard at all, since today it deviates from RFC3986 in many
> places in ways that still produce semi-reasonable parsing results.
>
> [1] - https://bugs.php.net/bug.php?id=72811
> [2] - https://github.com/staskobzar/url_parser_re2c
> [3] - https://tools.ietf.org/html/rfc3986
> [4] - https://github.com/php/php-src/pull/2079
> [5] - https://wiki.php.net/rfc/replace_parse_url

Thanks for the RFC, Dave!

I'm all for having a properly implementable URI parser that exactly
follows a specific standard. However, I don't think we can replace
parse_url() with such a parser for BC reasons before PHP 8 (at least).
The parse_url() man page explicitly states:

| Partial URLs are also accepted, parse_url() tries its best to parse
| them correctly.

I'm quite sure that a lot of code relies on this behavior.
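
To make the quoted sentence concrete, here is a minimal sketch of the kind of partial input existing code may be passing to parse_url() today. The two input strings are made-up examples, not taken from the RFC or the PR, and the sketch says nothing about how the proposed parser would treat them; it only shows what the current function accepts.

```php
<?php
// Hypothetical partial URLs (no scheme) of the kind callers may rely on.
$inputs = [
    '//www.example.com/path?googleguy=googley', // scheme-relative URL
    '/path/to/page?x=1',                         // path and query only
];

$results = [];
foreach ($inputs as $input) {
    // parse_url() returns an associative array containing only the
    // components it found, or false for seriously malformed input.
    $results[$input] = parse_url($input);
    var_dump($results[$input]);
}
```

Neither result contains a "scheme" key; the first one yields host, path and query, the second one only path and query. Code that feeds such strings to parse_url() is what any replacement would have to keep working or break deliberately.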

So, I basically see two options:

* wait until PHP 8 (whenever that'll be released) and switch the
  implementation of parse_url() then – which might delay the adoption
  of PHP 8

* add a new function in PHP 7.2 (maybe called parse_uri()), and
  perhaps deprecate parse_url() at the same time

--
Christoph M. Becker