On 04.10.2016 at 20:14, David Walker wrote:

> A couple of weeks back I took a look at bug 72811[1].  The bug is that
> parse_url() doesn't accept IPv6 addresses without a scheme, as it does for
> IPv4 addresses.  I attempted to patch the specific bug within the scope of
> how parse_url() processes URIs.  After I opened a PR for the
> resolution, Yasuo and Christoph both chimed in that perhaps replacing the
> implementation with an re2c-based parser would be better.  We found a
> parser[2] that did almost everything necessary.  I took it and made it
> adhere more strictly to RFC3986[3].
> 
> I have updated my original PR[4] and created an RFC[5] that aims to replace
> the parser behind parse_url() with one that follows RFC3986 more strictly.
> This will introduce a BC break, explained in the RFC, that at the very least
> warrants some discussion.  We had kicked around the idea on the PR of
> deprecating parse_url() and creating a new function with the more-compliant
> parser, but opted against it.
> 
> I'm looking for discussion on whether a total replacement is the preferred
> way to go about this, and whether we should be making parse_url() stricter
> about standards at all, since today it deviates from RFC3986 in many ways
> that provide semi-reasonable parsing patterns.
> 
> [1] - https://bugs.php.net/bug.php?id=72811
> [2] - https://github.com/staskobzar/url_parser_re2c
> [3] - https://tools.ietf.org/html/rfc3986
> [4] - https://github.com/php/php-src/pull/2079
> [5] - https://wiki.php.net/rfc/replace_parse_url

Thanks for the RFC, Dave!

I'm all for having a properly implementable URI parser that exactly
follows a specific standard.  However, I don't think we can replace
parse_url() with such a parser for BC reasons before PHP 8 (at least).
The parse_url() man page explicitly states:

| Partial URLs are also accepted, parse_url() tries its best to parse
| them correctly.

I'm quite sure that a lot of code relies on this behavior.
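
To illustrate the kind of input in question, here is a minimal sketch of
two partial URLs that parse_url() accepts today.  The example strings are
my own picks for illustration, not taken from the RFC or the PR:

```php
<?php
// Protocol-relative URL: there is no scheme, but parse_url() still
// extracts the host, path and query components.
$relative = parse_url('//www.example.com/path?googleguy=googley');

// Path-only reference with a query string: no scheme and no host.
$pathOnly = parse_url('/path?query=1');

var_dump($relative, $pathOnly);
```

Code that feeds such strings to parse_url() and reads the 'host', 'path'
or 'query' keys would be affected if the function started rejecting
input it currently handles.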

So, I basically see two options:

  * wait until PHP 8 (whenever that'll be released) and switch the
    implementation of parse_url() then, which might delay the adoption
    of PHP 8

  * add a new function in PHP 7.2 (maybe called parse_uri()), and
    perhaps deprecate parse_url() at the same time

-- 
Christoph M. Becker

