Hi all,

A couple of weeks back I took a look at bug 72811[1]: parse_url() didn't accept IPv6 addresses without a scheme, the way it does for IPv4 addresses. I attempted to patch that specific bug within the scope of how parse_url() currently processes URIs. After I opened a PR for the resolution, Yasuo and Christoph both chimed in that replacing the implementation with an re2c-based parser might be better. We found a parser[2] that did almost everything necessary. I took it and made it adhere more strictly to RFC3986[3].

I have updated my original PR[4] and created an RFC[5] that aims to replace the parser behind parse_url() with one that adheres more strictly to RFC3986. This introduces a BC break, as explained in the RFC, which at the very least warrants some discussion. On the PR we had kicked around the idea of deprecating parse_url() and creating a new function with the more compliant parser, but opted against it.

I'm looking for discussion on whether a total replacement is the preferred way to go about this, and whether we should be making parse_url() stricter about the standard at all, since today it deviates from RFC3986 in many ways that nonetheless provide semi-reasonable parsing patterns.

--
Dave

[1] - https://bugs.php.net/bug.php?id=72811
[2] - https://github.com/staskobzar/url_parser_re2c
[3] - https://tools.ietf.org/html/rfc3986
[4] - https://github.com/php/php-src/pull/2079
[5] - https://wiki.php.net/rfc/replace_parse_url