On Mon, 27 Jun 2022 at 22:32, Pierrick Charron <pierr...@adoy.net> wrote:

> That's just an example with an old version of PHP, but let's say you have
> some code that makes requests but only to a specific list of servers, so
> you want to analyze the URL and check if the host is in a whitelist. If the
> provided URL is "http://127.0.0.1:11211#@google.com:80/"; and that you
> used PHP <= 7.0.13 your parse_url function would tell you that the domain
> you're trying to request is google.com so everything is fine, but in fact
> when the call to curl is made, curl would call 127.0.0.1. This one was
> fixed but the problem could still occur if the parser is not the same as
> the one used in the requester.
>

...


> You can use CurlUrl within your implementation of UriInterface but for the
> same reason if you're using another request engine than curl, you may have
> the same security problem where curl will not parse the same data. If you
> want to make sure that your CurlUrl object represents the same thing as
> your UriInterface you could build the CurlUrl object part by part using
> your UriInterface. When you assign your CurlUrl to your CurlHandle with the
> CURLOPT_CURLU option, curl will use the parts directly instead of parsing
> the URL again, so you're sure that the host will be the one you set with
> `CurlUrl::setHost()` and so on.
>


This is actually a lot trickier than it sounds.

Imagine this code, with the bug you gave as an example still present in one
of the libraries we're interacting with:

$url = new MyUrlObject("http://127.0.0.1:11211#@google.com:80/";);
var_dump($url->getHostName()); // ???

This won't work, because we don't know which parser to call; it needs to be
something like this:

var_dump($url->getHostName(UrlContext::CURL)); // '127.0.0.1'
var_dump($url->getHostName(UrlContext::BROKEN_PHP)); // 'google.com'

Then we call this:

$url->setHostName("duckduckgo.com");
var_dump($url->getFullUrl(); // ???

Again, the result depends on context:

var_dump($url->getFullUrl( UrlContext::CURL )); // "
http://duckduckgo.com#@google.com:80/";
var_dump($url->getFullUrl( UrlContext::BROKEN_PHP )); // "
http://127.0.0.1:11211#@duckduckgo.com:80/";

But that means we can't actually apply the setHostName change until the
call to getFullUrl(), because we don't know how the original URL will be
parsed. Instead, the object has to internally store a queue of applied
modifications, and then reproduce them, in order, on the underlying
implementation.

Alternatively, we can build our implementation to assume it will always be
used in the context of curl, or for debugging curl, so can just have
getHostName() and setHostName() directly use curl's implementation. Which
leads us back to where we started, of using CurlUrl directly or via a thin
wrapper, as either a URL parser+builder, or as a value object in its own
right.

I don't really know how to make all this nuance clear to users, but that
makes me a bit wary of adding the object to PHP in its current design.

Regards,
-- 
Rowan Tommins
[IMSoP]

Reply via email to