Hi

Am 2024-10-25 16:29, schrieb Eric Norris:
I'm especially concerned, because the documentation for
`curl_share_init()` uses `CURL_LOCK_DATA_COOKIE` as the example. I would
also assume that sharing a cookie jar amongst several requests is the
primary use case for leveraging a curl share handle, given that curl
already performs in-request connection reuse when reusing a single
CurlHandle or when performing requests with a CurlMultiHandle.

I don't believe that sharing a cookie jar is the primary use case; it
certainly wouldn't be for us. Our primary use case is DNS and
connection sharing, for the aforementioned fanout.

FWIW, the curl documentation also lists cookie sharing as the first thing when describing the "share interface": https://curl.se/libcurl/c/libcurl-share.html.

Accidentally sharing a cookie jar for unrelated requests due to a badly
chosen `$persistent_id` sounds like a vulnerability to is bound to
happen to someone.

I'll admit that I don't have a good response to this, since while I
agree this is possible, I don't think it outweighs the benefit that
persistence gives to the applications that need it (again, we won't be
using the cookie jar). I see you've noted concerns about
documentation, would this be something you'd at least feel more
comfortable with if curl persistence documentation included a strong
warning?

Yes, I would be *more* comfortable if the documentation would ensure to appropriately the possible issues with persisting the share handle across multiple requests, possibly giving suggestions as to how a good persistent ID could be constructed (e.g. by using a hash of the current security context or something like that).

I would also be more comfortable, if persisting a share handle with cookie jar sharing enabled would not be possible / would be ignored, because that only leaves the TLS session sharing (see below) as a possible safety issue.

Interestingly, one of the reasons that I began this work was that we
found a measurable performance impact of talking to the caching
resolver on localhost, let alone the connection overhead. Essentially

That honestly surprises me.

As for sharing a TLS or HTTP connection,
this might or might not lead to vulnerabilities similarly to cookie
sharing, when TLS client certificates are used for authentication.

I believe curl itself handles this - it won't re-use a connection if
the settings aren't the same.

Okay, the curl documentation does not say anything about this in https://curl.se/libcurl/c/CURLSHOPT_SHARE.html#CURLLOCKDATASSLSESSION.

re-raise it there. We do already have a curl_share_close method, so
the RFC would not need to add a new method, and since the method is
called curl_share_close, it's not really changing the meaning of the
method either.

That makes sense to me.

Also badly chosen `$persistent_id`s might result in a
large number of handles accumulating, without any kind of garbage
collection. For the existing persistent handles (e.g. for database
connections), the ID is chosen by PHP itself, ensuring a somewhat
predictable behavior.

I agree that this may be a risk, but it's not clear to me that this is
a risk that people would realistically run into, considering you'd
want to have a small number of persistent IDs in order to benefit from
re-using them. Dynamically picking your persistent IDs would lead to
less re-use.

I believe in misuse-resistant API design. Folks should be steered towards doing the right thing by default. As an example: What if a library internally choses to use persistent share handles for whatever reason? Or an add-on you install in your CMS software? PHP for better or worse is not just used for in-house software with an experienced development team.

that developers are reaching for. Frankly, I see very little reason to
use it without persistence, considering that curl_multi_init does the
same thing, and is simpler to use.

Yes. And that probably explains why you are not seeing much use of it until now. But if/when it gets “marketed” as the next-big-thing in performance by tech bloggers that don't understand the possible caveats, it will probably start to get used by folks that don't fully understand it.

Regarding the RFC text, I apologize for not filling it out more, but
it seemed straightforward to me: curl supports the share interface as
a primary feature, and there are no glaring warnings in the
documentation (https://curl.se/libcurl/c/libcurl-share.html)
indicating that you might hold the share interface incorrectly. As I

The difference to me is that when using libcurl directly, you would generally have a more explicit data flow and don't pull a curl share handle in an unknown state out of thin air. See the library example above.

After all the pros and cons would also need to be
documented in PHP’s documentation and the RFC text would be a good
source for the documentation team.

My plan was to help add the documentation (and tests) after this
passed; it was unclear to me that documentation may need to be sorted
out up-front in the RFC process. That said, I believe I can summarize

The actual documentation is generally written during the RC phase, but if the RFC author is no longer available when that happens, the documentation team needs something to work with.

the remaining cons from your email:

- A developer could reuse a cookie jar with cookies from user A in a
subsequent request from user B to origin Z if they opted in to
persistence, used CURL_LOCK_DATA_COOKIE, and used the same persistent
handle for both users.

Yes.

- A developer could balloon memory usage by using dynamic persistent
IDs, especially so if curl_share_close doesn't close them.

Yes.

If I were to add notes about these both to the documentation, and
curl_share_close closed persistent share handles, would you change
your vote for this or a future RFC?

I'm not sure about whether I consider documentation alone to be sufficient, but when share handles with cookie sharing enabled would not be persisted (and removed from persistence when enabling the option after the fact), then I would be fine with trusting libcurl to do the right thing. Ideally this could be confirmed with some reference to the libcurl source code or getting upstream confirmation that this is safe to do.

Best regards
Tim Düsterhus

Reply via email to