After rolling out our Apache HttpClient 4 -> 5 upgrade again (this time with
idle connection validation configured), we got the following report from someone
who discovered an issue in load testing. The service he is testing has a
dependency that is called with high fanout (up to 30 times per request),
with requests made in parallel from a thread pool. During load testing, the
service locked up completely; even after traffic had been stopped for two
hours, every request to the service still failed.
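
For anyone who wants to approximate the load shape described above, here is a rough standalone sketch. It uses only `java.util.concurrent`. The `Semaphore` is a hypothetical stand-in for the connection pool, not the real `PoolingHttpClientConnectionManager`, and the class name, pool size, sleep, and timeout are all assumptions chosen for illustration. It shows the fanout pattern only and does not reproduce the lockup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

public class FanoutSketch {

    // Runs `calls` parallel "dependency calls" against a bounded pool of
    // `poolSize` permits and returns how many of them completed.
    // Hypothetical model: the Semaphore stands in for a connection pool.
    static int fanOut(int calls, int poolSize) throws Exception {
        Semaphore pool = new Semaphore(poolSize);
        ExecutorService executor = Executors.newFixedThreadPool(calls);
        try {
            List<Callable<Boolean>> tasks = new ArrayList<>();
            for (int i = 0; i < calls; i++) {
                tasks.add(() -> {
                    // Lease with a timeout so a stuck pool shows up as a
                    // failed call instead of a thread that blocks forever.
                    if (!pool.tryAcquire(5, TimeUnit.SECONDS)) {
                        return false;
                    }
                    try {
                        Thread.sleep(10); // simulated dependency latency
                        return true;
                    } finally {
                        pool.release(); // always hand the lease back
                    }
                });
            }
            int succeeded = 0;
            // invokeAll blocks until every task has finished.
            for (Future<Boolean> f : executor.invokeAll(tasks)) {
                if (f.get()) {
                    succeeded++;
                }
            }
            return succeeded;
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Fanout of 30 as in the report; a pool size of 8 is an assumption.
        System.out.println("succeeded: " + fanOut(30, 8));
    }
}
```

In this model a lease that is never released would permanently shrink the pool, which is one way a pool could stay wedged after traffic stops. Whether that is what happens in the real client is an open question.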

Two different types of exceptions were observed:

1. `CancellationException` on `BasicFuture.get()` when trying to lease a
   connection from `PoolingHttpClientConnectionManager`
2. `java.lang.IllegalStateException: Endpoint not acquired / already released`
   at
   `com.amazon.coral.apache.hc.client5.http.impl.classic.InternalExecRuntime.ensureValid(InternalExecRuntime.java:142)`

I suspect this user is hitting some form of resource exhaustion (file
descriptors? ephemeral ports?), which in turn triggers a failure mode in
the client that leaves the connection pool locked up. Thoughts?
