Hi Lari,

thanks for the proposal, though I don't think it would be good to
change this default value.

Pulsar HTTP API is an administrative service and it is not on the
critical data path.
If all the threads in the pool are busy, it is good to apply back
pressure and slow down the incoming new requests.

In the past, we had several cases of deadlocks involving HTTP Jetty
threads. In all these cases, the root cause has been a broker that in
response to an HTTP request is spawning HTTP requests to other
brokers, and some of these requests are going to itself.

These kinds of deadlock are unrelated to the number of HTTP threads in
the pool and must be dealt with by making the underlying operation
asynchronous.

Finally each user is free to change the default to something that
suits more, though I really don't think it makes for us to default to
200 threads as:
  1. It will not improve the Pulsar performance
  2. It will not magically solve any deadlock issue
  3. It will just consume much more resources for everyone


Replying in-line to some of the points.

>
> Additional context
>
> -   Examples of previous issues where Jetty threads have been occupied
>     and caused problems: #13666 #4756 #10619

All these examples are cases of what I have described above: an HTTP
handler triggering a call to the same HTTP server resulting in a
deadlock.

Having had more threads wouldn't have fixed them.

> -   Mailing list thread about “make async” changes:
>    https://lists.apache.org/thread/tn7rt59cd1k724l4ytfcmzx1w2sbtw7l

The rationale for these changes has not much to do with Jetty but
rather to make sure that we are not mixing blocking and non-blocking
operations when handling these requests, which is prone to cause
deadlocks in broker's internal threads (irrespective of any Jetty
thread pool size).

> Q: What’s the reason of setting the default value to 200? If the node just 
> have one core, what will happen?
>
> These are threads. Jetty defaults to 200 maximum threads, to prevent
> thread pool starvation. This is recommended when using blocking Servlet
> API. The problem is that Pulsar uses the blocking servlet API and
> doesn’t have a sufficient number of threads which are needed and
> recommended.

We need to keep in mind that typical usage of Jetty is as a Servlet
container, dedicated to that. You have servlets doing disk IO reads
and blocking calls on JDBC drivers.

Pulsar usage is *very* different from that. Admin API is a tiny
portion of what a broker does. Dedicating 200 threads to that would be
a massive waste of resources (CPU & memory).

> The value 200 doesn’t mean that there will be 200 threads to start with.
> This is the maximum size for the thread pool. When the value is more
> than 8, Jetty will start with 8 initial threads and add more threads to
> the pool when all threads are occupied.
>
> Q: Do we need to take the number of system cores into consideration for the 
> maximum threads of the thread pool?
>
> No. Jetty is different from Netty in this aspect. In Netty, everything
> should be asynchronous and “thou shall never block”. In Jetty, the
> maximum number of threads for the thread pool should be set to 50-500
> threads and blocking operations are fine.

I disagree. In a container configured with 0.5 CPU cores, it would be
very bad to run 200 threads, and it will have no other advantage
whatsoever.

That is the reason why the default thread pool size is pegged to the
number of cores.

> The recommendation for the thread pool is explained in Jetty
> documentation
> https://www.eclipse.org/jetty/documentation/jetty-9/index.html#_thread_pool
> > Thread Pool > Configure with goal of limiting memory usage maximum
> available. Typically this is >50 and <500

The recommendations for Jetty are relative to typical Jetty workloads,
something the Pulsar broker is definitely not.

Reply via email to