Daniel,

This is obviously a "big" question whose answer will likely take months to really determine. But we can get started :)

On 11/27/23 08:59, Daniel Andres Pelaez Lopez wrote:
We are facing some challenges with performance tuning for embedded
Tomcat using Spring Boot 3 (Tomcat version 10.1.7) and we would like
to ask for advice. The following is an overview of what our workload
looks like:
- The client is a CDN distributed around the world
- Tomcat serves files and media for video streaming, around hundreds
of kilobytes by media file
- The files and media are in memory (most of the time)
- The CDN opens a lot of keep-alive HTTPS connections; we have seen up to 25000
- There is no proxy or similar in front of Tomcat. Tomcat is handling
the HTTPs connection directly
- We have only one instance of Tomcat running.
- We are avoiding scaling Tomcat horizontally, as it is pretty hard
for our problem domain
- We can scale Tomcat up, today in some cases we are using an EC2 with
64 cores and 62 GiB memory. We can scale up more if we must, but
better if we can downscale instead.
- The EC2 is shared with other processes, like transcoders. This is to
decrease the latency as much as we can between the components of the
solution
- We have virtual threads active in Tomcat
- We have seen up to 2000 requests/second for light files (less than
10 kilobytes), and 500 requests/second for bigger files.
- Spike requests happen in a short time, from 100 requests/second to
1700 requests/second in 2 minutes.
>
We have seen the server eating 75 % of CPU, so we want to optimize
Tomcat as much as we can in order to downscale the machine.

Thank you for the summary.

We have researched and we found some possible points to check:
- Should we use NIO or NIO2 connectors? I didn't find an answer for
this, we are using NIO. Maybe NIO2 handles better a lot of keep alive
connections?

NIO vs NIO2 shouldn't matter much. If it were me, I'd stick with NIO since it gets /much/ more usage than NIO2 and most of the issues in NIO that NIO2 was supposed to resolve have actually been fixed in NIO itself retroactively.

- Should we use tcnative to improve the performance for SSL? We are
concerned about virtual threads and possible pinning here, as this
might use JNI

If you require TLS, then tcnative is definitely an option you will want to consider. In most of our tests, OpenSSL outperforms JSSE's cryptographic implementation significantly (something like 2x improvement with OpenSSL).

Your use of Virtual Threads might complicate things, here, but the good news is that I/O through JNI -- which would pin a Virtual Thread to a Platform Thread -- should be "fast". It seems that your VM is mostly dedicated to pushing bytes around, anyway, so maybe letting it use the CPU to push them around isn't so bad.

Only testing will tell you whether this is a "good idea" or a "bad idea". I suspect it will be a little of both for you.

- Should we put a nginx or similar server in front of Tomcat to handle
SSL? we are avoiding this for latency reasons, and also, nginx will
add up to the other processes we have in the same machine

Using Tomcat with JSSE+OpenSSL or even APR+OpenSSL is essentially the same as using Apache httpd. I don't have enough experience with nginx to know if it's much different, but I suspect not. The time "wasted" re-interpreting everything -- not just TLS but also HTTP itself -- will likely cancel out any gains you get by adding them to the mix.

Now, if Apache httpd, Nginx, etc. can get you *caching* as well as TLS termination, etc. then maybe it's worth it. But my guess is that the CDN itself is supposed to be the primary cache in this equation.

- Should we increase maxKeepAliveRequests? We don't entirely understand
how this works: is it the maximum number of requests per keep-alive
connection? Parallel requests or sequential? It seems the default is
100, and we should probably increase it, as the CDN might not open more
connections if it can send more requests on existing ones.

This might be a good idea, depending upon how much "traffic" each of your persistent connections actually gets. If you find that KeepAlive connections are being "wasted" then you might want to limit the total number of requests each connection will allow. My guess is that you probably want to re-use the connections from the CDN for as long as you possibly can.

You will want to use NIO connectors here and specifically /not/ APR if you are going to use tcnative/OpenSSL because APR-keep-alive is a *blocking* operation which will kill your threads.
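To put a rough number on the keep-alive limit, here is a back-of-the-envelope sketch in plain Java. It assumes the limit counts requests served one after another on a single connection, and that every connection is re-used continuously until it reaches the limit, which real CDN traffic will not do exactly. The class and method names are made up for illustration; the 2000 requests/second figure and the default of 100 come from your message, and -1 is the value Tomcat documents as "unlimited".

```java
public class KeepAliveMath {

    /**
     * Rough steady-state estimate of how many connections per second the
     * server closes because they reached the per-connection request limit.
     * Assumption: each connection is re-used back-to-back until the limit.
     */
    public static double forcedClosesPerSecond(double requestsPerSecond,
                                               int maxKeepAliveRequests) {
        if (maxKeepAliveRequests < 0) {
            // -1 means "unlimited": the limit never forces a close.
            return 0.0;
        }
        // A limit of 1 means one request per connection (keep-alive off).
        // Clamping 0 to 1 is an assumption of this sketch, to avoid
        // dividing by zero.
        int perConnection = Math.max(1, maxKeepAliveRequests);
        return requestsPerSecond / perConnection;
    }

    public static void main(String[] args) {
        double peak = 2000.0; // peak request rate quoted for small files
        for (int limit : new int[] {100, 1000, 10000, -1}) {
            System.out.println("maxKeepAliveRequests=" + limit
                    + " -> forced closes/s ~ "
                    + forcedClosesPerSecond(peak, limit));
        }
    }
}
```

With the default of 100, the peak rate works out to about 20 forced closes per second under these assumptions, and each of those costs the CDN a new TCP connection and TLS handshake. Whether that matters next to everything else the machine is doing is something only measurement will tell you.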

- Should we increase socket.txBufSize? seems like we should, as we are
sending media files, having a bigger buffer makes sense
- Should we use direct buffers socket.directSslBuffer?
- Should we increase the socket.appWriteBufSize?

These settings are detailed-enough that I'm not a good person to answer those questions. I suspect that increasing the socket transmission buffer might help, but it all comes down to how much data ends up being encapsulated in the TCP/IP packets and not really the local transmit buffer itself -- though they may be very closely-related.
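One thing that is easy to check before changing anything: socket.txBufSize is documented as the socket send buffer (SO_SNDBUF) size, and the JDK treats that value as a hint which the operating system may adjust. The small stand-alone probe below (plain JDK, no Tomcat involved; the class name is made up for illustration) requests a few sizes on an unconnected socket and prints what the OS reports back. The results vary by platform and kernel settings, so treat it as a way to see what your EC2 instance will actually grant, not as a recommendation for any particular value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;

public class SendBufferProbe {

    /**
     * Requests a send buffer size (SO_SNDBUF) on an unconnected socket and
     * returns the size the operating system reports afterwards.
     */
    public static int effectiveSendBuffer(int requestedBytes) {
        // The socket is never connected, so no network traffic is generated.
        try (Socket socket = new Socket()) {
            socket.setSendBufferSize(requestedBytes); // a hint; the OS may adjust it
            return socket.getSendBufferSize();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        for (int requested : new int[] {64 * 1024, 256 * 1024, 1024 * 1024}) {
            System.out.println("requested=" + requested
                    + " effective=" + effectiveSendBuffer(requested));
        }
    }
}
```

If the effective value stops growing past some point, the OS limit is in the way, and raising the Tomcat setting alone will not change anything.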

We are exploring JVM performance options also, but any help regarding
Tomcat will be appreciated.

I'm curious: did you experience any significant change in performance and/or CPU usage when switching to Virtual Threads?

-chris

