On Fri, Dec 16, 2022 at 01:44:15AM -0500, John Lauro wrote:
> What exactly is needed to reproduce the poor performance issue with openssl
> 3? I was able to test 20k req/sec with it using k6 to simulate 16k users
> over a wan. The k6 box did have openssl1. Probably could have sustained
> more, but that's all I need right now. Openssl v1 tested a little faster,
> but within 10%. Wasn't trying to max out my tests as that should be over
> 4x the needed performance.

It mainly depends on the number of CPU cores. What's happening is that
in 1.1.0 they silently removed the support for the locking callbacks
(these are now ignored) and switched to pthread_mutex instead, without
realizing that in case of contention, syscalls would be emitted. Using
syscalls for tiny operations is already not good, but it got even worse
in the post-SPECTRE era. And in 3.0 they made lots of stuff much more
dynamic, with locks everywhere. I measured about 80 lock/unlock
sequences for a single request!

The problem is that once the load becomes sufficient for threads to
compete on a lock, one of them goes into the system and sleeps there.
And that's when you start seeing native_queued_spin_lock_slowpath() eat
all your CPU. Worse, the time wasted sleeping in the system is so huge
compared to the tiny operations that the lock was meant to protect that
this time is definitely lost, and the system can never recover from this
loss because work continues to accumulate. So you can observe good
performance until the load gets too high, at which point you have to
lower the load significantly to recover. The worst I've seen was the
client mode with performance going down from 74k cps to 400 cps on a
24-core machine, i.e. performance divided by almost 200!

> Not doing H3, and the backends are send-proxy-v2.
> Default libs on Alma linux on arm.
> # rpm -qa | grep openssl
> openssl-pkcs11-0.4.11-7.el9.aarch64
> xmlsec1-openssl-1.2.29-9.el9.aarch64
> openssl-libs-3.0.1-43.el9_0.aarch64
> openssl-3.0.1-43.el9_0.aarch64
> openssl-devel-3.0.1-43.el9_0.aarch64
>
> This is the first box I setup with EL9 and thus openssl-3. Might it only
> be an issue when ssl is used to the backends?

That's where it has the highest effect, sadly, mostly with
renegotiation. If you intend to run at fewer than a few thousand
connections per second it could possibly be OK. Emeric collected some
numbers, and we'll soon post them (but bear with us, it takes time to
aggregate everything).

Also, I don't know if you're using HTTP on the backends, but if so, you
should normally mostly benefit from keep-alive and connection reuse. If
you want to reproduce these issues, make sure you disable http-reuse
(http-reuse never), and disable session resumption on the "server" lines
("no-ssl-reuse"). And never forget to run "perf top" on the machine to
see where the CPU is spent.

Willy
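
PS: for reference, the two reproduction settings mentioned above could
be combined roughly as in the sketch below. This is only an
illustration: the backend name, server name and address are
placeholders, and "ssl verify none" is an assumption for a throwaway
test setup, not something to copy into production.

```
# Hypothetical test backend; "be_repro", "srv1" and 192.0.2.10 are placeholders.
backend be_repro
    mode http
    # Never reuse idle server connections, so each request opens a new one
    http-reuse never
    # "no-ssl-reuse" disables TLS session resumption towards the server,
    # forcing a full handshake on every new connection.
    # "verify none" is an assumption made for a lab setup only.
    server srv1 192.0.2.10:443 ssl verify none no-ssl-reuse
```

With such a backend under load, "perf top" on the HAProxy machine should
show whether the CPU time ends up in
native_queued_spin_lock_slowpath().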