On Wed, Dec 14, 2022 at 07:01:59AM -0700, Shawn Heisey wrote:
> On 12/14/22 06:07, Willy Tarreau wrote:
> > By the way, are you running with OpenSSL 3.0 ? That one is
> > absolutely terrible and makes extreme abuse of mutexes and locks,
> > to the point that certain workloads were divided by 2-digit numbers
> > between 1.1.1 and 3.0. It took me one day to figure out that my
> > load generator, which was capping at 400 conn/s, was in fact
> > suffering from an accidental build using 3.0, while with 1.1.1 the
> > perf went back to 75000/s!
>
> Is this a current problem with the latest openssl built from source?
Yes, and it goes deeper than that: there's even a meta-issue on the
project trying to collect the many reports of massive performance
regressions:

  https://github.com/openssl/openssl/issues/17627#issuecomment-1060123659

> I'm running my 2.7.x installs with quictls 3.0.7, which aside from
> the QUIC support should be the same as openssl.

With new distros progressively moving to 3.0, the problem is getting
more and more exposure. And with 1.1.1 support ending soon, it's going
to become a huge problem for many high-performance users.

> 400 connections per second is a lot more than I need, but if it's
> that inefficient, seems like overall system performance would take a
> hit even if it's not completely saturated. My primary server has dual
> E5-2697 v2 CPUs, but my mail server is a 2-CPU AWS instance.

Actually you're in the same situation as plenty of users who don't
need this level of performance and will not necessarily notice the
problem until they face a traffic spike and the machine collapses.

> Should I switch to quictls 1.1.1 instead?

Possibly, yes. From what we can see it's more efficient in every way.
For users who build it themselves (and with QUIC right now you don't
have a better choice), switching should not change anything and will
preserve robustness. For those relying on the distro's package, I
don't know whether the distro's previous package can be installed
side by side, but in any case it can quickly become a mess to deal
with.

But if you're running at low loads and ideally not exposed to the net,
it's unlikely that you'd notice it. What's really happening is that in
order to make things more dynamic, they've apparently replaced lots of
constants with functions that walk lists under locks. At very low load
the overhead remains minimal, but once the load increases and multiple
threads need to access the same elements, contention kicks in. To give
you an idea, during a test I measured up to 80 calls to an rwlock for
a single HTTP request... Mutexes are so expensive that they should be
avoided by all means in low-level functions, and in the worst case
kept to a single-digit count per operation. At 80 per request, the
machine has no chance to ever recover once a short traffic spike hits
it.
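If you want to get a feel for the effect, here is a contrived sketch
(nothing to do with OpenSSL's actual code, just plain pthreads) where
each simulated request takes a shared rwlock 80 times, roughly what I
measured per request. Run it with 1 thread, then with one thread per
core, and watch the time per request climb, because even read locks
bounce the lock's cache line between cores:

/* Contrived sketch (not OpenSSL code): each "request" takes a shared
 * rwlock LOCKS_PER_REQUEST times. The per-request cost rises with the
 * number of threads even though nobody ever takes the write lock.
 *
 * Build: cc -O2 -pthread rwlock-demo.c -o rwlock-demo
 * Usage: ./rwlock-demo <threads>
 */
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define REQUESTS_PER_THREAD 100000
#define LOCKS_PER_REQUEST   80

static pthread_rwlock_t lock = PTHREAD_RWLOCK_INITIALIZER;

static void *worker(void *arg)
{
	(void)arg;
	for (int r = 0; r < REQUESTS_PER_THREAD; r++) {
		for (int i = 0; i < LOCKS_PER_REQUEST; i++) {
			pthread_rwlock_rdlock(&lock);
			pthread_rwlock_unlock(&lock);
		}
	}
	return NULL;
}

int main(int argc, char **argv)
{
	int nthreads = argc > 1 ? atoi(argv[1]) : 1;
	pthread_t tid[64];
	struct timespec start, end;
	double sec;

	if (nthreads < 1 || nthreads > 64)
		return 1;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < nthreads; i++)
		pthread_create(&tid[i], NULL, worker, NULL);
	for (int i = 0; i < nthreads; i++)
		pthread_join(tid[i], NULL);
	clock_gettime(CLOCK_MONOTONIC, &end);

	sec = (end.tv_sec - start.tv_sec) +
	      (end.tv_nsec - start.tv_nsec) / 1e9;
	printf("%d thread(s): %.2f us per request\n", nthreads,
	       sec * 1e6 / ((double)REQUESTS_PER_THREAD * nthreads));
	return 0;
}

Of course that's only a rough model: there are no writers at all here,
so the only cost is the atomic traffic on the lock word itself. The
point is simply that the cost of a lock operation is not a constant,
it grows with the number of threads hammering it, which is why you
want at most a handful of them on a fast path, not 80.

Willy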