Does QAT not provide a way to detect what the hardware supports and expose that capability at construction time, so we can pick the fastest implementation the hardware supports? That seems more robust than inline exception handling, and consistent with how we handle the other native fallbacks, where we probe whether they are available and fall back to a different instance entirely if they are not.
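For illustration only, here is a minimal sketch of what that construction-time selection could look like. Every name in it (ChunkCompressor, QatProbe, the two implementations) is an invented stand-in, not the real ICompressor interface or the QAT plugin's API:

import java.util.logging.Logger;

// Hypothetical sketch: probe the hardware once and pick the implementation
// up front, instead of catching exceptions on every chunk. All names are
// invented for illustration and do not refer to the real plugin.
public final class CompressorSelector
{
    private static final Logger LOG = Logger.getLogger(CompressorSelector.class.getName());

    // Simplified stand-in for Cassandra's compressor interface.
    public interface ChunkCompressor
    {
        byte[] compress(byte[] chunk);
    }

    public static ChunkCompressor create()
    {
        if (QatProbe.isAvailable())
        {
            LOG.info("QAT hardware acceleration detected; using accelerated compressor");
            return new QatChunkCompressor();
        }
        LOG.info("QAT not available; using the default software compressor");
        return new JavaChunkCompressor();
    }

    static final class QatProbe
    {
        static boolean isAvailable()
        {
            // In reality this would ask the native library / driver whether a
            // usable device exists; hard-coded here to keep the sketch self-contained.
            return false;
        }
    }

    static final class QatChunkCompressor implements ChunkCompressor
    {
        public byte[] compress(byte[] chunk) { return chunk; } // accelerated path
    }

    static final class JavaChunkCompressor implements ChunkCompressor
    {
        public byte[] compress(byte[] chunk) { return chunk; } // software path
    }
}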
Agree the inline try-catch is inelegant and implies that sometimes QAT can succeed and sometimes it can fail. That should not be the case (either hardware acceleration exists or not).

-Joey

On Wed, Dec 17, 2025 at 9:31 AM Štefan Miklošovič <[email protected]> wrote:
> To be explicit, we are talking about this kind of fallbacking:
>
> https://gist.githubusercontent.com/smiklosovic/8efcdefadae0b6aae5c7eedd6cc948f7/raw/ae5716d077c1a37b4db901f81620f09d957dd303/gistfile1.txt
>
> I made a gist from that on the PR in case it gets updated / overwritten.
>
> The logic here is that the "QAT backed compressor" is used first, and when it fails we fall back to the one we have in Cassandra. I have not found the implementation of that plugin; it is said to be added later on.
>
> So it is not about "we start Cassandra and then, based on what is available, we pick what to de/compress with, and if QAT is not available we fall back to the stuff we have already". It is more about "we will put this plugin on the class path, which will effectively override the compressor we are using, AND IF THAT FAILS then, while we are de/compressing, we fall back to the default one we have".
>
> Do you see the slight difference in the semantics here when talking about "fallbacking"?
>
> On Wed, Dec 17, 2025 at 3:14 PM Joseph Lynch <[email protected]> wrote:
> >
> > Just noticed the discussion here, I think this is just another case of "native" code like we've done in the past. We try to load the native library (try to load up QAT); if that fails then we try finding the fastest implementation that works on the hardware they have. If you're running on, say, arm we are already falling back to pure java implementations of many things, for example (afaik we only have native implementations for crypto, compression and hashing on x86, but I might have missed the arm patches).
> >
> > So instead of, say, x86 native -> fast java (unsafe) -> slow java it would be qat -> x86 native -> slow java (since afaik we don't want to use unsafe anymore). A log line helps the operator know _which_ of these they've ended up with so they can debug why they are spending so many cycles where they are, but I don't think the fallback is intrinsically hazardous (we already do transparent fallbacks for TLS, Compression and Hashing afaik).
> >
> > -Joey
> >
> > On Wed, Dec 17, 2025 at 1:53 AM Štefan Miklošovič <[email protected]> wrote:
> >>
> >> As mentioned, some combination of logging + metrics + maybe dying or something else?
> >>
> >> I don't know for now, it is too soon / specific to deal with that, but _something_ should be done, heh. I do not want to block otherwise helpful and valuable contributions on these technicalities, but they should be addressed.
> >>
> >> The "interesting" aspect of this acceleration hardware is that if it is baked into the CPU and that fails, what are we actually supposed to do about it? I do not know the details too well here, but if it hypothetically failed, then what are we supposed to do, replace the CPU? Does a failure mean that the hardware as such is broken, or was the failure just intermittent? If a disk fails we can replace it and restart the machine and rebuild or whatever, or we can just replace the whole node.
> >>
> >> Anyway, we can always think about that more in follow-up tickets after the initial delivery, but logging in a non-spamming manner + metrics would be the minimum here imho.
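As a rough illustration of that "non-spamming logging + metrics" minimum, a sketch with invented names follows; the real patch would presumably use Cassandra's existing metrics and rate-limited logging facilities rather than plain java.util.logging:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical sketch: count every fallback in a metric so it can be charted,
// but emit a WARN at most once per interval so the log is not flooded.
public final class FallbackReporter
{
    private static final Logger LOG = Logger.getLogger(FallbackReporter.class.getName());
    private static final long WARN_INTERVAL_NANOS = TimeUnit.MINUTES.toNanos(1);

    // Would be exposed as a metric so dashboards show how often acceleration fails.
    private final AtomicLong fallbackCount = new AtomicLong();

    // Initialized so the very first failure is always logged.
    private volatile long lastWarnNanos = System.nanoTime() - WARN_INTERVAL_NANOS;

    public void onFallback(Throwable cause)
    {
        long total = fallbackCount.incrementAndGet();
        long now = System.nanoTime();
        if (now - lastWarnNanos >= WARN_INTERVAL_NANOS)
        {
            lastWarnNanos = now;
            LOG.warning("Hardware-accelerated compression failed, using software fallback"
                        + " (failures so far: " + total + "): " + cause);
        }
    }

    public long fallbackCount()
    {
        return fallbackCount.get();
    }
}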
> >>
> >> On Wed, Dec 17, 2025 at 1:27 AM Josh McKenzie <[email protected]> wrote:
> >> >
> >> > What if we went the same route we do for disk failure: have a sane default we collectively believe to be the "majority case", but also have a configuration knob in cassandra.yaml to choose a hard stop on failure if so inclined? Complexity is low, maintenance burden should be low.
> >> >
> >> > These discussions end up spinning trying to find the One Right Answer when there isn't one. You're right Stefan. And so is Scott. It depends. :)
> >> >
> >> > On Tue, Dec 16, 2025, at 2:11 PM, Štefan Miklošovič wrote:
> >> >
> >> > In the scenarios Scott described it does make sense to fall back, but I am not sure about that when there is production traffic happening and we rely on hardware de/compression and _that_ fails silently.
> >> >
> >> > It is one thing to not fail catastrophically when upgrading or changing nodes, or when machines with that hardware are not present, etc., and it is something different to actually expect that data will be de/compressed with some acceleration while we just swallow the exception and de/compress in software.
> >> >
> >> > My perception here is that Cassandra embraces the philosophy that if hardware fails, let it fail and change the hardware. Heck, we have a whole class of logic around what should happen if there is some kind of a disk failure.
> >> >
> >> > While here we are going to act as if, when the very hardware I am supposed to de/compress with fails to do so, I just fall back to software and ... that's it? Should there not be some kind of a mechanism to also die when something goes wrong here?
> >> >
> >> > On Tue, Dec 16, 2025 at 7:10 PM Josh McKenzie <[email protected]> wrote:
> >> > >
> >> > > As a user, I'd rather have a WARN in my logs than to be unable to start the database without changing cluster-wide configuration like schema / compaction parameters.
> >> > >
> >> > > Strong +1 here.
> >> > >
> >> > > While on the one hand we expect homogeneous hardware environments for clusters, to Scott's point that's not always going to hold true in containerized and cloud-based environments. Definitely think we need to let the operators know, but graceful degradation of the database (in a step-wise, plateau-based fashion like this, not a death-spiral scenario, to be clear) is much preferred IMO.
> >> > >
> >> > > On Tue, Dec 16, 2025, at 10:32 AM, Štefan Miklošovič wrote:
> >> > >
> >> > > Okay, I guess that is a good compromise to make here. So warnings in the logs + metrics? I think metrics would be cool to have so we might chart how often it happens, etc.
> >> > >
> >> > > On Tue, Dec 16, 2025 at 4:27 PM C. Scott Andreas <[email protected]> wrote:
> >> > > >
> >> > > > One example where lack of a fallback would be problematic is:
> >> > > >
> >> > > > – User provisions AWS metal-class instances that expose hardware QAT and adopts.
> >> > > > – User needs to expand the cluster or replace failed hardware.
> >> > > > – Insufficient hardware-QAT-capable machines are available from AWS.
> >> > > > – Cassandra is unable to start on replacement/expanded machines due to lack of fallback.
> >> > > >
> >> > > > There are a handful of cases where the database performs similar fallbacks today, such as attempting mlockall on startup for improved memory locality and to avoid allocation stalls.
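Sketching the cassandra.yaml knob Josh describes above, purely hypothetically; neither the option nor the enum exists today, it just mirrors the shape of disk_failure_policy:

// Purely hypothetical sketch of a cassandra.yaml-style knob for accelerator
// failures, mirroring the shape of disk_failure_policy: degrade gracefully by
// default, but let operators opt into a hard stop. Nothing here exists yet.
public final class AcceleratorFailurePolicy
{
    public enum Policy { WARN_AND_FALLBACK, DIE }

    private final Policy policy;

    public AcceleratorFailurePolicy(Policy policy) // e.g. parsed from cassandra.yaml
    {
        this.policy = policy;
    }

    // Called by the compression path when the accelerated compressor fails.
    public void onFailure(RuntimeException cause)
    {
        if (policy == Policy.DIE)
            throw cause; // operators who consider acceleration mandatory stop hard
        // WARN_AND_FALLBACK: log + bump metrics, then the caller uses the software compressor
    }
}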
> >> > > >
> >> > > > As a user, I'd rather have a WARN in my logs than to be unable to start the database without changing cluster-wide configuration like schema / compaction parameters.
> >> > > >
> >> > > > – Scott
> >> > > >
> >> > > > On Dec 16, 2025, at 5:18 AM, Štefan Miklošovič <[email protected]> wrote:
> >> > > >
> >> > > > I am open to adding some kind of metrics when it falls back, to track if / how often it failed by hardware etc. Wondering what others think about fallbacking just like that. I feel like something is not transparent to a user who relies on hardware compression in the first place.
> >> > > >
> >> > > > On Tue, Dec 16, 2025 at 1:52 PM Štefan Miklošovič <[email protected]> wrote:
> >> > > >
> >> > > > My personal preference is to not do any fallbacking. The reason for that is that failures should be transparent, and if it is meant to fail, so be it.
> >> > > >
> >> > > > If we wrap it in a try-catch and fall back, then a user thinks that everything is just fine, right? There is no visibility into whether and how often it failed so that a user can act on it. By fallbacking, a user is kind of misled: they think that all is just fine, yet they cannot wrap their head around the fact that they bought hardware which says their compression will be accelerated while, looking at their dashboards, they every now and then see the same performance as if they were compressing in software.
> >> > > >
> >> > > > If they see that it is failing, then they can reach out to the vendor of such hardware and raise complaints and issues with it, so the vendor's engineers can look into why it failed and how to fix it. Instead of just wrapping it in one try-catch and acting like all is actually fine. A user bought hardware to compress with; I do not think they are interested in "best effort" here. If that hardware fails, or the software which is managing it is erroneous, then it should be either fixed or replaced.
> >> > > >
> >> > > > On Tue, Dec 16, 2025 at 2:29 AM Kokoori, Shylaja <[email protected]> wrote:
> >> > > > >
> >> > > > > Hi Stefan,
> >> > > > > Thank you very much for the feedback.
> >> > > > > You are correct, QAT is available on-die and not hot-plugged, and under normal circumstances we shouldn't encounter this exception. However, I wanted to add reverting to the base compressor to make it fault-tolerant.
> >> > > > >
> >> > > > > While the QAT software stack includes built-in retries and software fallbacks for scenarios when devices end up being busy etc., I didn't want operations to fail due to transient hardware issues when they otherwise would have succeeded. An example would be some unrecoverable error occurring during a compress/decompress operation, whether due to a hardware issue or caused by related software libraries; in that case the system can gracefully revert to the base compressor rather than failing the operation entirely.
> >> > > > >
> >> > > > > I am open to other suggestions.
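For reference, the per-chunk fallback under discussion looks roughly like the sketch below, reconstructed from the descriptions above and the linked gist; the names and the byte[] signature are simplifications, not the actual PR:

import java.io.IOException;

// Rough reconstruction (invented names) of the per-operation fallback described
// above: try the accelerated plugin compressor for each chunk and silently
// revert to the base compressor when it throws.
public final class FallbackChunkCompressor
{
    public interface ChunkCompressor
    {
        byte[] compress(byte[] chunk) throws IOException;
    }

    private final ChunkCompressor plugin;   // e.g. the QAT-backed compressor
    private final ChunkCompressor fallback; // the existing software compressor

    public FallbackChunkCompressor(ChunkCompressor plugin, ChunkCompressor fallback)
    {
        this.plugin = plugin;
        this.fallback = fallback;
    }

    public byte[] compress(byte[] chunk) throws IOException
    {
        try
        {
            return plugin.compress(chunk);
        }
        catch (IOException e)
        {
            // Runs on every chunk; without logging or metrics the operator never
            // sees that acceleration is no longer being used.
            return fallback.compress(chunk);
        }
    }
}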
> >> > > > > Thanks,
> >> > > > > Shylaja
> >> > > > > ________________________________
> >> > > > > From: Štefan Miklošovič <[email protected]>
> >> > > > > Sent: Monday, December 15, 2025 2:50 PM
> >> > > > > To: [email protected] <[email protected]>
> >> > > > > Subject: Re: [VOTE] CEP-49: Hardware-accelerated compression
> >> > > > >
> >> > > > > Hi Shylaja,
> >> > > > >
> >> > > > > I am going through the CEP so I can make a decision when voting, and I want to clarify a few things.
> >> > > > >
> >> > > > > You say there:
> >> > > > >
> >> > > > > Both the default compressor instance and a plugin compressor instance (obtained from the provider), will be maintained by Cassandra. For subsequent read/write operations, the plugin compressor will be used. However, if the plugin version encounters an error, the default compressor will handle the operation.
> >> > > > >
> >> > > > > Why are we doing this kind of "fallback"? Under what circumstances does "the plugin version encounter an error"? Why would it? It might be understandable to do it like that if that compression accelerator were some "plug and play" device, or if we could just remove it from a running machine. But this does not seem to be the case? The QAT you are mentioning is baked into the CPU, right? It is not like we would decide to just turn it off suddenly at runtime so that the database would need to deal with it.
> >> > > > >
> >> > > > > The reason I am asking is that I just briefly went over the PR, and the way it works there is that if plugin de/compression is not possible (it throws an IOException), then it will default to a software solution. This is done for every single de/compression of a chunk.
> >> > > > >
> >> > > > > Is this design an absolute must?
> >> > > > >
> >> > > > > On Mon, Dec 15, 2025 at 8:14 PM Josh McKenzie <[email protected]> wrote:
> >> > > > > >
> >> > > > > > Yes but it's in reply to the discussion thread and so it threads that way in clients
> >> > > > > >
> >> > > > > > Apparently not in fastmail's client because it shows up as its own thread for me. /sigh
> >> > > > > >
> >> > > > > > Hence the confusion. Makes sense now.
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025, at 1:18 PM, Kokoori, Shylaja wrote:
> >> > > > > >
> >> > > > > > Thank you for your feedback, Patrick & Brandon. I have created a new email thread like you suggested. Hopefully, this works.
> >> > > > > >
> >> > > > > > -Shylaja
> >> > > > > >
> >> > > > > > ________________________________
> >> > > > > > From: Patrick McFadin <[email protected]>
> >> > > > > > Sent: Monday, December 15, 2025 9:26 AM
> >> > > > > > To: [email protected] <[email protected]>
> >> > > > > > Subject: Re: [VOTE] CEP-49: Hardware-accelerated compression
> >> > > > > >
> >> > > > > > That was my point. It's a [DISCUSS] thread.
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025 at 9:22 AM Brandon Williams <[email protected]> wrote:
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025 at 11:13 AM Josh McKenzie <[email protected]> wrote:
> >> > > > > > >
> >> > > > > > > Can you put this into a [VOTE] thread?
> >> > > > > > >
> >> > > > > > > I'm confused - isn't the subject of this email [VOTE]?
> >> > > > > >
> >> > > > > > Yes but it's in reply to the discussion thread and so it threads that way in clients, making it easy to overlook.
> >> > > > > >
> >> > > > > > Kind Regards,
> >> > > > > > Brandon
