Does QAT not provide a way to detect what the hardware supports and expose that capability at construction time, so we can pick the fastest implementation the hardware supports? That seems more robust than inline exception handling, and consistent with how we handle the other native fallbacks, where we probe whether they are available and fall back to a different instance entirely if they are not.
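For illustration only, here is a minimal sketch of what that construction-time selection could look like. Every name in it (ChunkCompressor, QatProbe, the two implementations) is an invented stand-in, not the real ICompressor interface or the QAT plugin's API:

import java.util.logging.Logger;

// Hypothetical sketch: probe the hardware once and pick the implementation
// up front, instead of catching exceptions on every chunk. All names are
// invented for illustration and do not refer to the real plugin.
public final class CompressorSelector
{
    private static final Logger LOG = Logger.getLogger(CompressorSelector.class.getName());

    // Simplified stand-in for Cassandra's compressor interface.
    public interface ChunkCompressor
    {
        byte[] compress(byte[] chunk);
    }

    public static ChunkCompressor create()
    {
        if (QatProbe.isAvailable())
        {
            LOG.info("QAT hardware acceleration detected; using accelerated compressor");
            return new QatChunkCompressor();
        }
        LOG.info("QAT not available; using the default software compressor");
        return new JavaChunkCompressor();
    }

    static final class QatProbe
    {
        static boolean isAvailable()
        {
            // In reality this would ask the native library / driver whether a
            // usable device exists; hard-coded here to keep the sketch self-contained.
            return false;
        }
    }

    static final class QatChunkCompressor implements ChunkCompressor
    {
        public byte[] compress(byte[] chunk) { return chunk; } // accelerated path
    }

    static final class JavaChunkCompressor implements ChunkCompressor
    {
        public byte[] compress(byte[] chunk) { return chunk; } // software path
    }
}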
Agree the inline try-catch is inelegant and implies that sometimes QAT can succeed and sometimes it can fail. That should not be the case (either hardware acceleration exists or not).

-Joey

On Wed, Dec 17, 2025 at 9:31 AM Štefan Miklošovič <[email protected]> wrote:
> To be explicit, we are talking about this kind of fallbacking:
>
> https://gist.githubusercontent.com/smiklosovic/8efcdefadae0b6aae5c7eedd6cc948f7/raw/ae5716d077c1a37b4db901f81620f09d957dd303/gistfile1.txt
>
> I made a gist from that on the PR in case it gets updated / overwritten.
>
> The logic here is that the "QAT backed compressor" is used first, and when it fails we fall back to the one we have in Cassandra. I have not found the implementation of that plugin; it is said to be added later on.
>
> So it is not about "we start Cassandra and then, based on what is available, we pick what to de/compress with, and if QAT is not available we fall back to the stuff we have already". It is more about "we will put this plugin on the class path, which will effectively override the compressor we are using, AND IF THAT FAILS then, while we are de/compressing, we fall back to the default one we have".
>
> Do you see the slight difference in the semantics here when talking about "fallbacking"?
>
> On Wed, Dec 17, 2025 at 3:14 PM Joseph Lynch <[email protected]> wrote:
> >
> > Just noticed the discussion here, I think this is just another case of "native" code like we've done in the past. We try to load the native library (try to load up QAT); if that fails then we try finding the fastest implementation that works on the hardware they have. If you're running on, say, arm we are already falling back to pure java implementations of many things, for example (afaik we only have native implementations for crypto, compression and hashing on x86, but I might have missed the arm patches).
> >
> > So instead of, say, x86 native -> fast java (unsafe) -> slow java it would be qat -> x86 native -> slow java (since afaik we don't want to use unsafe anymore). A log line helps the operator know _which_ of these they've ended up with so they can debug why they are spending so many cycles where they are, but I don't think the fallback is intrinsically hazardous (we already do transparent fallbacks for TLS, Compression and Hashing afaik).
> >
> > -Joey
> >
> > On Wed, Dec 17, 2025 at 1:53 AM Štefan Miklošovič <[email protected]> wrote:
> >>
> >> As mentioned, some combination of logging + metrics + maybe dying or something else?
> >>
> >> I don't know for now, it is too soon / specific to deal with that, but _something_ should be done, heh. I do not want to block otherwise helpful and valuable contributions on these technicalities, but they should be addressed.
> >>
> >> The "interesting" aspect of this acceleration hardware is that if it is baked into the CPU and that fails, what are we actually supposed to do about it? I do not know the details too well here, but if it hypothetically failed, then what are we supposed to do, replace the CPU? Does a failure mean that the hardware as such is broken, or was the failure just intermittent? If a disk fails we can replace it and restart the machine and rebuild or whatever, or we can just replace the whole node.
> >>
> >> Anyway, we can always think about that more in follow-up tickets after the initial delivery, but logging in a non-spamming manner + metrics would be the minimum here imho.
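As a rough illustration of that "non-spamming logging + metrics" minimum, a sketch with invented names follows; the real patch would presumably use Cassandra's existing metrics and rate-limited logging facilities rather than plain java.util.logging:

import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;
import java.util.logging.Logger;

// Hypothetical sketch: count every fallback in a metric so it can be charted,
// but emit a WARN at most once per interval so the log is not flooded.
public final class FallbackReporter
{
    private static final Logger LOG = Logger.getLogger(FallbackReporter.class.getName());
    private static final long WARN_INTERVAL_NANOS = TimeUnit.MINUTES.toNanos(1);

    // Would be exposed as a metric so dashboards show how often acceleration fails.
    private final AtomicLong fallbackCount = new AtomicLong();

    // Initialized so the very first failure is always logged.
    private volatile long lastWarnNanos = System.nanoTime() - WARN_INTERVAL_NANOS;

    public void onFallback(Throwable cause)
    {
        long total = fallbackCount.incrementAndGet();
        long now = System.nanoTime();
        if (now - lastWarnNanos >= WARN_INTERVAL_NANOS)
        {
            lastWarnNanos = now;
            LOG.warning("Hardware-accelerated compression failed, using software fallback"
                        + " (failures so far: " + total + "): " + cause);
        }
    }

    public long fallbackCount()
    {
        return fallbackCount.get();
    }
}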
> >>
> >> On Wed, Dec 17, 2025 at 1:27 AM Josh McKenzie <[email protected]> wrote:
> >> >
> >> > What if we went the same route we do for disk failure: have a sane default we collectively believe to be the "majority case", but also have a configuration knob in cassandra.yaml to choose a hard stop on failure if so inclined? Complexity is low, maintenance burden should be low.
> >> >
> >> > These discussions end up spinning trying to find the One Right Answer when there isn't one. You're right Stefan. And so is Scott. It depends. :)
> >> >
> >> > On Tue, Dec 16, 2025, at 2:11 PM, Štefan Miklošovič wrote:
> >> >
> >> > In the scenarios Scott described it does make sense to fall back, but I am not sure about that when there is production traffic happening and we rely on hardware de/compression and _that_ fails silently.
> >> >
> >> > It is one thing to not fail catastrophically when upgrading or changing nodes, or when machines with that hardware are not present, etc., and it is something different to actually expect that data will be de/compressed with some acceleration while we just swallow the exception and de/compress in software.
> >> >
> >> > My perception here is that Cassandra embraces the philosophy that if hardware fails, let it fail and change the hardware. Heck, we have a whole class of logic around what should happen if there is some kind of a disk failure.
> >> >
> >> > While here we are going to act as if, when the very hardware I am supposed to de/compress with fails to do so, I just fall back to software and ... that's it? Should there not be some kind of a mechanism to also die when something goes wrong here?
> >> >
> >> > On Tue, Dec 16, 2025 at 7:10 PM Josh McKenzie <[email protected]> wrote:
> >> > >
> >> > > As a user, I'd rather have a WARN in my logs than to be unable to start the database without changing cluster-wide configuration like schema / compaction parameters.
> >> > >
> >> > > Strong +1 here.
> >> > >
> >> > > While on the one hand we expect homogeneous hardware environments for clusters, to Scott's point that's not always going to hold true in containerized and cloud-based environments. Definitely think we need to let the operators know, but graceful degradation of the database (in a step-wise, plateau-based fashion like this, not a death-spiral scenario, to be clear) is much preferred IMO.
> >> > >
> >> > > On Tue, Dec 16, 2025, at 10:32 AM, Štefan Miklošovič wrote:
> >> > >
> >> > > Okay, I guess that is a good compromise to make here. So warnings in the logs + metrics? I think metrics would be cool to have so we might chart how often it happens, etc.
> >> > >
> >> > > On Tue, Dec 16, 2025 at 4:27 PM C. Scott Andreas <[email protected]> wrote:
> >> > > >
> >> > > > One example where lack of a fallback would be problematic is:
> >> > > >
> >> > > > – User provisions AWS metal-class instances that expose hardware QAT and adopts.
> >> > > > – User needs to expand the cluster or replace failed hardware.
> >> > > > – Insufficient hardware-QAT-capable machines are available from AWS.
> >> > > > – Cassandra is unable to start on replacement/expanded machines due to lack of fallback.
> >> > > >
> >> > > > There are a handful of cases where the database performs similar fallbacks today, such as attempting mlockall on startup for improved memory locality and to avoid allocation stalls.
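Sketching the cassandra.yaml knob Josh describes above, purely hypothetically; neither the option nor the enum exists today, it just mirrors the shape of disk_failure_policy:

// Purely hypothetical sketch of a cassandra.yaml-style knob for accelerator
// failures, mirroring the shape of disk_failure_policy: degrade gracefully by
// default, but let operators opt into a hard stop. Nothing here exists yet.
public final class AcceleratorFailurePolicy
{
    public enum Policy { WARN_AND_FALLBACK, DIE }

    private final Policy policy;

    public AcceleratorFailurePolicy(Policy policy) // e.g. parsed from cassandra.yaml
    {
        this.policy = policy;
    }

    // Called by the compression path when the accelerated compressor fails.
    public void onFailure(RuntimeException cause)
    {
        if (policy == Policy.DIE)
            throw cause; // operators who consider acceleration mandatory stop hard
        // WARN_AND_FALLBACK: log + bump metrics, then the caller uses the software compressor
    }
}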
> >> > > >
> >> > > > As a user, I'd rather have a WARN in my logs than to be unable to start the database without changing cluster-wide configuration like schema / compaction parameters.
> >> > > >
> >> > > > – Scott
> >> > > >
> >> > > > On Dec 16, 2025, at 5:18 AM, Štefan Miklošovič <[email protected]> wrote:
> >> > > >
> >> > > > I am open to adding some kind of metrics when it falls back, to track if / how often it failed by hardware etc. Wondering what others think about fallbacking just like that. I feel like something is not transparent to a user who relies on hardware compression in the first place.
> >> > > >
> >> > > > On Tue, Dec 16, 2025 at 1:52 PM Štefan Miklošovič <[email protected]> wrote:
> >> > > >
> >> > > > My personal preference is to not do any fallbacking. The reason for that is that failures should be transparent, and if it is meant to fail, so be it.
> >> > > >
> >> > > > If we wrap it in a try-catch and fall back, then a user thinks that everything is just fine, right? There is no visibility into whether and how often it failed so that a user can act on it. By fallbacking, a user is kind of misled: they think that all is just fine, yet they cannot wrap their head around the fact that they bought hardware which says their compression will be accelerated while, looking at their dashboards, they every now and then see the same performance as if they were compressing in software.
> >> > > >
> >> > > > If they see that it is failing, then they can reach out to the vendor of such hardware and raise complaints and issues with it, so the vendor's engineers can look into why it failed and how to fix it. Instead of just wrapping it in one try-catch and acting like all is actually fine. A user bought hardware to compress with; I do not think they are interested in "best effort" here. If that hardware fails, or the software which is managing it is erroneous, then it should be either fixed or replaced.
> >> > > >
> >> > > > On Tue, Dec 16, 2025 at 2:29 AM Kokoori, Shylaja <[email protected]> wrote:
> >> > > > >
> >> > > > > Hi Stefan,
> >> > > > > Thank you very much for the feedback.
> >> > > > > You are correct, QAT is available on-die and not hot-plugged, and under normal circumstances we shouldn't encounter this exception. However, I wanted to add reverting to the base compressor to make it fault-tolerant.
> >> > > > >
> >> > > > > While the QAT software stack includes built-in retries and software fallbacks for scenarios when devices end up being busy etc., I didn't want operations to fail due to transient hardware issues when they otherwise would have succeeded. An example would be some unrecoverable error occurring during a compress/decompress operation, whether due to a hardware issue or caused by related software libraries; in that case the system can gracefully revert to the base compressor rather than failing the operation entirely.
> >> > > > >
> >> > > > > I am open to other suggestions.
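For reference, the per-chunk fallback under discussion looks roughly like the sketch below, reconstructed from the descriptions above and the linked gist; the names and the byte[] signature are simplifications, not the actual PR:

import java.io.IOException;

// Rough reconstruction (invented names) of the per-operation fallback described
// above: try the accelerated plugin compressor for each chunk and silently
// revert to the base compressor when it throws.
public final class FallbackChunkCompressor
{
    public interface ChunkCompressor
    {
        byte[] compress(byte[] chunk) throws IOException;
    }

    private final ChunkCompressor plugin;   // e.g. the QAT-backed compressor
    private final ChunkCompressor fallback; // the existing software compressor

    public FallbackChunkCompressor(ChunkCompressor plugin, ChunkCompressor fallback)
    {
        this.plugin = plugin;
        this.fallback = fallback;
    }

    public byte[] compress(byte[] chunk) throws IOException
    {
        try
        {
            return plugin.compress(chunk);
        }
        catch (IOException e)
        {
            // Runs on every chunk; without logging or metrics the operator never
            // sees that acceleration is no longer being used.
            return fallback.compress(chunk);
        }
    }
}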
> >> > > > > Thanks,
> >> > > > > Shylaja
> >> > > > > ________________________________
> >> > > > > From: Štefan Miklošovič <[email protected]>
> >> > > > > Sent: Monday, December 15, 2025 2:50 PM
> >> > > > > To: [email protected] <[email protected]>
> >> > > > > Subject: Re: [VOTE] CEP-49: Hardware-accelerated compression
> >> > > > >
> >> > > > > Hi Shylaja,
> >> > > > >
> >> > > > > I am going through the CEP so I can make a decision when voting, and I want to clarify a few things.
> >> > > > >
> >> > > > > You say there:
> >> > > > >
> >> > > > > Both the default compressor instance and a plugin compressor instance (obtained from the provider), will be maintained by Cassandra. For subsequent read/write operations, the plugin compressor will be used. However, if the plugin version encounters an error, the default compressor will handle the operation.
> >> > > > >
> >> > > > > Why are we doing this kind of "fallback"? Under what circumstances does "the plugin version encounter an error"? Why would it? It might be understandable to do it like that if that compression accelerator were some "plug and play" device, or if we could just remove it from a running machine. But this does not seem to be the case? The QAT you are mentioning is baked into the CPU, right? It is not like we would decide to just turn it off suddenly at runtime so that the database would need to deal with it.
> >> > > > >
> >> > > > > The reason I am asking is that I just briefly went over the PR, and the way it works there is that if plugin de/compression is not possible (it throws an IOException), then it will default to a software solution. This is done for every single de/compression of a chunk.
> >> > > > >
> >> > > > > Is this design an absolute must?
> >> > > > >
> >> > > > > On Mon, Dec 15, 2025 at 8:14 PM Josh McKenzie <[email protected]> wrote:
> >> > > > > >
> >> > > > > > Yes but it's in reply to the discussion thread and so it threads that way in clients
> >> > > > > >
> >> > > > > > Apparently not in fastmail's client because it shows up as its own thread for me. /sigh
> >> > > > > >
> >> > > > > > Hence the confusion. Makes sense now.
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025, at 1:18 PM, Kokoori, Shylaja wrote:
> >> > > > > >
> >> > > > > > Thank you for your feedback, Patrick & Brandon. I have created a new email thread like you suggested. Hopefully, this works.
> >> > > > > >
> >> > > > > > -Shylaja
> >> > > > > >
> >> > > > > > ________________________________
> >> > > > > > From: Patrick McFadin <[email protected]>
> >> > > > > > Sent: Monday, December 15, 2025 9:26 AM
> >> > > > > > To: [email protected] <[email protected]>
> >> > > > > > Subject: Re: [VOTE] CEP-49: Hardware-accelerated compression
> >> > > > > >
> >> > > > > > That was my point. It's a [DISCUSS] thread.
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025 at 9:22 AM Brandon Williams <[email protected]> wrote:
> >> > > > > >
> >> > > > > > On Mon, Dec 15, 2025 at 11:13 AM Josh McKenzie <[email protected]> wrote:
> >> > > > > > >
> >> > > > > > > Can you put this into a [VOTE] thread?
> >> > > > > > >
> >> > > > > > > I'm confused - isn't the subject of this email [VOTE]?
> >> > > > > >
> >> > > > > > Yes but it's in reply to the discussion thread and so it threads that way in clients, making it easy to overlook.
> >> > > > > >
> >> > > > > > Kind Regards,
> >> > > > > > Brandon
