Thank you, Scott and everyone, for all the feedback you have given me. My answers to Scott's questions are below.

– The CEP includes a section labeled "Setup used to generate results above", but I don't see a summary of results, graphs, or benchmark details. Could you point me to them or upload to the Confluence wiki?

I have updated the CEP document with the results and the workloads we used. Please let me know if you need additional information.

– How could the Apache Cassandra project exercise this feature with acceleration in our CI? I doubt our current hardware supports this feature, which would make it untestable as part of the project's release process if so.

There are a few options, such as Intel Developer Cloud, which provides virtual access to Intel's latest hardware, including QAT-enabled processors, so developers can test and optimize applications without needing physical hardware.

– Are there cloud providers that support it; and if so who and which instance types? I see that AWS offers QAT in the r7i instance family – but only for "metal" instances which are much more expensive and uncommon to use: https://aws.amazon.com/ec2/instance-types/r7i/.

Currently, US-based cloud providers such as GCP and AWS do not offer QAT-enabled VMs; as you mentioned, AWS supports QAT only on metal instances in the r7i family. However, cloud service providers in other geographies, such as Alibaba Cloud, do offer QAT-enabled VMs, which is a more flexible way to use QAT in the cloud.

– Outside of cloud providers, how does one determine if a given processor supports QAT (i.e., a specific list of chips)? I've tried browsing the QAT website but the PDFs don't seem to contain this information, and I don't see QAT as a filterable attribute on ark.intel.com.

To determine whether a given processor supports QAT, use the lspci command, as documented in Intel's QAT Library Requirements <https://intel.github.io/quickassist/qatlib/requirements.html#supported-devices>.
The command is:

    echo `(lspci -d 8086:4940 && lspci -d 8086:4941 && lspci -d 8086:4942 && lspci -d 8086:4943 && lspci -d 8086:4944 && lspci -d 8086:4945 && lspci -d 8086:4946 && lspci -d 8086:4947) | wc -l` supported devices found.

This will help identify supported devices on the system.

– Can you describe Intel's commitment to long-term maintenance/support of this feature?

Valid point. Intel is dedicated to the long-term evolution and maintenance of its technologies, such as QAT, and to maintaining and testing the associated software, whether in-tree or out-of-tree.

– I'd feel more comfortable with this proposal in-tree if it were to take advantage of a standard and widely-available instruction set or extension (like AES-NI, AVX, or NEON). But if it requires specific chip models or a dedicated PCIe card, metal-only cloud instances, and a custom Linux driver to enable, it would seem most appropriate for the feature to live out of tree but possible for users of QAT to enable via adding the jar to their classpath.

Regarding integration, the QAT kernel drivers are in-tree: they are part of the standard Linux kernel distribution and are maintained and updated alongside the kernel. User-space libraries for QAT are also available through standard package repositories, making them easy to install and manage.

Based on all the feedback, it looks like the general consensus is to make the capability pluggable for now. We will modify the code and come back with the changes in a few weeks.

Thank you,
Shylaja

From: C. Scott Andreas <sc...@paradoxica.net>
Sent: Monday, June 09, 2025 8:51 PM
To: dev@cassandra.apache.org
Cc: dev@cassandra.apache.org
Subject: Re: [DISCUSS] CEP-49: Hardware-accelerated compression

Shylaja, thanks for your proposal and messages on this thread. I share Jeff's questions and have a couple more.
I appreciate that there is a software fallback, but want to ensure that members of the development community are able to test this feature; and to understand Intel's long-term commitment to evolving and maintaining a vendor- and model-specific codec family in Apache Cassandra.

– The CEP includes a section labeled "Setup used to generate results above", but I don't see a summary of results, graphs, or benchmark details. Could you point me to them or upload to the Confluence wiki?
– How could the Apache Cassandra project exercise this feature with acceleration in our CI? I doubt our current hardware supports this feature, which would make it untestable as part of the project's release process if so.
– Are there cloud providers that support it; and if so who and which instance types? I see that AWS offers QAT in the r7i instance family – but only for "metal" instances which are much more expensive and uncommon to use: https://aws.amazon.com/ec2/instance-types/r7i/.
– Outside of cloud providers, how does one determine if a given processor supports QAT (i.e., a specific list of chips)? I've tried browsing the QAT website but the PDFs don't seem to contain this information, and I don't see QAT as a filterable attribute on ark.intel.com.
– Is there a member of the Cassandra community (or Intel directly) who commits to run the database in a production capacity using one or more QAT codecs?
– Can you describe Intel's commitment to long-term maintenance/support of this feature?

I'd feel more comfortable with this proposal in-tree if it were to take advantage of a standard and widely-available instruction set or extension (like AES-NI, AVX, or NEON). But if it requires specific chip models or a dedicated PCIe card, metal-only cloud instances, and a custom Linux driver to enable, it would seem most appropriate for the feature to live out of tree but possible for users of QAT to enable via adding the jar to their classpath.

– Scott

On Jun 9, 2025, at 3:13 PM, "Kokoori, Shylaja" <shylaja.koko...@intel.com> wrote:

Hi Jeff,

Thank you very much for your response. I understand your concern. Here are some details; I hope they help.

Customers have access to Intel Product SKUs with or without Intel® QAT. Our SW architecture abstracts the use of the Intel® QAT hardware such that if it is not present on a given Intel CPU, or the customer is using a competitive Intel Architecture CPU, the QATzip library shown below can call the SW compression library instead of invoking the Intel® QAT accelerator. Therefore, customers can use the same SW architecture whether or not the HW accelerator is present.

<image001.png>

See the following technical paper on using Intel® QAT and the QATzip library. It discusses why QATzip exists, its applications, and its value for developers, and reviews the performance gains.

Intel® QuickAssist Technology - Deliver Compression Efficiencies in the Cloud with Intel® QAT and QATzip Solution Brief <https://www.intel.com/content/www/us/en/content-details/767068/intel-quickassist-technology-deliver-compression-efficiencies-in-the-cloud-with-intel-qat-and-qatzip-solution-brief.html>

Thank you,
Shylaja

From: Jeff Jirsa <jji...@gmail.com>
Sent: Thursday, June 05, 2025 12:13 PM
To: dev@cassandra.apache.org
Cc: dev@cassandra.apache.org
Subject: Re: [DISCUSS] CEP-49: Hardware-accelerated compression

One perpetual challenge with customizing a codebase for dedicated hardware is ongoing support / testing / maintenance, followed by ensuring vendor-agnostic / neutral access.

QAT is one of those things that's great for a set of people paying for it, but I don't know any current contributors who have access - does anyone not at Intel actually have access to QAT, or does Intel expect to use this in production yourselves (are you running a cluster where this fix improves your life, or are you proposing this as a way to benefit your customers, or both)?

On Jun 5, 2025, at 11:52 AM, Kokoori, Shylaja <shylaja.koko...@intel.com> wrote:

Hi everyone,

We would like to propose hardware-accelerated compression in Cassandra:

CEP-49: Hardware-accelerated compression <https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-49%3A+Hardware-accelerated+compression>

As load on Cassandra servers increases, the performance of compress/decompress operations starts to become a bottleneck. Our goal with this CEP is to offload these operations to dedicated hardware accelerators and free up the CPUs. We'd really appreciate your feedback on this proposal.

Thank you,
Shylaja
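
For anyone who wants to script the device check from the most recent reply in this thread, here is a minimal sketch. It is a variant of the quoted command, not the form Intel documents: instead of calling `lspci -d` once per ID, it filters `lspci -n` output in one pass. The device IDs are copied from the quoted command and may not be the complete or current list, and `count_qat_devices` is a hypothetical helper name introduced only for this illustration.

```shell
#!/bin/sh
# Sketch only: counts PCI devices whose vendor:device ID appears in the list
# taken from the lspci command quoted in this thread. Check Intel's QAT
# Library Requirements page for the authoritative list of supported devices.
QAT_IDS="8086:4940 8086:4941 8086:4942 8086:4943 8086:4944 8086:4945 8086:4946 8086:4947"

# count_qat_devices (hypothetical helper): reads `lspci -n` style lines on
# stdin and prints how many of them contain one of the IDs above.
count_qat_devices() {
    # Turn the space-separated ID list into an alternation for grep -E.
    pattern=$(printf '%s' "$QAT_IDS" | tr ' ' '|')
    # grep -c prints 0 and exits non-zero when nothing matches; "|| true"
    # keeps the function's exit status at 0 in that case.
    grep -E -c " ($pattern)( |\$)" || true
}

# Query the local machine only if lspci (from pciutils) is installed.
if command -v lspci >/dev/null 2>&1; then
    echo "$(lspci -n | count_qat_devices) supported devices found."
else
    echo "lspci not found; install pciutils to run this check." >&2
fi
```

Because the helper reads from stdin, it can also be run against saved `lspci -n` output from another host, for example `count_qat_devices < lspci-output.txt`.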