>There was already a proposal to use armv8-a+crypto, which is more widely available and works on smaller inputs.
Our implementation with SVE2 is able to gain better performance than
https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com

I've benchmarked our SVE2 implementation against armv8-a+crypto, and the results show substantial improvements.

Buffer size (bytes)   armv8+crypto (in ms)   armv9+SVE2 (in ms)   Improvement
512                   28.491                 19.37                32.0% faster
1024                  47.145                 29.962               36.5% faster
2048                  86.717                 52.841               39.1% faster
4096                  165.205                105.626              36.1% faster
8192                  318.103                226.437              28.8% faster

These buffer sizes are particularly relevant for PostgreSQL workloads:

* 8KB: Default page size (28.8% faster checksumming)
* 4KB: Alternative page size configuration (36.1% faster)
* 512B-2KB: Typical WAL record sizes (32-39% faster)
* 2KB: TOAST chunk size (39% faster)

While armv8-a+crypto has broader current deployment, SVE2 is already available in production cloud infrastructure: AWS Graviton 4, Ampere AmpereOne, and NVIDIA Grace (all released 2023). As ARMv9 adoption continues, these gains become increasingly relevant.

Rather than choosing one approach over the other, perhaps we could implement both with runtime CPU detection? Since we already perform runtime detection for crypto extension availability, adding an additional check for SVE2 introduces no performance degradation on systems without SVE2, while providing significant performance gains (28-39%) on systems that do support it. This would provide optimal performance on capable hardware while maintaining broad compatibility.

Please let me know your thoughts.

    static pg_crc32c (*pg_comp_crc32c_armv8)(pg_crc32c crc, const void *data, size_t len);

    void
    pg_comp_crc32c_choose_armv8(void)
    {
        if (pg_cpu_has_sve2())
            pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_sve2;
        else if (pg_cpu_has_crypto())
            pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_crypto;
        else
            pg_comp_crc32c_armv8 = pg_comp_crc32c_sb8; // scalar fallback
    }

Thanks,
Susmitha Devanga.
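For reference, the runtime detection proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from either patch: the names cpu_has_sve2, cpu_has_crc_pmull, and crc32c_* are hypothetical, the SVE2 and crypto entries are placeholders that simply call the scalar routine, and the probes assume Linux on aarch64, where getauxval() exposes the HWCAP bits. On any other platform the probes report "not available" and the scalar path is chosen.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>
#include <asm/hwcap.h>
#endif

typedef uint32_t crc32c_t;		/* stand-in for pg_crc32c */
typedef crc32c_t (*crc32c_fn) (crc32c_t crc, const void *data, size_t len);

/* Portable bitwise CRC-32C (reflected polynomial 0x82F63B78). */
static crc32c_t
crc32c_scalar(crc32c_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--)
	{
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & (0u - (crc & 1u)));
	}
	return crc;
}

/* Placeholders (assumption): real versions would use SVE2 / CRC+PMULL. */
static crc32c_t
crc32c_sve2(crc32c_t crc, const void *data, size_t len)
{
	return crc32c_scalar(crc, data, len);
}

static crc32c_t
crc32c_crypto(crc32c_t crc, const void *data, size_t len)
{
	return crc32c_scalar(crc, data, len);
}

/* Hypothetical probe: SVE2 is reported through AT_HWCAP2 on Linux/aarch64. */
static int
cpu_has_sve2(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP2_SVE2)
	return (getauxval(AT_HWCAP2) & HWCAP2_SVE2) != 0;
#else
	return 0;
#endif
}

/* Hypothetical probe: CRC32 and PMULL are reported through AT_HWCAP. */
static int
cpu_has_crc_pmull(void)
{
#if defined(__linux__) && defined(__aarch64__) && \
	defined(HWCAP_CRC32) && defined(HWCAP_PMULL)
	unsigned long caps = getauxval(AT_HWCAP);

	return (caps & HWCAP_CRC32) != 0 && (caps & HWCAP_PMULL) != 0;
#else
	return 0;
#endif
}

static crc32c_t crc32c_choose(crc32c_t crc, const void *data, size_t len);

/* Starts at the chooser; the first call replaces it with the final choice. */
static crc32c_fn crc32c_impl = crc32c_choose;

static crc32c_t
crc32c_choose(crc32c_t crc, const void *data, size_t len)
{
	if (cpu_has_sve2())
		crc32c_impl = crc32c_sve2;
	else if (cpu_has_crc_pmull())
		crc32c_impl = crc32c_crypto;
	else
		crc32c_impl = crc32c_scalar;

	return crc32c_impl(crc, data, len);
}
```

With this layout the probes run once, on the first call through crc32c_impl; every later call is a single indirect call, so machines without SVE2 pay nothing beyond what the existing crypto check already costs. A caller would start from 0xFFFFFFFF and XOR the result with 0xFFFFFFFF at the end, as with the usual CRC-32C init/finish steps.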
________________________________
From: John Naylor <[email protected]>
Sent: Friday, December 19, 2025 08:27
To: Susmitha, Devanga <[email protected]>
Cc: pgsql-hackers <[email protected]>; Hajela, Ragesh <[email protected]>; Bhattacharya, Chiranmoy <[email protected]>
Subject: Re: [PATCH] CRC32C optimizations using SVE2 on ARM.

On Fri, Dec 19, 2025 at 4:20 AM [email protected] <[email protected]> wrote:
> For architecture-specific functions, we use
> pg_attribute_target("arch=armv9-a+sve2-aes")

There was already a proposal to use armv8-a+crypto, which is more widely available and works on smaller inputs. Perhaps you'd be interested in reviewing and testing?

https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com

> to ensure precise compilation control without modifying global CFLAGS,
> enabling a clean integration within PostgreSQL’s build system.

I think the reason we continue to use CFLAGS here was that clang support for target attributes on Arm is fairly recent. It's probably too soon to reconsider that.

--
John Naylor
Amazon Web Services
