> There was already a proposal to use armv8-a+crypto, which is more
> widely available and works on smaller inputs.

Our SVE2 implementation performs better than the armv8-a+crypto proposal at:
https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com

I've benchmarked our SVE2 implementation against armv8-a+crypto. The SVE2
version is 28-39% faster for buffer sizes from 512 bytes to 8KB:

Buffer size (bytes)   armv8+crypto (ms)   armv9+SVE2 (ms)   Improvement
512                   28.491              19.37             32.0% faster
1024                  47.145              29.962            36.5% faster
2048                  86.717              52.841            39.1% faster
4096                  165.205             105.626           36.1% faster
8192                  318.103             226.437           28.8% faster
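
For readers checking the numbers: the "Improvement" column appears to be the relative reduction in elapsed time versus the armv8+crypto baseline. A minimal sketch of that calculation follows; the function name improvement_pct is hypothetical and not part of any patch.

```c
#include <assert.h>

/*
 * Hypothetical helper: percentage reduction in elapsed time relative to
 * the baseline, i.e. (baseline - candidate) / baseline * 100.
 */
static double
improvement_pct(double baseline_ms, double candidate_ms)
{
	assert(baseline_ms > 0.0);
	return (baseline_ms - candidate_ms) / baseline_ms * 100.0;
}
```

For the 512-byte row, improvement_pct(28.491, 19.37) gives the time reduction reported in the table. Note that this is a reduction in run time, not an increase in throughput.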

These buffer sizes are particularly relevant for PostgreSQL workloads:

  *   8KB: Default page size (28.8% faster checksumming)
  *   4KB: Alternative page size configuration (36.1% faster)
  *   512B-2KB: Typical WAL record sizes (32-39% faster)
  *   2KB: TOAST chunk size (39% faster)


While armv8-a+crypto is more widely deployed today, SVE2 is already available
in production cloud infrastructure: AWS Graviton 4, Ampere AmpereOne, and
NVIDIA Grace (all released in 2023). As ARMv9 adoption continues, these gains
become increasingly relevant.

Rather than choosing one approach over the other, perhaps we could implement
both with runtime CPU detection? Since we already perform runtime detection
for crypto extension availability, an additional check for SVE2 should cause
no performance degradation on systems without SVE2, while providing
significant gains (28-39%) on systems that support it. This would give the
best performance on capable hardware while maintaining broad compatibility.
Please let me know your thoughts.


static pg_crc32c (*pg_comp_crc32c_armv8) (pg_crc32c crc, const void *data, size_t len);

/* Pick the fastest CRC32C implementation that the running CPU supports. */
void
pg_comp_crc32c_choose_armv8(void)
{
    if (pg_cpu_has_sve2())
        pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_sve2;
    else if (pg_cpu_has_crypto())
        pg_comp_crc32c_armv8 = pg_comp_crc32c_armv8_crypto;
    else
        pg_comp_crc32c_armv8 = pg_comp_crc32c_sb8;    /* scalar fallback */
}
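
As a rough illustration of what the two feature checks above might look like on Linux, here is a minimal sketch that reads the kernel's hwcap bits with getauxval(). The helper names are taken from the snippet above. The choice of bits (HWCAP2_SVE2 for SVE2, HWCAP_CRC32 plus HWCAP_PMULL for the crypto path) is an assumption; a real patch would test whichever extensions its code actually uses, and other operating systems would need a different mechanism.

```c
#include <stdbool.h>

#if defined(__linux__) && defined(__aarch64__)
#include <sys/auxv.h>		/* getauxval(), AT_HWCAP, AT_HWCAP2 */
#include <asm/hwcap.h>		/* HWCAP_* and HWCAP2_* bit definitions */
#endif

/* Sketch: true if the kernel reports SVE2 support (assumed bit: HWCAP2_SVE2). */
static bool
pg_cpu_has_sve2(void)
{
#if defined(__linux__) && defined(__aarch64__) && defined(HWCAP2_SVE2)
	return (getauxval(AT_HWCAP2) & HWCAP2_SVE2) != 0;
#else
	/* Not arm64 Linux, or headers too old to know about SVE2. */
	return false;
#endif
}

/* Sketch: true if the CRC32 and PMULL instructions are both reported. */
static bool
pg_cpu_has_crypto(void)
{
#if defined(__linux__) && defined(__aarch64__) && \
	defined(HWCAP_CRC32) && defined(HWCAP_PMULL)
	unsigned long hwcap = getauxval(AT_HWCAP);

	return (hwcap & HWCAP_CRC32) != 0 && (hwcap & HWCAP_PMULL) != 0;
#else
	return false;
#endif
}
```

Because the result cannot change while the process runs, the check only needs to happen once, when the function pointer is first chosen.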



Thanks,
Susmitha Devanga.



________________________________
From: John Naylor <[email protected]>
Sent: Friday, December 19, 2025 08:27
To: Susmitha, Devanga <[email protected]>
Cc: pgsql-hackers <[email protected]>; Hajela, Ragesh 
<[email protected]>; Bhattacharya, Chiranmoy 
<[email protected]>
Subject: Re: [PATCH] CRC32C optimizations using SVE2 on ARM.

On Fri, Dec 19, 2025 at 4:20 AM [email protected]
<[email protected]> wrote:
> For architecture-specific functions, we use 
> pg_attribute_target("arch=armv9-a+sve2-aes")

There was already a proposal to use armv8-a+crypto, which is more
widely available and works on smaller inputs. Perhaps you'd be
interested in reviewing and testing?

https://www.postgresql.org/message-id/CANWCAZaKhE%2BRD5KKouUFoxx1EbUNrNhcduM1VQ%3DDkSDadNEFng%40mail.gmail.com

> to ensure precise compilation control without modifying global CFLAGS, 
> enabling a clean integration within PostgreSQL’s build system.

I think the reason we continue to use CFLAGS here was that clang
support for target attributes on Arm is fairly recent. It's probably
too soon to reconsider that.

--
John Naylor
Amazon Web Services
