On Wed, Apr 30, 2025 at 4:53 AM Salvatore Dipietro
<dipietro.salvat...@gmail.com> wrote:
> we would like to propose the removal of the Instruction
> Synchronization Barrier (isb) for aarch64 architectures. Based on our
> testing on Graviton instances (m7g.16xlarge), we can see on average
> over multiple iterations up to 12% better performance using PGBench
> select-only and up to 9% with Sysbench oltp_read_only workloads. On
> Graviton4 (m8g.24xlarge) results are up to 8% better using PGBench
> select-only and up to 6% with Sysbench oltp_read_only workloads.
> We have also tested it putting more pressure on the spin_delay
> function, enabling pg_stat_statements.track_planning with PGBench
> read-only [0] and, on average, the patch shows up to 27% better
> performance on m6g.16xlarge and up to 37% on m7g.16xlarge.

Hmm. This was added only 3 years ago, supposedly because it made
performance better:

commit a82a5eee314df52f3183cedc0ecbcac7369243b1
Author: Tom Lane <t...@sss.pgh.pa.us>
Date:   Wed Apr 6 18:57:57 2022 -0400

    Use ISB as a spin-delay instruction on ARM64.

    This seems beneficial on high-core-count machines, and not harmful
    on lesser hardware.  However, older ARM32 gear doesn't have this
    instruction, so restrict the patch to ARM64.

    Geoffrey Blake

    Discussion: https://postgr.es/m/78338f29-9d7f-4dc8-bd71-e9674ce71...@amazon.com

I think you should make some kind of argument about why the previous
conclusion was wrong, or why something's changed between then and now.

--
Robert Haas
EDB: http://www.enterprisedb.com