On 01/04/18 20:32, Andres Freund wrote:
> On 2018-03-06 02:44:35 +0800, Heikki Linnakangas wrote:
>> * I tested this on Linux, with gcc and clang, on an ARM64 virtual machine
>> that I had available (not an emulator, but a VM on a shared ARM64 server).
>
> Have you seen actual postgres performance benefits with the patch?

I just ran a small test with pg_waldump, similar to what Abhijit
Menon-Sen ran with the Slicing-by-8 and Intel SSE patches, when we added
those
(https://www.postgresql.org/message-id/20141119155811.GA32492%40toroid.org).
I ran pgbench, with scale factor 5, until it had generated about 1 GB of
WAL, and then I ran pg_waldump -z on that WAL. With slicing-by-8, it
took about 7 s, and with the special CPU instructions, about 5 s. 'perf'
showed that the CRC computation took about 30% of the CPU time before,
and about 12% after, which sounds about right. That's not as big a
speedup as we saw with the corresponding Intel SSE instructions back in
2014, but still quite worthwhile.
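
For anyone following along who wants to see what the CRC computation
above actually computes, here is a minimal bit-at-a-time sketch of
CRC-32C (the Castagnoli polynomial, which is what the WAL record
checksum uses). This is only an illustration: crc32c_bitwise is a
hypothetical helper name, not a PostgreSQL function, and it is neither
the slicing-by-8 code nor the CPU-instruction path that was measured.
Both of those should produce the same value, just much faster.

```c
#include <stddef.h>
#include <stdint.h>

/* Reflected (LSB-first) form of the CRC-32C (Castagnoli) polynomial. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/*
 * Hypothetical reference helper, not part of PostgreSQL: computes
 * CRC-32C over 'len' bytes at 'data', one bit at a time.
 */
static uint32_t
crc32c_bitwise(const void *data, size_t len)
{
    const unsigned char *p = data;
    uint32_t    crc = 0xFFFFFFFFu;      /* standard initial value */

    while (len--)
    {
        crc ^= *p++;
        for (int bit = 0; bit < 8; bit++)
        {
            /* mask is all ones if the low bit is set, else zero */
            uint32_t    mask = 0u - (crc & 1u);

            crc = (crc >> 1) ^ (CRC32C_POLY_REFLECTED & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;           /* final inversion */
}
```

Calling crc32c_bitwise("123456789", 9) should return 0xE3069283, the
standard CRC-32C check value, which makes it handy for sanity-checking
a faster implementation against.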
- Heikki