> On 03 Apr 2018, at 18:38, Heikki Linnakangas <hlinn...@iki.fi> wrote: > > On 03/04/18 19:09, Andres Freund wrote: >> Hi, >> On 2018-04-03 19:05:19 +0300, Heikki Linnakangas wrote: >>> On 01/04/18 20:32, Andres Freund wrote: >>>> On 2018-03-06 02:44:35 +0800, Heikki Linnakangas wrote: >>>>> * I tested this on Linux, with gcc and clang, on an ARM64 virtual machine >>>>> that I had available (not an emulator, but a VM on a shared ARM64 server). >>>> >>>> Have you seen actual postgres performance benefits with the patch? >>> >>> I just ran a small test with pg_waldump, similar to what Abhijit Menon-Sen >>> ran with the Slicing-by-8 and Intel SSE patches, when we added those >>> (https://www.postgresql.org/message-id/20141119155811.GA32492%40toroid.org). >>> I ran pgbench, with scale factor 5, until it had generated about 1 GB of >>> WAL, and then I ran pg_waldump -z on that WAL. With slicing-by-8, it took >>> about 7 s, and with the special CPU instructions, about 5 s. 'perf' showed >>> that the CRC computation took about 30% of the CPU time before, and about >>> 12% after, which sounds about right. That's not as big a speedup as we saw >>> with the corresponding Intel SSE instructions back in 2014, but still quite >>> worthwhile. >> Cool. Based on a skim the patch looks reasonable. > > Thanks. > > I bikeshedded with myself on the naming of things, and decided to use "ARMv8" > in the variable and file names, instead of ARM64 or ARMCE or ARM64CE. The CRC > instructions were introduced in ARM v8 (as an optional feature), it's not > really related to the 64-bitness, even though the 64-bit instruction set was > also introduced in ARM v8. Other than that, and some comment fixes, this is > the same as the previous patch version.
Noticed two minor documentation issues in a quick skim of the attached patch: The following line should say SSE and not SSSE (and the same typo is in src/include/pg_config.h.in and src/include/pg_config.h.win32). While not introduced in this patch, it’s adjacent to the patched codepath and on topic so may well be fixed while in there. AC_DEFINE(USE_SSE42_CRC32C_WITH_RUNTIME_CHECK, 1, [Define to 1 to use Intel SSSE 4.2 CRC instructions with a runtime check.]) The documentation in configure.in use “runtime” rather than "run time” in all lines except this one: +# uses the CRC instructions, compile both, and select at run time. cheers ./daniel