On Wed, Jul 31, 2024 at 01:52:54PM -0700, Andres Freund wrote:
> On 2024-07-30 22:12:18 -0500, Nathan Bossart wrote:
>> As I started on this, I remembered why I needed it. The file
>> pg_popcount_avx512_choose.c is compiled without the AVX-512 flags in order
>> to avoid inadvertently issuing any AVX-512 instructions before determining
>> we have support. If that's not a concern, we could still probably remove
>> the XSAVE check.
>
> I think it's a valid concern - but isn't that theoretically also an issue with
> xsave itself? I guess practically the compiler won't do that, because there's
> no practical reason to emit any instructions enabled by -mxsave (in contrast
> to e.g. -mavx, which does trigger gcc to emit different instructions even for
> basic math).

Yeah, this crossed my mind. It's certainly not the sturdiest of
assumptions...

> I think enabling options like these on a per-translation-unit basis isn't
> really a scalable approach. To actually be safe there could only be a single
> function in each TU and that function could only be called after a cpuid check
> performed in a separate TU. That a) ends up pretty unreadable b) requires
> functions to be implemented in .c files, which we really don't want for some
> of this.

Agreed.

> I think we'd be better off enabling architectural features on a per-function
> basis, roughly like this:
>
> [...]
>
> /* FIXME: Should be gated by configure check of -mavx512vpopcntdq -mavx512bw
>    support */
> pg_enable_target("avx512vpopcntdq,avx512bw")
> uint64_t
> pg_popcount_avx512(const char *buf, int bytes)
> ...

I remember wondering why the CRC-32C code wasn't already doing something
like this (old compiler versions? non-gcc-like compilers?), and I'm not
sure I ever discovered the reason, so out of an abundance of caution I used
the same approach for AVX-512. If we can convince ourselves that
__attribute__((target("..."))) is standard enough at this point, +1 for
moving to that.

--
nathan
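
For illustration only, here is a minimal sketch of the per-function approach
discussed above. It is not the PostgreSQL implementation. The function names
(popcount_portable, popcount_avx512, popcount_dispatch) and the
HAVE_TARGET_ATTR guard are hypothetical, the raw __attribute__((target(...)))
is written out in place of the proposed pg_enable_target macro, and
__builtin_cpu_supports() stands in for whatever cpuid/xgetbv checks the real
code would use. It assumes a gcc-like compiler on x86-64 and falls back to
plain C everywhere else.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical guard: only gcc-like compilers on x86-64 get the AVX-512 path. */
#if defined(__x86_64__) && (defined(__GNUC__) || defined(__clang__))
#define HAVE_TARGET_ATTR 1
#include <immintrin.h>
#endif

/* Plain C fallback: clear the lowest set bit of each byte until it is zero. */
static uint64_t
popcount_portable(const char *buf, size_t bytes)
{
    uint64_t    count = 0;

    for (size_t i = 0; i < bytes; i++)
    {
        unsigned char b = (unsigned char) buf[i];

        while (b)
        {
            b &= (unsigned char) (b - 1);
            count++;
        }
    }
    return count;
}

#ifdef HAVE_TARGET_ATTR
/*
 * Only this function is compiled with AVX-512 enabled, so the rest of the
 * translation unit (including the CPU check below) cannot pick up AVX-512
 * instructions by accident.
 */
static __attribute__((target("avx512vpopcntdq,avx512bw"))) uint64_t
popcount_avx512(const char *buf, size_t bytes)
{
    __m512i     accum = _mm512_setzero_si512();

    /* Full 64-byte chunks, loaded unaligned. */
    while (bytes >= sizeof(__m512i))
    {
        __m512i     val = _mm512_loadu_si512((const void *) buf);

        accum = _mm512_add_epi64(accum, _mm512_popcnt_epi64(val));
        buf += sizeof(__m512i);
        bytes -= sizeof(__m512i);
    }

    /* Remaining 1..63 bytes: masked load zero-fills the unused lanes. */
    if (bytes > 0)
    {
        __mmask64   mask = ~UINT64_C(0) >> (64 - bytes);
        __m512i     val = _mm512_maskz_loadu_epi8(mask, (const void *) buf);

        accum = _mm512_add_epi64(accum, _mm512_popcnt_epi64(val));
    }

    return (uint64_t) _mm512_reduce_add_epi64(accum);
}
#endif

/* Runtime dispatch: take the AVX-512 path only when the CPU reports support. */
static uint64_t
popcount_dispatch(const char *buf, size_t bytes)
{
#ifdef HAVE_TARGET_ATTR
    if (__builtin_cpu_supports("avx512vpopcntdq") &&
        __builtin_cpu_supports("avx512bw"))
        return popcount_avx512(buf, bytes);
#endif
    return popcount_portable(buf, bytes);
}
```

With this layout the check and the AVX-512 body can share one file, which is
the readability point raised in the quoted text. Whether the attribute is
available on every compiler PostgreSQL supports is exactly the open question
above, so a real version would presumably still need a configure-time test.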