Hi,

On 2024-07-30 22:12:18 -0500, Nathan Bossart wrote:
> On Tue, Jul 30, 2024 at 10:01:50PM -0500, Nathan Bossart wrote:
> > On Tue, Jul 30, 2024 at 07:43:08PM -0700, Andres Freund wrote:
> >> My point is that _xgetbv() is made available by -mavx512vpopcntdq
> >> -mavx512bw
> >> alone, without needing -mxsave:
> >
> > Oh, I see.  I'll work on a patch to remove that compiler check, then...
>
> As I started on this, I remembered why I needed it.  The file
> pg_popcount_avx512_choose.c is compiled without the AVX-512 flags in order
> to avoid inadvertently issuing any AVX-512 instructions before determining
> we have support.  If that's not a concern, we could still probably remove
> the XSAVE check.

I think it's a valid concern - but isn't that theoretically also an issue with
xsave itself? I guess practically the compiler won't do that, because there's
no practical reason to emit any instructions enabled by -mxsave (in contrast
to e.g. -mavx, which does trigger gcc to emit different instructions even for
basic math).

I think this is one of the few instances where msvc has the right approach -
if I use intrinsics to emit a specific instruction, the intrinsic should do
so, regardless of whether the compiler is allowed to do so on its own.

I think enabling options like these on a per-translation-unit basis isn't
really a scalable approach. To actually be safe there could only be a single
function in each TU and that function could only be called after a cpuid check
performed in a separate TU. That a) ends up pretty unreadable b) requires
functions to be implemented in .c files, which we really don't want for some
of this.

I think we'd be better off enabling architectural features on a per-function
basis, roughly like this:
https://godbolt.org/z/a4q9Gc6Ez

For posterity, in the unlikely case anybody reads this after godbolt shuts
down:

I'm thinking we'd have an attribute like this:

/*
 * GCC like compilers don't support intrinsics without those intrinsics explicitly
 * having been enabled. We can't just add these options more widely, as that allows the
 * compiler to emit such instructions more widely, even if we gate reaching the code using
 * intrinsics. So we just enable the relevant support for individual functions.
 *
 * In contrast to this, msvc allows use of intrinsics independent of what the compiler
 * otherwise is allowed to emit.
 */
#ifdef __GNUC__
#define pg_enable_target(foo) __attribute__ ((__target__ (foo)))
#else
#define pg_enable_target(foo)
#endif

and then use that selectively for some functions:

/* FIXME: Should be gated by configure check of -mavx512vpopcntdq -mavx512bw support */
pg_enable_target("avx512vpopcntdq,avx512bw")
uint64_t
pg_popcount_avx512(const char *buf, int bytes)
...

Greetings,

Andres Freund
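
For illustration, a minimal sketch of how the per-function attribute could be combined with a runtime check in the same translation unit. This is not the code behind the godbolt link. The names popcount_portable, popcount_avx512, popcount_dispatch and HAVE_TARGET_ATTR are hypothetical, and using the GCC/clang builtin __builtin_cpu_supports() in place of a hand-written cpuid/xgetbv check is an assumption made to keep the example short.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && defined(__x86_64__)
#include <immintrin.h>
#define pg_enable_target(foo) __attribute__ ((__target__ (foo)))
#define HAVE_TARGET_ATTR 1		/* hypothetical guard, not a real configure symbol */
#else
#define pg_enable_target(foo)
#endif

/* Portable fallback: counts set bits one byte at a time. */
static uint64_t
popcount_portable(const char *buf, int bytes)
{
	uint64_t	count = 0;

	for (int i = 0; i < bytes; i++)
	{
		unsigned char c = (unsigned char) buf[i];

		while (c)
		{
			count += c & 1;
			c >>= 1;
		}
	}
	return count;
}

#ifdef HAVE_TARGET_ATTR
/*
 * Only this function may contain AVX-512 instructions; the rest of the file
 * is compiled with the default target flags.
 */
pg_enable_target("avx512vpopcntdq,avx512bw")
static uint64_t
popcount_avx512(const char *buf, int bytes)
{
	__m512i		accum = _mm512_setzero_si512();

	/* Process full 64-byte blocks with the vector popcount instruction. */
	while (bytes >= 64)
	{
		__m512i		val = _mm512_loadu_si512((const void *) buf);

		accum = _mm512_add_epi64(accum, _mm512_popcnt_epi64(val));
		buf += 64;
		bytes -= 64;
	}

	/* Sum the eight 64-bit lanes, then handle the remaining tail bytes. */
	return (uint64_t) _mm512_reduce_add_epi64(accum) +
		popcount_portable(buf, bytes);
}
#endif

/* Runtime dispatch: only reach the AVX-512 function if the CPU reports support. */
uint64_t
popcount_dispatch(const char *buf, int bytes)
{
#ifdef HAVE_TARGET_ATTR
	if (__builtin_cpu_supports("avx512vpopcntdq") &&
		__builtin_cpu_supports("avx512bw"))
		return popcount_avx512(buf, bytes);
#endif
	return popcount_portable(buf, bytes);
}
```

Callers would only use popcount_dispatch(buf, bytes). Since the attribute is attached to popcount_avx512() alone, the check and the implementation can sit in one file without any AVX-512 compiler flags on the command line.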