On Sun, Aug 23, 2020 at 9:03 AM Uros Bizjak <ubiz...@gmail.com> wrote: > > On Sun, Aug 23, 2020 at 5:23 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > On Sun, Aug 23, 2020 at 10:18:28AM +0200, Uros Bizjak wrote: > > > On Sat, Aug 22, 2020 at 9:09 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > > > > > > > > Compile CPUID check with "-mno-sse -mfpmath=387" to disable SSE, > > > > > > AVX and > > > > > > AVX512 during CPUID check to avoid vector and mask register > > > > > > operations. > > > > > > > > > > -mgeneral-regs-only ? > > > > > > > > > > > > > Here is a patch to add target("general-regs-only") function > > > > attribute and use it for CPUID check. OK for master if there > > > > are no regressions? > > > > > > Please test it first, then ask for an approval. > > > > > > Please submit the general-regs-only part as an independent patch. (I > > > think this is the option linux should use for compilation). > > > > > > OTOH, wrapping CPUID check in a target attribute is a bad idea. We > > > should disable spills to mask registers for generic targets by either > > > raising costs of moves between general and mask registers and/or (as > > > suggested earlier) introducing TARGET_SPILL_TO_MASK_REGS tuning and > > > use it in secondary_memory_needed to prevent inter register unit > > > spills. > > > > > > So, compiling with -mavx512bw would NOT enable spills by default, > > > where compiling with -march=skylake-avx512 (or using equivalent > > > -mtune) would. This is IMO the least surprising approach, and would > > > avoid changing sources (as you now have to do for several testcases). > > > > We have 2 orthogonal issues here: > > > > 1. When mask register spill should be enabled. > > 2. CPUID check should be done with general registers only. > > > > As shown in GCC testcases, CPUID check may be done with arbitrary ISAs > > or -march/-mtune options enabled. We should either > > > > 1. Enable only general registers for CPUID check. Or > > 2. Issue an error for CPUID check if non-general registers are used. > > We should follow the same approach as with SSE2, where DI/SImode > spills to XMM registers were effectively disabled for a generic > target. So, unless the tuning target is also specified, spills to mask > registers should not be generated. It was my oversight to approve the > patch that enables spills for a generic target, and without the tuning > flag, the patch will be reverted. > > Now, we have -mgeneral-regs-only functionality in place, so if a > package wants to enable spills, the correct -mtune (ro -march that > implies -mtune) should be used, and it is expected that the detection > code is amended with general-regs-only pragmas. > > <footnote > > Speaking of pragmas, these should be added outside cpuid.h, like: > > #pragma GCC push_options > #pragma GCC target("general-regs-only") > > #include <cpuid.h> > > void cpuid_check () > ... > > #pragma GCC pop_options > > >footnote > > Nowadays, -march=native is mostly used outside generic target > compilations, so for relevant avx512 targets, we still generate spills > to mask regs. In future, we can review the setting of the tuning flag > for a generic target in the same way as with SSE2 inter-reg moves. >
Florian raised an issue that we need to limit <cpuid.h> to the basic ISAs. <cpuid.h> should be handled similarly to other intrinsic header files. That is <cpuid.h> should use #pragma GCC push_options #ifdef __x86_64__ #pragma GCC target("arch=x86-64") #else #pragma GCC target("arch=i386") ... #pragma GCC pop_options Here is a patch. OK for master? Thanks. -- H.J.