https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108191

--- Comment #7 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
(In reply to 罗勇刚(Yonggang Luo) from comment #6)
> Is the following command valid usage? It compiles properly

No, this is invalid.
> 
> ```
> 
> // compile args:  -fPIC -O2 -D__SSE3__=1 -D__SSSE3__=1 -D__SSE4_1__=1
> // -D__SSE4_2__=1 -D__SSE4A__=1 -D__POPCNT__=1 -D__XSAVE__=1 -D__CRC32__=1
> // -D__AVX__=1 -D__AVX2__=1 -D__FP_FAST_FMAF32=1 -D__FP_FAST_FMAF64=1
> // -D__FP_FAST_FMAF=1 -D__FP_FAST_FMAF32x=1 -D__AVX512F__=1 -D__AVX512CD__=1

Use only -fPIC -O2 here and none of the -D arguments: all of them are internal
GCC macros that users shouldn't redefine.
Besides, they aren't needed.

> #include <math.h>
> 
> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")
> 
> #if defined(_MSC_VER)
> #include <intrin.h>
> #else
> #include <x86intrin.h>
> #endif
> 
> #pragma GCC pop_options

You can do it, but for GCC it is completely useless; you can just
#include <x86intrin.h> without anything further.
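
A minimal sketch of that, assuming GCC 4.9 or later on x86: the header is
included with no pragmas around it, and the ISA extension is enabled only on
the function that needs it through a `target` attribute. The names `add8_avx`
and `add8` are hypothetical, and the runtime check with
`__builtin_cpu_supports` is an addition for safety, not something from the
report.

```c
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>   /* plain include, no target pragmas around it */

/* Hypothetical helper: only this function is compiled with AVX enabled. */
static __attribute__((target("avx")))
void add8_avx(const float *a, const float *b, float *c)
{
    __m256 av = _mm256_loadu_ps(a);          /* unaligned loads */
    __m256 bv = _mm256_loadu_ps(b);
    _mm256_storeu_ps(c, _mm256_add_ps(av, bv));
}
#endif

/* Hypothetical wrapper: c[i] = a[i] + b[i] for 8 floats. */
void add8(const float *a, const float *b, float *c)
{
#if defined(__x86_64__) || defined(__i386__)
    /* Take the AVX path only when the running CPU supports it. */
    if (__builtin_cpu_supports("avx")) {
        add8_avx(a, b, c);
        return;
    }
#endif
    /* Portable scalar fallback. */
    for (size_t i = 0; i < 8; i++)
        c[i] = a[i] + b[i];
}
```

The translation unit can be built with just -O2; no -mavx and no -D macros are
passed on the command line.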

> #pragma GCC push_options
> #pragma GCC target("avx512f")
> #pragma GCC target("avx512cd")
> #pragma GCC target("sse4a")

This is certainly fine, but avx512f isn't needed there, since it is implied by
avx512cd.
Though I don't see anything avx512cd- or sse4a-specific in the code.
> 
> void util_fadd_512(float *a, float *b, float *c) {
>     /* c = a + b */
>     __m512 av = _mm512_load_ps(a);
>     __m512 bv = _mm512_load_ps(b);
>     __m512 cv = _mm512_add_ps(av, bv);
>     _mm512_store_ps(c, cv);
> }
> static inline int
> util_iround(float f)
> {
>    __m128 m = _mm_set_ss(f);
>    return _mm_cvtss_i32(m);
> }
> 
> #pragma GCC pop_options
> 
> int util_iround_outside(int x, float y) {
>     return x + util_iround(y);
> }
> float util_fadd(float a, float b) {
>    return a + b;
> }
> ```

That said, code compiled with an avx512cd etc. target won't be inlined into
code compiled without it.
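
One way to keep the small helper inlinable, sketched under the assumption that
it only needs baseline SSE (always available on x86-64): define it outside any
target-specific region and restrict the AVX-512 target to the function that
really uses it. A per-function `target` attribute is used here in place of the
pragma region, and `_mm_cvtss_si32` replaces `_mm_cvtss_i32`; both choices are
assumptions for illustration, as is the non-x86-64 fallback.

```c
#if defined(__x86_64__)
#include <x86intrin.h>

/* Baseline SSE only, so no target attribute: it has the same target
 * options as an ordinary caller and can be inlined into it. */
static inline int util_iround(float f)
{
    /* Rounds according to the current MXCSR rounding mode. */
    return _mm_cvtss_si32(_mm_set_ss(f));
}

/* Only the AVX-512 code carries the target; callers built without
 * avx512f reach it through a normal call, not by inlining. */
__attribute__((target("avx512f")))
void util_fadd_512(const float *a, const float *b, float *c)
{
    /* c = a + b, using unaligned loads and stores */
    _mm512_storeu_ps(c, _mm512_add_ps(_mm512_loadu_ps(a),
                                      _mm512_loadu_ps(b)));
}
#else
/* Assumed fallback for other targets; rounds ties away from zero. */
static inline int util_iround(float f)
{
    return (int)(f < 0 ? f - 0.5f : f + 0.5f);
}
#endif

int util_iround_outside(int x, float y)
{
    return x + util_iround(y);
}
```

With this layout `util_iround_outside` and `util_iround` share the same target
options, so nothing prevents the compiler from inlining the helper.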
