On Tue, Apr 16, 2019 at 11:41 PM H.J. Lu <hjl.to...@gmail.com> wrote: > > On Tue, Apr 16, 2019 at 8:36 AM Martin Liška <mli...@suse.cz> wrote: > > > > On 4/16/19 4:50 PM, H.J. Lu wrote: > > > On Tue, Apr 16, 2019 at 1:28 AM Martin Liška <mli...@suse.cz> wrote: > > >> > > >> On 4/15/19 5:09 PM, H.J. Lu wrote: > > >>> On Mon, Apr 15, 2019 at 12:26 AM Martin Liška <mli...@suse.cz> wrote: > > >>>> > > >>>> On 4/12/19 4:12 PM, H.J. Lu wrote: > > >>>>> On Fri, Apr 12, 2019 at 4:41 AM Martin Liška <mli...@suse.cz> wrote: > > >>>>>> > > >>>>>> On 4/11/19 6:30 PM, H.J. Lu wrote: > > >>>>>>> On Thu, Apr 11, 2019 at 1:38 AM Martin Liška <mli...@suse.cz> wrote: > > >>>>>>>> > > >>>>>>>> Hi. > > >>>>>>>> > > >>>>>>>> The patch is adding missing AVX512 ISAs for target and target_clone > > >>>>>>>> attributes. > > >>>>>>>> > > >>>>>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression > > >>>>>>>> tests. > > >>>>>>>> > > >>>>>>>> Ready to be installed? > > >>>>>>>> Thanks, > > >>>>>>>> Martin > > >>>>>>>> > > >>>>>>>> gcc/ChangeLog: > > >>>>>>>> > > >>>>>>>> 2019-04-10 Martin Liska <mli...@suse.cz> > > >>>>>>>> > > >>>>>>>> PR target/89929 > > >>>>>>>> * config/i386/i386.c (get_builtin_code_for_version): Add > > >>>>>>>> support for missing AVX512 ISAs. > > >>>>>>>> > > >>>>>>>> gcc/testsuite/ChangeLog: > > >>>>>>>> > > >>>>>>>> 2019-04-10 Martin Liska <mli...@suse.cz> > > >>>>>>>> > > >>>>>>>> PR target/89929 > > >>>>>>>> * g++.target/i386/mv28.C: New test. > > >>>>>>>> * gcc.target/i386/mvc14.c: New test. > > >>>>>>>> --- > > >>>>>>>> gcc/config/i386/i386.c | 34 > > >>>>>>>> ++++++++++++++++++++++++++- > > >>>>>>>> gcc/testsuite/g++.target/i386/mv28.C | 30 +++++++++++++++++++++++ > > >>>>>>>> gcc/testsuite/gcc.target/i386/mvc14.c | 16 +++++++++++++ > > >>>>>>>> 3 files changed, 79 insertions(+), 1 deletion(-) > > >>>>>>>> create mode 100644 gcc/testsuite/g++.target/i386/mv28.C > > >>>>>>>> create mode 100644 gcc/testsuite/gcc.target/i386/mvc14.c > > >>>>>>>> > > >>>>>>>> > > >>>>>>> > > >>>>>> > > >>>>>> Hi. > > >>>>>> > > >>>>>>> Since any ISAs beyond AVX512F may be enabled individually, we > > >>>>>>> can't simply assign priorities to them. For GFNI, we can have > > >>>>>>> > > >>>>>>> 1. GFNI > > >>>>>>> 2. GFNI + AVX > > >>>>>>> 3. GFNI + AVX512F > > >>>>>>> 4. GFNI + AVX512F + AVX512VL > > >>>>>> > > >>>>>> Makes sense to me! I'm considering syntax extension where one would > > >>>>>> be > > >>>>>> able to come up with a priority. Eg. > > >>>>>> > > >>>>>> __attribute__((target("gfni,avx512bw", priority((3))))) > > >>>>>> > > >>>>>> Without that the ISA combinations are probably not comparable in a > > >>>>>> reasonable way. > > >>>>>> > > >>>>>>> > > >>>>>>> For this code, GFNI + AVX512BW is ignored: > > >>>>>>> > > >>>>>>> [hjl@gnu-cfl-1 pr89929]$ cat z.ii > > >>>>>>> __attribute__((target("gfni"))) > > >>>>>>> int foo(int i) { > > >>>>>>> return 1; > > >>>>>>> } > > >>>>>>> __attribute__((target("gfni,avx512bw"))) > > >>>>>>> int foo(int i) { > > >>>>>>> return 4; > > >>>>>>> } > > >>>>>>> __attribute__((target("default"))) > > >>>>>>> int foo(int i) { > > >>>>>>> return 3; > > >>>>>>> } > > >>>>>>> int bar () > > >>>>>>> { > > >>>>>>> return foo(2); > > >>>>>>> } > > >>>>>> > > >>>>>> For 'target' attribute it works for me: > > >>>>>> > > >>>>>> 1) $ cat z.c && ./xg++ -B. z.c -c > > >>>>>> #include <x86intrin.h> > > >>>>>> volatile __m512i x1, x2; > > >>>>>> volatile __mmask64 m64; > > >>>>>> > > >>>>>> __attribute__((target("gfni"))) > > >>>>>> int foo(int i) { > > >>>>>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >>>>>> return 1; > > >>>>>> } > > >>>>>> __attribute__((target("gfni,avx512bw"))) > > >>>>>> int foo(int i) { > > >>>>>> return 4; > > >>>>>> } > > >>>>>> __attribute__((target("default"))) > > >>>>>> int foo(int i) { > > >>>>>> return 3; > > >>>>>> } > > >>>>>> int bar () > > >>>>>> { > > >>>>>> return foo(2); > > >>>>>> } > > >>>>>> In file included from ./include/immintrin.h:117, > > >>>>>> from ./include/x86intrin.h:32, > > >>>>>> from z.c:1: > > >>>>>> z.c: In function ‘int foo(int)’: > > >>>>>> z.c:7:10: error: ‘__builtin_ia32_vgf2p8affineinvqb_v64qi’ needs isa > > >>>>>> option -m32 -mgfni -mavx512f > > >>>>>> 7 | x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >>>>>> | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > > >>>>>> z.c:7:10: note: the ABI for passing parameters with 64-byte > > >>>>>> alignment has changed in GCC 4.6 > > >>>>>> > > >>>>>> 2) $ cat z.c && ./xg++ -B. z.c -c > > >>>>>> #include <x86intrin.h> > > >>>>>> volatile __m512i x1, x2; > > >>>>>> volatile __mmask64 m64; > > >>>>>> > > >>>>>> __attribute__((target("gfni"))) > > >>>>>> int foo(int i) { > > >>>>>> return 1; > > >>>>>> } > > >>>>>> __attribute__((target("gfni,avx512bw"))) > > >>>>>> int foo(int i) { > > >>>>>> x1 = _mm512_gf2p8affineinv_epi64_epi8(x1, x2, 3); > > >>>>>> return 4; > > >>>>>> } > > >>>>>> __attribute__((target("default"))) > > >>>>>> int foo(int i) { > > >>>>>> return 3; > > >>>>>> } > > >>>>>> int bar () > > >>>>>> { > > >>>>>> return foo(2); > > >>>>>> } > > >>>>>> > > >>>>>> [OK] > > >>>>>> > > >>>>>> Btw. is it really correct the '-m32' in: 'needs isa option -m32' ? > > >>>>> > > >>>>> It does look odd. > > >>>> > > >>>> Then let me take a look at this. > > >>>> > > >>>>> > > >>>>>> Similar applies to target_clone attribute where we'll have to come > > >>>>>> up with > > >>>>>> a syntax that will allow multiple ISA to be combined. Something like: > > >>>>>> > > >>>>>> __attribute__((target_clones("gfni+avx512bw"))) > > >>>>>> > > >>>>>> ? Priorities can be maybe implemented by order? > > >>>>>> > > >>>>> > > >>>>> I am thinking -misa=processor which will enable ISAs for > > >>>>> processor. It differs from -march=. -misa= doesn't set > > >>>>> -mtune. > > >>>>> > > >>>> > > >>>> Well, isn't that what we currently support, e.g.: > > >>>> > > >>>> $ cat mvc11.c && gcc mvc11.c -c > > >>>> __attribute__((target_clones("arch=sandybridge", "arch=cascadelake", > > >>>> "default"))) int > > >>>> foo (void) > > >>>> { > > >>>> return 0; > > >>>> } > > >>>> > > >>>> int > > >>>> main () > > >>>> { > > >>>> foo (); > > >>>> } > > >>>> > > >>>> If so, we can provide a new warning that will tell that for AVX512* on > > >>>> should use 'arch=xyz' > > >>>> instead? > > >>>> > > >>> > > >>> 1. We don't have one option to enable AVX512F and AVX512CD, which are > > >>> common to > > >>> Skylake server and KNL. > > >> > > >> Then arch=axy is the solution for that. > > >> > > >>> 2. -march= also implies -mtune, which may not be wanted. > > >> > > >> Why is that not intended. Using target (or target_clone), I would expect > > >> the > > >> fastest possible version of the function for defined ISA (or ARCH). Then > > >> implying -mtune is reasonable for me. > > >> > > >>> 3. -march=icelake-client enables more than just AVX512XXX. > > >> > > >> Yes, but if you take a look at number of AVX512 ISA extensions (18), then > > >> it looks more easier to me to use arch=xyz as it leads to reasonable > > >> number > > >> of architectures. > > >> > > > > > > Are you proposing to add a new set of -march=XXX to support the family > > > of processors with AVX512? Adding -march=XXX is much more invasive > > > that adding -misa=XXX, which is just a combination of -mavx512XXXs. > > > Do we really want to do that? > > > > > > > No, I'm saying that I would reject all these: > > > > + {"avx512f", P_AVX512F}, > > + {"avx512vl", P_AVX512VL}, > > + {"avx512bw", P_AVX512BW}, > > + {"avx512dq", P_AVX512DQ}, > > + {"avx512cd", P_AVX512CD}, > > + {"avx512er", P_AVX512ER}, > > + {"avx512pf", P_AVX512PF}, > > + {"avx512vbmi", P_AVX512VBMI}, > > + {"avx512ifma", P_AVX512IFMA}, > > + {"avx5124vnniw", P_AVX5124VNNIW}, > > + {"avx5124fmaps", P_AVX5124FMAPS}, > > + {"avx512vpopcntdq", P_AVX512VPOPCNTDQ}, > > + {"avx512vbmi2", P_AVX512VBMI2}, > > + {"gfni", P_GFNI}, > > + {"vpclmulqdq", P_VPCLMULQDQ}, > > + {"avx512vnni", P_AVX512VNNI}, > > + {"avx512bitalg", P_AVX512BITALG} > > Works for me. > > > as arguments to target_clone attribute and instead I would recommend to > > users to use > > target_clones(arch=skylake,arch=skylake-avx512,arch=cannonlake,arch=icelake-client,arch=icelake-server, > > ..) > > Please check with bug reporter to verify that this solves his problem. > > > It's because we don't have a syntax for combination of ISAs for a > > target_clones attribute (e.g. avx512f+avx512cd+avx512er) > > and we don't have a syntax for priorities. > > > > Thanks. > > -- > H.J.
Any other comments, I'll merge this to trunk? -- BR, Hongtao