Attached is fixed patch. Sebastian
> -----Original Message----- > From: H.J. Lu [mailto:hjl.to...@gmail.com] > Sent: Tuesday, November 14, 2017 1:18 PM > To: Peryt, Sebastian <sebastian.pe...@intel.com> > Cc: Jakub Jelinek <ja...@redhat.com>; gcc-patches@gcc.gnu.org; Uros Bizjak > <ubiz...@gmail.com>; Kirill Yukhin <kirill.yuk...@gmail.com>; Lu, Hongjiu > <hongjiu...@intel.com> > Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper generation > for SKX > > On Tue, Nov 14, 2017 at 3:18 AM, Peryt, Sebastian <sebastian.pe...@intel.com> > wrote: > > I have updated tests and changelog according to Jakub's suggestions. > > Please find attached v2 of my patch. > > > > > > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/ > > > > PR target/82941 > > PR target/82942 > > * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition > > to return true on Xeon and not on Xeon Phi. > > (ix86_check_avx256_register): Changed to ... > > (ix86_check_avx_upper_register): ... this. Add extra check for > > VALID_AVX512F_REG_OR_XI_MODE. > > (ix86_avx_u128_mode_needed): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_check_avx256_stores): Changed to ... > > (ix86_check_avx_upper_stores): ... this. Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_avx_u128_mode_after): Changed > > avx_reg256_found to avx_upper_reg_found. Changed > > ix86_check_avx256_stores to ix86_check_avx_upper_stores. > > (ix86_avx_u128_mode_entry): Changed > > ix86_check_avx256_register to ix86_check_avx_upper_register. > > (ix86_avx_u128_mode_exit): Ditto. > > * config/i386/i386.h: (host_detect_local_cpu): New define. > > @@ -2497,7 +2497,7 @@ public: > /* opt_pass methods: */ > virtual bool gate (function *) > { > - return TARGET_AVX && !TARGET_AVX512F > + return TARGET_AVX && !TARGET_AVX512PF && !TARGET_AVX512ER > ^^^^^^^^^^^^^^^^^^^^^ Please > remove this. > > From glibc commit: > > commit 4cb334c4d6249686653137ec273d081371b3672d > Author: H.J. Lu <hjl.to...@gmail.com> > Date: Tue Apr 18 14:01:45 2017 -0700 > > x86: Use AVX2 memcpy/memset on Skylake server [BZ #21396] > > On Skylake server, AVX512 load/store instructions in memcpy/memset may > lead to lower CPU turbo frequency in certain situations. Use of AVX2 > in memcpy/memset has been observed to have improved overall performance > in many workloads due to the higher frequency. > > Since AVX512ER is unique to Xeon Phi, this patch sets Prefer_No_AVX512 > if AVX512ER isn't available so that AVX2 versions of memcpy/memset are > used on Skylake server. > > Only AVX512ER is really unique to Xeon Phi. > > && TARGET_VZEROUPPER && flag_expensive_optimizations > && !optimize_size; > } > > > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > > > > gcc/testsuite/ > > > > PR target/82941 > > PR target/82942 > > * gcc.target/i386/pr82941-1.c: New test. > > * gcc.target/i386/pr82941-2.c: New test. > > * gcc.target/i386/pr82942-1.c: New test. > > * gcc.target/i386/pr82942-2.c: New test. > > > > > > Thanks, > > Sebastian > > > >> -----Original Message----- > >> From: Jakub Jelinek [mailto:ja...@redhat.com] > >> Sent: Tuesday, November 14, 2017 10:51 AM > >> To: Peryt, Sebastian <sebastian.pe...@intel.com> > >> Cc: gcc-patches@gcc.gnu.org; Uros Bizjak <ubiz...@gmail.com>; Kirill > >> Yukhin <kirill.yuk...@gmail.com>; Lu, Hongjiu <hongjiu...@intel.com> > >> Subject: Re: [PATCH][i386] PR82941/PR82942 - Adding vzeroupper > >> generation for SKX > >> > >> On Tue, Nov 14, 2017 at 09:45:12AM +0000, Peryt, Sebastian wrote: > >> > Hi, > >> > > >> > This patch fixes PR82941 and PR82942 by adding vzeroupper > >> > generation on > >> SKX. > >> > Bootstrapped and tested. > >> > > >> > 14.11.2017 Sebastian Peryt <sebastian.pe...@intel.com> > >> > > >> > gcc/ > >> > >> In that case the ChangeLog entry should list the PRs, i.e. > >> PR target/82941 > >> PR target/82942 > >> > * config/i386/i386.c (pass_insert_vzeroupper): Modify gate condition > >> > to return true on Xeon and not on Xeon Phi. > >> > (ix86_check_avx256_register): Changed to ... > >> > (ix86_check_avx_upper_register): ... this. > >> > (ix86_check_avx_upper_register): Add extra check for > >> > VALID_AVX512F_REG_OR_XI_MODE. > >> > >> The way this is usually written is instead: > >> (ix86_check_avx256_register): Changed to ... > >> (ix86_check_avx_upper_register): ... this. Add extra check for > >> VALID_AVX512F_REG_OR_XI_MODE. > >> i.e. don't duplicate the function name, just continue mentioning further > changes. > >> > >> > (ix86_avx_u128_mode_needed): Changed > >> > ix86_check_avx256_register to ix86_check_avx_upper_register. > >> > (ix86_check_avx256_stores): Changed to ... > >> > (ix86_check_avx_upper_stores): ... this. > >> > (ix86_check_avx_upper_stores): Changed > >> > ix86_check_avx256_register to ix86_check_avx_upper_register. > >> > >> Likewise. > >> > >> > gcc/testsuite/ > >> > * gcc.target/i386/pr82941.c: New test. > >> > * gcc.target/i386/pr82942.c: New test. > >> > >> Shouldn't there be also a test that if using -march=knl and another > >> one if using - mavx512f -mavx512er that we don't emit any vzeroupper? > >> > >> Jakub > > > > -- > H.J.
0001-VZEROUPPER_v3.patch
Description: 0001-VZEROUPPER_v3.patch