Request to backport two -mvzeroupper related patches to 4.6 branch

2011-06-28 Thread Fang, Changpeng
Hi, Attached are two patches in gcc 4.7 trunk that we request to backport to 4.6 branch. They are both related to -mvzeroupper. 1) 0001-Save-the-initial-options-after-checking-vzeroupper.patch This patch fixes bug 47315, ICE: in extract_insn, at recog.c:2109 (unrecognizable insn) with -mvzeroupp

RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-28 Thread Fang, Changpeng
Hi, I re-attached the patch here. Can someone review it? We would like to commit to trunk as well as 4.6 branch. Thanks, Changpeng From: Fang, Changpeng Sent: Monday, June 27, 2011 5:42 PM To: Fang, Changpeng; Jan Hubicka Cc: Uros Bizjak; gcc

RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-27 Thread Fang, Changpeng
Is this patch OK to commit to trunk? Also I would like to backport this patch to gcc 4.6 branch. Do I have to send a separate request or use this one? Thanks, Changpeng From: Fang, Changpeng Sent: Friday, June 24, 2011 7:12 PM To: Jan Hubicka Cc

RE: Backport AVX256 load/store split patches to gcc 4.6 for performance boost on latest AMD/Intel hardware.

2011-06-27 Thread Fang, Changpeng
:03 PM To: 'H.J. Lu' Cc: 'gcc-patches@gcc.gnu.org'; 'hubi...@ucw.cz'; 'ubiz...@gmail.com'; 'hongjiu...@intel.com'; Fang, Changpeng Subject: RE: Backport AVX256 load/store split patches to gcc 4.6 for performance boost on latest AMD/Intel hardware.

RE: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-24 Thread Fang, Changpeng
[hubi...@ucw.cz] Sent: Thursday, June 23, 2011 6:20 PM To: Fang, Changpeng Cc: Uros Bizjak; gcc-patches@gcc.gnu.org; hubi...@ucw.cz; rguent...@suse.de Subject: Re: [PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer Hi, > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c

[PATCH, i386] Enable -mprefer-avx128 by default for Bulldozer

2011-06-23 Thread Fang, Changpeng
Hi, This patch enables 128-bit AVX instruction generation for the auto-vectorizer for AMD Bulldozer machines. This enablement gives an additional ~3% improvement on polyhedron 2005 and cpu2006 floating point programs. The patch passed bootstrapping on an x86_64-unknown-linux-gnu system with Bulld
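
For readers of this listing, a minimal sketch of the kind of loop the auto-vectorizer handles may help. It is not taken from the patch: the function name `scale_add` and its signature are hypothetical, and only the `-mprefer-avx128` flag name comes from the thread subject.

```c
#include <stddef.h>

/* Hypothetical example loop, not part of the patch under discussion.
   With AVX enabled the auto-vectorizer may use 256-bit vectors here;
   a preference for 128-bit AVX would ask it to use 128-bit vectors. */
void scale_add(float *restrict y, const float *restrict x, float a, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

Assuming a GCC of that era, comparing the assembly from `gcc -O3 -mavx -S` with and without `-mprefer-avx128` should show whether `ymm` or `xmm` registers are used in the loop body.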

RE: Backport AVX256 load/store split patches to gcc 4.6 for performance boost on latest AMD/Intel hardware.

2011-06-20 Thread Fang, Changpeng
'ubiz...@gmail.com'; 'hongjiu...@intel.com'; Fang, Changpeng Subject: RE: Backport AVX256 load/store split patches to gcc 4.6 for performance boost on latest AMD/Intel hardware. > On Mon, Jun 20, 2011 at 9:58 AM, wrote: > > Is it ok to backport patches, with Changel

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-20 Thread Fang, Changpeng
Thanks, Patch has been committed to trunk as revision 175230. Changpeng From: Uros Bizjak [ubiz...@gmail.com] Sent: Monday, June 20, 2011 1:38 PM To: Fang, Changpeng Cc: H.J. Lu; gcc-patches@gcc.gnu.org; hubi...@ucw.cz; rguent...@suse.de Subject: Re

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-20 Thread Fang, Changpeng
Hi, I modified the patch as H.J. suggested (patch attached). Is it OK to commit to trunk now? Thanks, Changpeng From: H.J. Lu [hjl.to...@gmail.com] Sent: Friday, June 17, 2011 5:44 PM To: Fang, Changpeng Cc: Richard Guenther; gcc-patches@gcc.gnu.org

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-17 Thread Fang, Changpeng
, Changpeng Cc: Richard Guenther; gcc-patches@gcc.gnu.org Subject: Re: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic On Fri, Jun 17, 2011 at 10:45 AM, Fang, Changpeng wrote: >>Why not just move AVX256_SPLIT_UNALIGNED_STORE >>and AVX256_SPLIT_

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-17 Thread Fang, Changpeng
>Why not just move AVX256_SPLIT_UNALIGNED_STORE >and AVX256_SPLIT_UNALIGNED_LOAD to ix86_tune_indices? I would like to keep the -m option so that at least we can explicitly turn off the splittings when regressions are found! By the way, I can add an index for store splitting, if you want. Thanks
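
The trade-off in this exchange (a per-CPU tuning default versus an explicit -m option that can override it) can be sketched roughly as below. This is an illustration only and not GCC's code: every identifier is hypothetical, and the default values in the tables are assumptions loosely based on the results reported elsewhere in the thread.

```c
#include <stdbool.h>

/* Hypothetical names throughout; this is not how i386.c is written. */
enum cpu { CPU_GENERIC, CPU_BDVER1, CPU_COUNT };

/* Tri-state for a command-line option: not given, forced off, forced on. */
enum flag_state { FLAG_UNSET, FLAG_OFF, FLAG_ON };

/* Assumed per-CPU tuning defaults for splitting 256-bit unaligned accesses. */
static const bool tune_split_load[CPU_COUNT]  = { true, false };
static const bool tune_split_store[CPU_COUNT] = { true, true };

/* An explicit option wins; otherwise fall back to the tuning default. */
static bool resolve(enum flag_state user, bool tune_default)
{
    return user == FLAG_UNSET ? tune_default : user == FLAG_ON;
}

bool split_unaligned_load(enum cpu c, enum flag_state user)
{
    return resolve(user, tune_split_load[c]);
}

bool split_unaligned_store(enum cpu c, enum flag_state user)
{
    return resolve(user, tune_split_store[c]);
}
```

Keeping the tri-state option alongside the table is what lets a user turn splitting off explicitly when a regression is found, which is the point made in the reply.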

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-16 Thread Fang, Changpeng
Hi, I modified the patch to disable unaligned load splitting only for bdver1 at this moment. Unaligned load splitting degrades CFP2006 by 1.3% in geomean for both -mtune=bdver1 and -mtune=generic on Bulldozer. However, we agree with H.J.'s suggestion to determine the optimal optimization sets fo

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-15 Thread Fang, Changpeng
>I have no problems on -mtune=Bulldozer. But I object -mtune=generic >change and did suggest a different approach for -mtune=generic. Something must have been broken for the unaligned load splitting in generic mode. While we lose 1.3% on CFP2006 in geomean by splitting unaligned loads for -mtu

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-14 Thread Fang, Changpeng
> > So, is it OK to commit this patch to trunk, and H.J's original patch + this > to 4.6 branch? >I have no problems on -mtune=Bulldozer. But I object -mtune=generic >change and did suggest a different approach for -mtune=generic. What's your suggested approach for -mtune=generic? My underst

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-14 Thread Fang, Changpeng
...@gmail.com] Sent: Tuesday, June 14, 2011 8:05 AM To: Jakub Jelinek; sergos@gmail.com Cc: Richard Guenther; Fang, Changpeng; gcc-patches@gcc.gnu.org Subject: Re: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic On Tue, Jun 14, 2011 at 3:16 AM, Jakub Jelinek

RE: [PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-14 Thread Fang, Changpeng
>It probably should go to the 4.6 branch as well. H.J. Lu's original patch that splits unaligned loads and stores was checked into gcc 4.7 trunk. We found that splitting unaligned stores is beneficial to bdver1, while splitting unaligned loads degrades cfp2006 by 1.3% in geomean on Bulldozer. As a result,

[PATCH, PR 49089] Don't split AVX256 unaligned loads by default on bdver1 and generic

2011-06-13 Thread Fang, Changpeng
Hi, The patch ( http://gcc.gnu.org/ml/gcc-patches/2011-02/txt00059.txt ) introduces splitting of avx256 unaligned loads. However, we found that it causes significant regressions for cpu2006 ( http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49089 ). In this work, we introduce a tune option that

RE: [PATCH, i386] Introduce a flag to generate only 128-bit avx instructions

2011-03-03 Thread Fang, Changpeng
Yes, you are right. I renamed the flag to -mprefer-avx128 and modified the documentation. Is this OK to commit to 4.6? Thanks, Changpeng From: Richard Henderson [r...@redhat.com] Sent: Wednesday, March 02, 2011 3:50 PM To: Fang, Changpeng Cc: Jakub