RE: [PATCH] Fix bdverN vector cost of cond_[not_]taken_branch_cost

2015-04-12 Thread Gopalasubramanian, Ganesh
all. We will have a look into it. Regards Ganesh -Original Message- From: Richard Biener [mailto:rguent...@suse.de] Sent: Wednesday, April 08, 2015 1:08 PM To: Gopalasubramanian, Ganesh Cc: Uros Bizjak; gcc-patches@gcc.gnu.org Subject: RE: [PATCH] Fix bdverN vector cost of

RE: [PATCH] Fix bdverN vector cost of cond_[not_]taken_branch_cost

2015-04-07 Thread Gopalasubramanian, Ganesh
> I have added a person from AMD to comment on the decision. > Otherwise, the patch looks OK, but please wait a couple of days for possible > comments. Thank you Uros! I am checking the changes with a few tests and benchmarking them. Please wait for a couple of days. -Ganesh

RE: [PATCH] Rename gimple_build_assign_with_ops to gimple_build_assign and swap the first two arguments of it

2014-12-01 Thread Gopalasubramanian, Ganesh
The following patch implements that. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? Our aarch64 build also breaks as mentioned in https://gcc.gnu.org/ml/gcc-patches/2014-12/msg00119.html Regards Ganesh

RE: [PATCH, aarch64] Add prefetch support

2014-11-30 Thread Gopalasubramanian, Ganesh
Please ignore the previous patch sent. The attachment was wrong. > There's no point in the buffer or the sprintf. > The text is short enough to repeat whole pattern in the array: Updated the patch for the above suggestions. make -k check RUNTESTFLAGS="execute.exp compile.exp dg.exp" passes. Is i

RE: [PATCH, aarch64] Add prefetch support

2014-11-30 Thread Gopalasubramanian, Ganesh
> There's no point in the buffer or the sprintf. > The text is short enough to repeat whole pattern in the array: Updated the patch for the above suggestions. Is it ok for upstream? Regards Ganesh prefetch.diff Description: prefetch.diff

RE: [PATCH, aarch64] Add prefetch support

2014-11-14 Thread Gopalasubramanian, Ganesh
> For this prefetch patch I suggest we go with the existing "load1". I have removed the changes done in types.md. > The inline patch has been munged by your mailer, I tried applying the patch > to my tree but it is full of escape sequences. Can you either fix your > mailer or submit patches as

FW: [PATCH, aarch64] Add prefetch support

2014-11-10 Thread Gopalasubramanian, Ganesh
PING! I am worried if it goes in stage-1. -Original Message- From: Gopalasubramanian, Ganesh Sent: Thursday, October 30, 2014 2:24 PM To: gcc-patches@gcc.gnu.org Subject: [PATCH, aarch64] Add prefetch support Hi, Below is the patch that implements prefetching support. This patch has

[PATCH, aarch64] Add prefetch support

2014-10-30 Thread Gopalasubramanian, Ganesh
Hi, Below is the patch that implements prefetching support. This patch has been already discussed on a) https://gcc.gnu.org/ml/gcc-patches/2014-02/msg01644.html b) https://gcc.gnu.org/ml/gcc-patches/2014-07/msg00612.html I have not added a test as there are ample tests in compile and execute su

RE: RFA: another patch to fix PR61360

2014-09-24 Thread Gopalasubramanian, Ganesh
>The "r->x" alternative results in "vector" decoding on amdfam10. This is >AMD-speak for microcoded instructions, and AMD optimization manual strongly >recommends avoiding them. I have CC'd Ganesh, maybe he >can provide more >relevant data on the performance impact. Thanks Uros! Yes, the AMD S

[PATCH, i386] PR61360: Do not update "enabled" attribute during lra and reload passes

2014-08-22 Thread Gopalasubramanian, Ganesh
This patch fixes PR 61360. The attribute "enabled" should actually be used enable/disable alternative based on sub-targets. In this pattern, it gets used across passes too. However, modifying this attribute in LRA pass is not something it is meant for. This patch allows enabling/disabling the at

RE: [PATCH, i386] Remove use of vpmacsdql instruction from multiplication.

2014-08-11 Thread Gopalasubramanian, Ganesh
Hi Uros! > > +2014-06-10 Ganesh Gopalasubramanian > > + > > + > > + * config/i386/i386.c (ix86_expand_sse2_mulvxdi3): Issue > > +instructions "vpmuludq" and "vpaddq" instead of "vpmacsdql" for > > +handling 32-bit multiplication. > > > OK for mainline and release branches. I would like

RE: [PATCH, i386] Add RDRND and MOVBE for AMD bdver4

2014-08-08 Thread Gopalasubramanian, Ganesh
> OK for mainline. Thanks Uros. Committed to revision 213572 I would like to backport to 4.9 branch too. Is it OK? - Ganesh

[PATCH, i386] Add RDRND and MOVBE for AMD bdver4

2014-08-04 Thread Gopalasubramanian, Ganesh
Below patch adds PTA_RDRND and PTA_MOVBE for bdver4. Bootstrap passes. Ok for upstream? Regards Ganesh Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 213568) +++ gcc/ChangeLog (working copy) @@ -24,6 +24,11 @@ 20

RE: [PATCH, i386] Handle extended family cpuid info for AMD

2014-08-01 Thread Gopalasubramanian, Ganesh
> In this case, having only check for family ID should be enough. If >BTVER1 and BTVER2 can be uniquely determined by their family IDs , >IMO, this would be the most future-proof approach. Signature checks will >override family id checks which will override cpuid checks. Thank you Uros! I have

RE: [PATCH, i386] Handle extended family cpuid info for AMD

2014-07-31 Thread Gopalasubramanian, Ganesh
Uros! > I would like to have a check for a family at the beginning, something like: > if (name == signature_NSC_ebx) >processor = PROCESSOR_GEODE; > else if (family == 22) >{ > if (has_movbe) I get your idea of having the family checked first and then differentiating with
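A minimal sketch of the "check the family first" idea discussed in this thread, assuming the usual x86 CPUID leaf 1 layout (base family in EAX bits 8-11, extended family in bits 20-27, added only when the base family is 0xF). The enum and function names are hypothetical and are not the identifiers used in GCC's driver-i386.c; the family numbers 14h (Bobcat) and 16h (Jaguar) are the ones mentioned elsewhere in these threads.

```c
#include <stdint.h>

/* Hypothetical tags for illustration only; not GCC's processor enum. */
enum sketch_cpu { SKETCH_GENERIC, SKETCH_BTVER1, SKETCH_BTVER2 };

/* Decode the display family from CPUID leaf 1 EAX.
   The extended family field is added only when the base family is 0xF. */
static unsigned sketch_cpu_family(uint32_t eax)
{
    unsigned base = (eax >> 8) & 0xFu;
    unsigned ext = (eax >> 20) & 0xFFu;
    return base == 0xFu ? base + ext : base;
}

/* Decide by family ID alone, before looking at any feature bits. */
static enum sketch_cpu sketch_classify(uint32_t eax)
{
    switch (sketch_cpu_family(eax)) {
    case 0x14: return SKETCH_BTVER1;  /* family 14h (Bobcat) */
    case 0x16: return SKETCH_BTVER2;  /* family 16h (Jaguar), i.e. 22 */
    default:   return SKETCH_GENERIC;
    }
}
```

Keying on the family first means a later core that also reports MOVBE or AVX2 cannot be mistaken for an earlier one just because of the order of the feature tests.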

RE: [PATCH, i386] Handle extended family cpuid info for AMD

2014-07-31 Thread Gopalasubramanian, Ganesh
> Then just use: > + else if (has_avx2) > +processor = PROCESSOR_BDVER4; > else if (has_movbe) >processor = PROCESSOR_BTVER2; >- else if (has_avx2) >-processor = PROCESSOR_BDVER4; > else if (has_xsaveopt) In that case, with earlier GCC versions where w

RE: [PATCH, i386] Handle extended family cpuid info for AMD

2014-07-31 Thread Gopalasubramanian, Ganesh
> But, looking to processor_alias_table in config/i386/i386.c, only > PROCESSOR_BTVER2 defines PTA_MOVBE. According to this, the logic is already > correct, so the patch is not needed. We are evaluating bdver4 cpu. Bdver4 also supports MOVBE. I will submit patch for bdver4 PTA after our evaluati

[PATCH, i386] Handle extended family cpuid info for AMD

2014-07-31 Thread Gopalasubramanian, Ganesh
Hi, The below patch handles the AMD's cpuid family information. With the information from cpuid, BTVER2 cpu for -march=native flag is handled. Bootstrap passes. Is it OK for trunk and branches? Regards Ganesh diff --git a/gcc/ChangeLog b/gcc/ChangeLog index 6223bd6..3f8bb2c 100644 --- a/gcc/Ch

FW: [PATCH, aarch64] Add prefetch support

2014-07-08 Thread Gopalasubramanian, Ganesh
PING! -Original Message- From: Gopalasubramanian, Ganesh Sent: Sunday, July 06, 2014 2:12 AM To: gcc-patches@gcc.gnu.org Cc: marcus.shawcr...@arm.com; richard.earns...@arm.com Subject: RE: [PATCH, aarch64] Add prefetch support PING! From

RE: [PATCH, aarch64] Add prefetch support

2014-07-05 Thread Gopalasubramanian, Ganesh
PING! From: Gopalasubramanian, Ganesh Sent: Friday, July 04, 2014 5:57 AM To: gcc-patches@gcc.gnu.org Cc: marcus.shawcr...@arm.com; richard.earns...@arm.com Subject: [PATCH, aarch64] Add prefetch support Hi, Attached is a patch that implements

[PATCH, aarch64] Add prefetch support

2014-07-04 Thread Gopalasubramanian, Ganesh
Hi, Attached is a patch that implements * Prefetch with immediate offset in the range 0 to 32760 (multiple of 8). Added a predicate for this. * Prefetch with immediate offset - in the range -256 to 255 (Gets generated only when we have a negative offset. Generates prfum instruction).
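The two offset ranges described above can be sketched as plain C predicates. This is only an illustration of the stated ranges (0 to 32760 in multiples of 8 for the scaled form, -256 to 255 for the prfum form); the function names are hypothetical and are not the predicate names used in the actual aarch64 machine description.

```c
#include <stdbool.h>

/* Scaled immediate form: 0..32760, multiple of 8. */
static bool sketch_prfm_scaled_offset_ok(long offset)
{
    return offset >= 0 && offset <= 32760 && offset % 8 == 0;
}

/* Unscaled immediate form (prfum): -256..255, any alignment. */
static bool sketch_prfum_unscaled_offset_ok(long offset)
{
    return offset >= -256 && offset <= 255;
}
```

For example, an offset of -16 fails the first predicate but passes the second, which matches the note that prfum is generated only for negative offsets.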

[PATCH, i386] Remove use of vpmacsdql instruction from multiplication.

2014-06-10 Thread Gopalasubramanian, Ganesh
Hi, The below patch fixes the issue with 64-bit multiplication. The instruction "vpmacsdql" does signed 32-bit multiplication. For V2DImode, we require widened unsigned multiplication. So, replacing the "vpmacsdql" instruction with "vpmuludq" and "vpaddq". This patch had been already discussed in
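To illustrate why a signed 32-bit multiply is the wrong building block here, the following scalar C sketch builds one 64-bit product from unsigned 32x32->64 partial products plus additions, which is the role vpmuludq and vpaddq play per vector lane. It is the textbook decomposition, not a transcription of the code in ix86_expand_sse2_mulvxdi3, and all function names are hypothetical.

```c
#include <stdint.h>

/* Models one vpmuludq lane: unsigned multiply of the low 32 bits. */
static uint64_t sketch_mul_u32_widen(uint64_t a, uint64_t b)
{
    return (uint64_t)(uint32_t)a * (uint32_t)b;
}

/* Signed counterpart, shown only to contrast the results. */
static uint64_t sketch_mul_s32_widen(uint64_t a, uint64_t b)
{
    return (uint64_t)((int64_t)(int32_t)(uint32_t)a * (int32_t)(uint32_t)b);
}

/* 64-bit product modulo 2^64:
   lo(a)*lo(b) + ((hi(a)*lo(b) + lo(a)*hi(b)) << 32). */
static uint64_t sketch_mul64(uint64_t a, uint64_t b)
{
    uint64_t low = sketch_mul_u32_widen(a, b);
    uint64_t cross = sketch_mul_u32_widen(a >> 32, b)
                   + sketch_mul_u32_widen(a, b >> 32);
    return low + (cross << 32);
}
```

With a low half of 0xFFFFFFFF, the unsigned helper treats the value as 4294967295 while the signed one treats it as -1, so the partial products differ and only the unsigned version reproduces the full 64-bit result.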

RE: [AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-05-28 Thread Gopalasubramanian, Ganesh
o:philipp.toms...@theobroma-systems.com] Sent: Friday, February 28, 2014 2:58 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; pins...@gmail.com Subject: Re: [AArch64 05/14] Add AArch64 'prefetch'-pattern. Ganesh, On 28 Feb 2014, at 10:13 , Gopalasubramanian, Ganesh wrote: > I

RE: [AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-02-28 Thread Gopalasubramanian, Ganesh
Avoided top-posting and resending. + /* temporal locality */ + return (INTVAL(operands[1])) ? \"prfm\\tPSTL1KEEP, [%0, #0]\" : +\"prfm\\tPLDL1KEEP, [%0, #0]\"; }" + [(set_attr "type" "prefetch")] +) + With the locality value received in the instruction pattern, I think it would be safe to ha

RE: [AArch64 05/14] Add AArch64 'prefetch'-pattern.

2014-02-28 Thread Gopalasubramanian, Ganesh
With the locality value received in the instruction pattern, I think it would be safe to handle them in prefetch instruction. This helps especially AArch64 has prefetch instructions that can handle this locality. +(define_insn "prefetch" + [(prefetch (match_operand:DI 0 "address_operand" "r") +
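One way to picture handling the locality value in the prefetch instruction is a small lookup table from (read/write, locality) to a PRFM hint name. The hint mnemonics are real AArch64 prefetch operations, but the particular locality-to-hint assignment below is an assumption chosen for illustration, and the function name is hypothetical; the truncated patch above does not show the mapping it used.

```c
#include <stddef.h>

/* Map a __builtin_prefetch-style (is_write, locality 0..3) pair to a
   PRFM hint name. The assignment of locality levels to cache levels
   is an illustrative assumption. Returns NULL for an invalid locality. */
static const char *sketch_prfm_hint(int is_write, int locality)
{
    static const char *const hints[2][4] = {
        { "PLDL1STRM", "PLDL3KEEP", "PLDL2KEEP", "PLDL1KEEP" },
        { "PSTL1STRM", "PSTL3KEEP", "PSTL2KEEP", "PSTL1KEEP" },
    };
    if (locality < 0 || locality > 3)
        return NULL;
    return hints[is_write ? 1 : 0][locality];
}
```

A table keeps every hint string spelled out in full, so no buffer or string formatting is needed to build the operand.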

FW: Non-temporal move

2014-02-24 Thread Gopalasubramanian, Ganesh
I could see the "storent" pattern in the x86 machine descriptions (in sse.md), but the internals doc doesn't mention it. Should we add a description of this to the internals doc? Regards Ganesh

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-26 Thread Gopalasubramanian, Ganesh
> I'm sorry I didn't notice previous conversation. Please install ASAP. Thanks Uros! Committed to revision 206210. - Ganesh

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-25 Thread Gopalasubramanian, Ganesh
Hi, >> (get_amd_cpu): Handle AMD_BOBCAT, AMD_JAGUAR, AMDFAM15H_BDVER2 and >> AMDFAM15H_BDVER3. As mentioned earlier, we would like to stick with BTVER1 and BTVER2 instead of using BOBCAT or JAGUAR. Attached patch does the changes. Regards Ganesh NameChange.patch Description: NameChang

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-19 Thread Gopalasubramanian, Ganesh
> Sorry, I must have been looking at an older version, but as I said I already > did enable it in the latest patch. (see > http://gcc.gnu.org/ml/gcc-patches/2013-12/msg01577.html ) Sorry for causing another revision but we would like to stick with "btver1" and "btver2" rather than "BOBCAT" or "

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-12-19 Thread Gopalasubramanian, Ganesh
> Please provide updated ChangeLog. --- gcc/ChangeLog (revision 206106) +++ gcc/ChangeLog (working copy) @@ -1,3 +1,14 @@ +2013-12-19 Ganesh Gopalasubramanian + + * config/i386/i386.c: Include cfgloop.h. + (ix86_loop_memcount): New function. + (ix86_loop_unroll_ad

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-18 Thread Gopalasubramanian, Ganesh
> Yes, I changed that in the last patch, though I consider it momentarily > problematic because you do not yet enable AVX with march=btver2 (AVX versions > would currently be better than btver2 versions for a btver2 arch), but expect march=btver2 will be fixed soon. The " processor_alias_table"

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-18 Thread Gopalasubramanian, Ganesh
Ping! "Gopalasubramanian, Ganesh" wrote: > Yes, I figured that was the original idea behind it, but the final family of > the jaguar processors seems to have become 16h instead of 14h (bobcat) at > some point. Yes. It is amdfam16h. I was supposed to pass on some comme

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-16 Thread Gopalasubramanian, Ganesh
> Yes, I figured that was the original idea behind it, but the final family of > the jaguar processors seems to have become 16h instead of 14h (bobcat) at > some point. Yes. It is amdfam16h. I was supposed to pass on some comments on the patch. 1. Amdfam16h for Jaguar. 2. For Jaguar, the priorit

RE: [Patch, i386] PR 59422 - Support more targets for function multi versioning

2013-12-16 Thread Gopalasubramanian, Ganesh
> Btw, I couldn't find anything that corresponds to gcc's btver2 arch. Is that > an old term for what has become the Jaguar architecture? Yes, "btver2" = "jaguar". We have the name as per its family name (i.e, bobcat family) in GCC. Similarly we have the names "bdver2" = "piledriver", "bdver3"

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-12-11 Thread Gopalasubramanian, Ganesh
Hi Uros! Accommodated the changes that you mentioned. Completed the bootstrap testing too. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Wednesday, December 04, 2013 3:17 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; Richard

[patch][wwwdocs] gcc 4.9 changes - AMD new cores

2013-12-05 Thread Gopalasubramanian, Ganesh
Hello, This patch adds details about new AMD cores that got enabled in GCC-4.9. OK for the wwwdocs? Regards Ganesh cvs diff: Diffing . Index: changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.9/changes.html,v retrieving rev

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-12-04 Thread Gopalasubramanian, Ganesh
single step. (I think I missed my step here ;) ) Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Wednesday, December 04, 2013 3:17 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; Richard Guenther (richard.guent...@gmail.com) Subjec

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-12-04 Thread Gopalasubramanian, Ganesh
ive insns. Since every rtx in the insn is checked for memory references, the NULL_RTX check is required. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Friday, November 22, 2013 1:46 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; Rich

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-11-28 Thread Gopalasubramanian, Ganesh
if (targetm.loop_unroll_adjust) +nunroll = targetm.loop_unroll_adjust (nunroll, loop); + /* Skip big loops. */ if (nunroll <= 1) { -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Friday, November 22, 2013 1:46 PM To: Gopalasubramanian, Ganesh Cc: gcc-p

RE: [PATCH, i386]: Fix PR56788, _mm_frcz_sd and _mm_frcz_ss ignore their second argument

2013-11-26 Thread Gopalasubramanian, Ganesh
.@gmail.com] Sent: Saturday, November 23, 2013 6:49 PM To: gcc-patches@gcc.gnu.org Cc: Cong Hou; Marc Glisse; Gopalasubramanian, Ganesh Subject: [PATCH, i386]: Fix PR56788, _mm_frcz_sd and _mm_frcz_ss ignore their second argument Hello! Attached patch fixes PR56788, where _mm_frcz_{ss,sd} intri

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-11-21 Thread Gopalasubramanian, Ganesh
Ping! -Original Message- From: Gopalasubramanian, Ganesh Sent: Thursday, November 21, 2013 10:35 AM To: 'H.J. Lu' Cc: gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); Richard Guenther (richard.guent...@gmail.com); borntrae...@de.ibm.com; Jakub Jelinek (ja...@

RE: [RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-11-20 Thread Gopalasubramanian, Ganesh
-Original Message- From: H.J. Lu [mailto:hjl.to...@gmail.com] Sent: Thursday, November 21, 2013 12:02 AM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); Richard Guenther (richard.guent...@gmail.com); borntrae...@de.ibm.com; Jakub Jelinek (ja...@re

[RFC] [PATCH, i386] Adjust unroll factor for bdver3 and bdver4

2013-11-20 Thread Gopalasubramanian, Ganesh
Hi, Steamroller processors contain a loop predictor and a loop buffer, which may make unrolling small loops less important. When unrolling small loops for steamroller, making the unrolled loop fit in the loop buffer should be a priority. This patch uses a heuristic approach (number of memory re

RE: Honnor ix86_accumulate_outgoing_args again

2013-11-12 Thread Gopalasubramanian, Ganesh
esday, November 12, 2013 3:57 PM To: Jan Hubicka Cc: H.J. Lu; Vladimir Makarov; GCC Patches; Uros Bizjak; Richard Henderson; Gopalasubramanian, Ganesh Subject: Re: Honnor ix86_accumulate_outgoing_args again On Tue, Nov 12, 2013 at 11:05:45AM +0100, Jan Hubicka wrote: > >

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-11-05 Thread Gopalasubramanian, Ganesh
r 30, 2013 1:54 AM To: Richard Biener Cc: Jan Hubicka; Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); H.J. Lu (hjl.to...@gmail.com) Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips > On Fri, 25 Oct 2013, Jan Hubick

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-25 Thread Gopalasubramanian, Ganesh
day, October 24, 2013 6:48 PM To: Gopalasubramanian, Ganesh Cc: Jan Hubicka; gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); H.J. Lu (hjl.to...@gmail.com) Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips > Hi, > > > Is this with -fschedule-

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-24 Thread Gopalasubramanian, Ganesh
[mailto:hubi...@ucw.cz] Sent: Thursday, October 24, 2013 2:54 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; Uros Bizjak (ubiz...@gmail.com); hubi...@ucw.cz; H.J. Lu (hjl.to...@gmail.com) Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips > At

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-24 Thread Gopalasubramanian, Ganesh
Attached is the patch which does the following scheduler related changes. * re-models bdver3 decoder. * It enables lookahead with value 8 for all BD architectures. The patch doesn't consider if reloading is completed or not (an area that needs to be worked on). * The issue rate for BD architecture

RE: [PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-16 Thread Gopalasubramanian, Ganesh
Cc: Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Enable FMA4 for AMD bdver3 On Wed, Oct 16, 2013 at 09:00:58AM +0200, Uros Bizjak wrote: > On Wed, Oct 16, 2013 at 8:28 AM, Gopalasubramanian, Ganesh > wrote: > > > The below patch enables FMA4 for AMD bd

[PATCH,i386] Enable FMA4 for AMD bdver3

2013-10-15 Thread Gopalasubramanian, Ganesh
Hi The below patch enables FMA4 for AMD bdver3 architectures. "make -k check" passes. Is it OK for upstream? Regards Ganesh diff --git a/gcc/ChangeLog b/gcc/ChangeLog index fb5b267..cbb5311 100644 --- a/gcc/ChangeLog +++ b/gcc/ChangeLog @@ -1,3 +1,8 @@ +2013-10-16 Ganesh Gopalasubramanian

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-11 Thread Gopalasubramanian, Ganesh
dver3-double" "(bdver3-decode0|bdver3-decode1)*2") +(define_reservation "bdver3-double" "(bdver3-decode0+bdver3-decode1)| + (bdver3-decode1+bdver3-decode2)|(bdver3-decode2+bdver3-decode3)| + (bdver3-decode0+bdver3-decode2)|(bdver3-decode1+bd

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-09 Thread Gopalasubramanian, Ganesh
- From: Jan Hubicka [mailto:hubi...@ucw.cz] Sent: Tuesday, October 08, 2013 3:20 PM To: Gopalasubramanian, Ganesh Cc: Jan Hubicka; gcc-patches@gcc.gnu.org; hjl.to...@gmail.com Subject: Re: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips > Hi Honza, > > I am planning to upda

RE: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips

2013-10-08 Thread Gopalasubramanian, Ganesh
these. Regards Ganesh -Original Message- From: Jan Hubicka [mailto:hubi...@ucw.cz] Sent: Monday, September 30, 2013 4:47 PM To: gcc-patches@gcc.gnu.org; Gopalasubramanian, Ganesh; hjl.to...@gmail.com Subject: Fix scheduler ix86_issue_rate and ix86_adjust_cost for modern x86 chips Hi

RE: Fwd: [PATCH] Scheduling result adjustment to enable macro-fusion

2013-09-30 Thread Gopalasubramanian, Ganesh
> 1. For cmp/test with rip-relative addressing mem operand, don't group > insns. Bulldozer also doesn't support fusion for cmp/test with both > displacement MEM and immediate operand, while m_CORE_ALL doesn't > support fusion for cmp/test with MEM and immediate operand. I simplify > choose to u

RE: [PATCH,i386] Default alignment for AMD BD and BT

2013-08-01 Thread Gopalasubramanian, Ganesh
Thanks Jakub! Committed revision 201402. -Original Message- From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Thursday, July 04, 2013 4:46 PM To: Gopalasubramanian, Ganesh Cc: Uros Bizjak (ubiz...@gmail.com); gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Default alignment for AMD

RE: [PATCH,i386] Default alignment for AMD BD and BT

2013-07-04 Thread Gopalasubramanian, Ganesh
Hi Uros, Can this be backported now! Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Thursday, May 30, 2013 1:40 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Default alignment for AMD BD and BT On Wed, May

RE: [PATCH,i386] Default alignment for AMD BD and BT

2013-05-29 Thread Gopalasubramanian, Ganesh
Hi We want this to be backported to GCC48 branch. Please approve. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, May 07, 2013 6:22 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Default alignment for

RE: [PATCH,i386] FP Reassociation for AMD bdver1 and bdver2

2013-05-29 Thread Gopalasubramanian, Ganesh
Thanks Uros! Committed at r199405. -Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Thursday, May 23, 2013 4:47 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] FP Reassociation for AMD bdver1 and bdver2 On Thu, May 23

[PATCH,i386] FP Reassociation for AMD bdver1 and bdver2

2013-05-23 Thread Gopalasubramanian, Ganesh
--Original Message----- From: Gopalasubramanian, Ganesh Sent: Monday, May 13, 2013 5:24 PM To: gcc-patches@gcc.gnu.org Cc: Uros Bizjak (ubiz...@gmail.com) Subject: [PATCH,i386] FSGSBASE for AMD bdver3 Hi The patch enables FSGSBASE instruction generation for AMD bdver3 architectures. "

RE: [PATCH, i386]: Update processor_alias_table for missing PTA_PRFCHW and PTA_FXSR flags

2013-05-16 Thread Gopalasubramanian, Ganesh
Thank you Uros for the patch. Could you backport this to the 4.8.0? -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Wednesday, May 15, 2013 11:16 PM To: gcc-patches@gcc.gnu.org Cc: Gopalasubramanian, Ganesh Subject: [PATCH, i386]: Update processor_alias_table for

RE: [PATCH,i386] FSGSBASE for AMD bdver3

2013-05-15 Thread Gopalasubramanian, Ganesh
Thank you Uros! Patch for FSGSBASE instruction generation for AMD bdver3 committed to trunk (rr198916). Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Monday, May 13, 2013 5:50 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject

RE: [PATCH,i386] FSGSBASE for AMD bdver3

2013-05-14 Thread Gopalasubramanian, Ganesh
OCESSOR_AMDFAM10, CPU_AMDFAM10, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Monday, May 13, 2013 5:50 PM To: Gopalasubramanian, Ganesh Cc: g

[PATCH,i386] FSGSBASE for AMD bdver3

2013-05-13 Thread Gopalasubramanian, Ganesh
Hi The patch enables FSGSBASE instruction generation for AMD bdver3 architectures. "make -k check" passes. Is it OK for upstream? Regards Ganesh Index: gcc/ChangeLog === --- gcc/ChangeLog (revision 198821) +++ gcc/ChangeLog

RE: [PATCH,i386] Default alignment for AMD BD and BT

2013-05-13 Thread Gopalasubramanian, Ganesh
Thank you Uros! Committed r198820. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, May 07, 2013 6:22 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] Default alignment for AMD BD and BT On Tue, May 7

[PATCH,i386] Default alignment for AMD BD and BT

2013-05-07 Thread Gopalasubramanian, Ganesh
Hi The patch updates the alignment values for AMD BD and BT architectures. "make -k check" passes. Is it OK for upstream? Regards Ganesh 2013-05-07 Ganesh Gopalasubramanian * config/i386/i386.c (processor_target_table): Modified default alignment values for AMD BD and BT architec

RE: [patch][wwwdocs] gcc 4.8 changes - AMD new cores

2013-02-14 Thread Gopalasubramanian, Ganesh
Thank you Gerald! Committed with the changes. Regards Ganesh -Original Message- From: Gerald Pfeifer [mailto:ger...@pfeifer.com] Sent: Thursday, February 14, 2013 2:40 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches; Uros Bizjak Subject: RE: [patch][wwwdocs] gcc 4.8 changes - AMD new

RE: [patch][wwwdocs] gcc 4.8 changes - AMD new cores

2013-02-13 Thread Gopalasubramanian, Ganesh
6:38 PM To: Richard Biener Cc: Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org; ubizjak at gmail dot com (gcc-bugzi...@gcc.gnu.org); ger...@pfeifer.com Subject: Re: [patch][wwwdocs] gcc 4.8 changes - AMD new cores Le 13/02/2013 14:00, Richard Biener a écrit : > Of course not. Next they

[patch][wwwdocs] gcc 4.8 changes - AMD new cores

2013-02-13 Thread Gopalasubramanian, Ganesh
Hello, This patch adds short words about the new AMD cores that got enabled in GCC-4.8. OK for the wwwdocs? Regards Ganesh Index: gcc-4.8/changes.html === RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-4.8/changes.html,v retrieving revisio

RE: [PATCH, i386]: AMD bdver3 enablement

2012-11-15 Thread Gopalasubramanian, Ganesh
ember 14, 2012 4:15 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, i386]: AMD bdver3 enablement On Wed, Nov 14, 2012 at 10:22 AM, Gopalasubramanian, Ganesh wrote: >> sseshuf replaces sselog in some insn patterns, but should be handled in the >> s

RE: [PATCH, i386]: AMD bdver3 enablement

2012-11-11 Thread Gopalasubramanian, Ganesh
essage- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Monday, November 12, 2012 2:30 AM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, i386]: AMD bdver3 enablement On Fri, Nov 9, 2012 at 4:39 AM, Gopalasubramanian, Ganesh wrote: > Changes done with respe

Add myself to MAINTAINERS

2012-10-30 Thread Gopalasubramanian, Ganesh
2,6 +372,7 @@ Chao-ying Fu f...@mips.com Gary Funck g...@intrepid.com Pompapathi V Gadad pompapathi.v.ga...@nsc.com +Gopalasubramanian Ganesh ganesh.gopalasubraman...@amd.com K

RE: GCC 4.8.0 Status Report (2012-10-29), Stage 1 to end soon

2012-10-29 Thread Gopalasubramanian, Ganesh
Hi Jakub, We are working on the following. 1. bdver3 enablement. Review completed. Changes to be incorporated and checked-in. http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01131.html 2. btver2 basic enablement is done (http://gcc.gnu.org/ml/gcc-patches/2012-07/msg01018.html)/ Scheduler descript

RE: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-10-10 Thread Gopalasubramanian, Ganesh
That was obvious. Sorry for the wrong commit. Thanks Jakub. -Ganesh -Original Message- From: Paolo Carlini [mailto:paolo.carl...@oracle.com] Sent: Wednesday, October 10, 2012 4:33 PM To: Jakub Jelinek Cc: Gopalasubramanian, Ganesh; Uros Bizjak; gcc-patches@gcc.gnu.org; veku

RE: [PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-10-03 Thread Gopalasubramanian, Ganesh
Testing was done before posting the patch. It was successful. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Thursday, September 27, 2012 5:57 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH, i386]: Fix PR51109

[PATCH, i386]: Fix PR51109, symbol size in scheduler state machine is reduced

2012-09-27 Thread Gopalasubramanian, Ganesh
Hi All, This is a fix for PR 51109. There are three changes 1. Microcoded instructions are considered as single issue instructions and are therefore issued to a separate execution unit. 2. The multiplier unit is attached to execution unit 1 (ieu1). Since ieu is handled as a separate

RE: [PATCH,i386] fma4 addition for bdver2

2012-09-09 Thread Gopalasubramanian, Ganesh
Message- From: Gopalasubramanian, Ganesh Sent: Wednesday, September 05, 2012 3:41 PM To: gcc-patches@gcc.gnu.org Cc: Uros Bizjak (ubiz...@gmail.com) Subject: [PATCH,i386] fma4 addition for bdver2 Hello, FMA4 and FMA3 ISA are implemented in bdver2 target. FMA3 is selected by default. This patch

[PATCH,i386] fma4 addition for bdver2

2012-09-05 Thread Gopalasubramanian, Ganesh
Hello, FMA4 and FMA3 ISA are implemented in bdver2 target. FMA3 is selected by default. This patch supports the use of FMA4 intrinsics for bdver2 targets. Is it OK for trunk? Regards Ganesh 2012-09-05 Ganesh Gopalasubramanian * config/i386/i386.md : Comments on fma4 instruction

RE: [PATCH,i386] fma,fma4 and xop flags

2012-08-16 Thread Gopalasubramanian, Ganesh
nctionality to support that. So, ideally for bdver2, we like to have both fma and fma4 getting generated with options "-mfma -mfma4". Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Tuesday, August 14, 2012 9:12 PM To: Richard Henderson Cc: G

RE: [PATCH,i386] cpuid function for prefetchw

2012-08-13 Thread Gopalasubramanian, Ganesh
Yes! Thanks Jakub. -Original Message- From: Jakub Jelinek [mailto:ja...@redhat.com] Sent: Monday, August 13, 2012 3:16 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH,i386] cpuid function for prefetchw On Mon, Aug 13, 2012 at 09:29:45AM +

[PATCH,i386] cpuid function for prefetchw

2012-08-13 Thread Gopalasubramanian, Ganesh
Hello, To get the prefetchw cpuid flag, cpuid function 0x80000001 needs to be called. Previous to patch, function 0x7 is called. Bootstrapping and "make -k check" pass without failures. Ok for trunk? Regards Ganesh 2012-08-13 Ganesh Gopalasubramanian PR driver/54210
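As a sketch of the check being described: on x86 the PREFETCHW (3DNowPrefetch) flag is reported in ECX bit 8 of the extended feature leaf 0x80000001, not in leaf 7. The helper below takes the already-read register values so it stays independent of how CPUID is executed; the macro and function names are hypothetical and are not the ones used in driver-i386.c or cpuid.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CPUID_EXT_FEATURES 0x80000001u  /* extended feature leaf */
#define SKETCH_BIT_PRFCHW         (1u << 8)    /* ECX bit 8 of that leaf */

/* max_ext_leaf: EAX returned by CPUID leaf 0x80000000.
   ext_ecx:      ECX returned by CPUID leaf 0x80000001. */
static bool sketch_has_prefetchw(uint32_t max_ext_leaf, uint32_t ext_ecx)
{
    if (max_ext_leaf < SKETCH_CPUID_EXT_FEATURES)
        return false;  /* leaf not implemented, so the flag is unknown */
    return (ext_ecx & SKETCH_BIT_PRFCHW) != 0;
}
```

Reading the same bit position from the wrong leaf would test an unrelated feature, which is the kind of misdetection the patch description points at.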

RE: [PATCH,i386] fma,fma4 and xop flags

2012-08-12 Thread Gopalasubramanian, Ganesh
Thank you Uros, Richard! I will confirm the test results in a couple of days. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Saturday, August 11, 2012 3:54 AM To: Richard Henderson Cc: Gopalasubramanian, Ganesh; gcc-patches@gcc.gnu.org Subject: Re

RE: [PATCH,i386] fma,fma4 and xop flags

2012-08-09 Thread Gopalasubramanian, Ganesh
ags. This will be a one to one mapping and leave the user with lot more liberty. Please let me know your opinion. Regards Ganesh -Original Message- From: Uros Bizjak [mailto:ubiz...@gmail.com] Sent: Friday, August 10, 2012 1:21 AM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org S

RE: [PATCH,i386] fma,fma4 and xop flags

2012-08-08 Thread Gopalasubramanian, Ganesh
w your opinion. Regards Ganesh -Original Message- From: Richard Guenther [mailto:richard.guent...@gmail.com] Sent: Wednesday, August 08, 2012 5:12 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org; ubiz...@gmail.com Subject: Re: [PATCH,i386] fma,fma4 and xop flags On Wed, Aug 8

Backport: fma3 instruction generation for 'march=native' in AMD processors

2012-05-09 Thread Gopalasubramanian, Ganesh
Hello, Below is the patch that has been committed in trunk (Revision: 187075). We would like to backport it to the GCC 4.7 branch, as a couple of AMD processors require this change for fma3 instruction generation. Bootstrapping and testing are successful. Is it OK to commit in GCC 4.7 branch? Regards Gane

Re: [PATCH] [i386] fma3 instruction generation for 'march=native' in AMD processors

2012-05-02 Thread Gopalasubramanian, Ganesh
Sent: Wednesday, May 02, 2012 5:11 PM To: Gopalasubramanian, Ganesh Cc: gcc-patches@gcc.gnu.org Subject: Re: [PATCH] [i386] fma3 instruction generation for 'march=native' in AMD processors On Wed, May 02, 2012 at 11:12:33AM +, Gopalasubramanian, Ganesh wrote: > For AMD architectures with both

[PATCH] [i386] fma3 instruction generation for 'march=native' in AMD processors

2012-05-02 Thread Gopalasubramanian, Ganesh
For AMD architectures with both fma3 and fma4 instructions' support, GCC generates fma4 by default. Instead, we like to generate fma3 instruction. Below patch enables the fma3 instruction generation for "-march=native". Ok for trunk? Index: gcc/config/i386/driver-i386.c