On Mon, Jan 6, 2014 at 7:36 AM, Marcus Shawcroft
wrote:
> Hi,
>
> This patch defines the AArch64 BE loader name. Corresponding patches for
> glibc and binutils have been posted on the relevant lists.
This is a huge ABI change and makes GCC 4.8.x incompatible with GCC 4.9.0.
Thanks,
Andrew
>
>
hat pointlessly); John
>> Anglin said in the PR that it tests ok on PA. Will commit in a few days
>> if no objections.
>
> No objections to the substance of the patch, though I think the comment
> could be clearer.
Though my question is: for which targets does this matter, since ARM has moved
away from reload and other targets should do the same?
Thanks,
Andrew Pinski
>
> Jeff
>
unsigned n, m;
do {
m = *--p;
*p = (unsigned short)(m >= wsize ? m-wsize : 0);
} while (--n);
}
This comes from zlib and it blocks my building of the trunk.
Thanks,
Andrew Pinski
>
> Thanks,
> James Greenhalgh
>
> ---
> gcc/
>
> 2012-12
On Tue, Jan 7, 2014 at 4:05 PM, Marcus Shawcroft
wrote:
>
> On 7 Jan 2014, at 23:10, Andrew Pinski wrote:
>
>> On Tue, Dec 4, 2012 at 2:31 AM, James Greenhalgh
>> wrote:
>>>
>>> Hi,
>>>
>>> This patch adds support fo
On Thu, Nov 14, 2013 at 11:12 AM, Xinliang David Li wrote:
> On Thu, Nov 14, 2013 at 10:17 AM, Andrew Pinski wrote:
>> On Thu, Nov 14, 2013 at 8:25 AM, Xinliang David Li
>> wrote:
>>> Can we revisit the decision for this? Here are the reasons:
>>>
>>>
On Fri, Jan 10, 2014 at 7:14 AM, Richard Earnshaw wrote:
> It's incorrect to use CMN to compare with a negated operand if the
> following condition is an inequality. This is because of boundary
> conditions when the negated operations overflow (or when zero), since
> the flags are then not the sw
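A sketch of the kind of boundary case being described (my own illustration,
not taken from the patch):

/* Comparing a against a negated operand.  Emitting "cmn a, b" (which sets
   the flags from a + b) is fine for ==/!=, but for ordered comparisons the
   flags can differ from a real "cmp a, -b" when the negation or the
   addition overflows, e.g. b == INT_MIN or a + b wrapping.  */
int
f (int a, int b)
{
  return a < -b;
}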
On Fri, Jan 10, 2014 at 9:38 AM, Richard Earnshaw wrote:
> On 10/01/14 17:37, Andrew Pinski wrote:
>> On Fri, Jan 10, 2014 at 7:14 AM, Richard Earnshaw wrote:
>>> It's incorrect to use CMN to compare with a negated operand if the
>>> following condition is an
ables (generic_tunings and cortexa53_tunings)
to be 1 which was the default before.
OK? Built and tested for aarch64-elf with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* config/aarch64/aarch64-protos.h (tune_params): Add issue_rate.
* config/aarch64/aarch64.c (generic_tunings): Add issue
K_REG and FP_REGS to the cost of moving via a GENERAL_REGS.
OK? Built and tested on aarch64-elf with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* config/aarch64/aarch64.c (aarch64_register_move_cost): Correct cost
of moving from/to the STACK_REG register class.
83 ]))) abs.c:1 319 {negsi2}
> (expr_list:REG_DEAD (reg:SI 78 [ D.2683 ])
> (nil)))
What does combine do when it propagates 7 into 8? I suspect you want to
optimize that instead of doing it any other way.
That is, if it propagates the neg into the two sides of the conditional and if
one simpli
On Tue, Jul 14, 2015 at 3:06 AM, Andrew Pinski wrote:
> On Tue, Jul 14, 2015 at 1:18 AM, Kyrill Tkachov
> wrote:
>> Hi Segher,
>>
>> On 14/07/15 01:38, Segher Boessenkool wrote:
>>>
>>> On Mon, Jul 13, 2015 at 10:48:19AM +0100, Kyrill Tkachov wrote:
On Tue, Jul 14, 2015 at 3:13 AM, Kyrill Tkachov wrote:
>
> On 14/07/15 11:06, Andrew Pinski wrote:
>>
>> On Tue, Jul 14, 2015 at 1:18 AM, Kyrill Tkachov
>> wrote:
>>>
>>> Hi Segher,
>>>
>>> On 14/07/15 01:38, Segher Boessenkool wrote
On Fri, Jul 17, 2015 at 8:43 AM, Benedikt Huber
wrote:
> * config/aarch64/aarch64-builtins.c: Builtins
> for rsqrt and rsqrtf.
> * config/aarch64/aarch64-protos.h: Declare.
> * config/aarch64/aarch64-simd.md: Matching expressions
> for frsqrte and frsqrts.
>
On Sat, Jul 18, 2015 at 1:25 AM, Andrew Pinski wrote:
> On Fri, Jul 17, 2015 at 8:43 AM, Benedikt Huber
> wrote:
>> * config/aarch64/aarch64-builtins.c: Builtins
>> for rsqrt and rsqrtf.
>> * config/aarch64/aarch64-protos.h: Declare.
>>
On Thu, Jul 16, 2015 at 8:33 AM, Kyrill Tkachov wrote:
> Hi all,
>
> This patch improves codegen for expressions of the form:
> (x ? y + c1 : y + c2) when |c1 - c2| == 1
>
> It matches the if_then_else of the two plus-immediates,
> performs one of them, then generates a conditional increment
> ope
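A minimal C example of the kind of expression being discussed (the constants
are mine, chosen so the arms differ by one):

/* With |c1 - c2| == 1 (here 5 and 4), the select of two sums can be emitted
   as one add followed by a conditional increment (csinc on AArch64) rather
   than materialising both sums and selecting with csel.  */
int
f (int x, int y)
{
  return x ? y + 5 : y + 4;
}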
On Tue, Jul 21, 2015 at 12:16 PM, Richard Biener
wrote:
> On July 21, 2015 11:38:31 AM GMT+02:00, Jakub Jelinek
> wrote:
>>On Tue, Jul 21, 2015 at 09:15:31AM +, Hurugalawadi, Naveen wrote:
>>> Please find attached the patch which performs following patterns
>>folding
>>> in match.pd:-
>>>
>>
On Fri, Jul 24, 2015 at 2:07 AM, Jiong Wang wrote:
>
> James Greenhalgh writes:
>
>> On Wed, May 20, 2015 at 01:35:41PM +0100, Jiong Wang wrote:
>>> Current IRA still use both target macros in a few places.
>>>
>>> Tell IRA to use the order we defined rather than with it's own cost
>>> calculation
On Tue, Jul 28, 2015 at 1:35 PM, Richard Sandiford
wrote:
> Continuing after a break for the fr30 patch...
>
> Bootstrapped & regression-tested on x86_64-linux-gnu and aarch64-linux-gnu.
> Also tested via config-list.mk. Committed as preapproved.
>
> Thanks,
> Richard
>
>
> gcc/
> * targe
rtx target, rtx mem, enum memmodel model)
> {
> machine_mode pat_bool_mode;
> struct expand_operand ops[3];
>
> - if (!HAVE_atomic_test_and_set)
> + if (!targetm.have_atomic_test_and_set ())
> return NULL_RTX;
I know this was not there before but this if should be ma
On Tue, Jul 28, 2015 at 3:10 PM, Richard Sandiford
wrote:
> Andrew Pinski writes:
>> On Tue, Jul 28, 2015 at 1:36 PM, Richard Sandiford
>> wrote:
>>> Bootstrapped & regression-tested on x86_64-linux-gnu and aarch64-linux-gnu.
>>> Also tested via conf
. Other variants are 'target_clones', 'targets'...
How does this interact with Function Multiversioning
(https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Function-Multiversioning.html)?
Thanks,
Andrew Pinski
>
> Below is ChangeLog. Attached patch passed make check on x86.
On Fri, Jul 24, 2015 at 3:55 AM, Kyrill Tkachov wrote:
> Hi all,
>
> This patch implements an aarch64-specific expansion of the signed modulo by
> a power of 2.
> The proposed sequence makes use of the conditional negate instruction CSNEG.
> For a power of N, x % N can be calculated with:
> negs
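A branchless C equivalent of the idea for x % 8 (my own sketch of the
approach, not the patch's expansion):

/* Compute the low-bit remainder of both x and -x, then conditionally
   negate; this mirrors what a negs/and/and/csneg sequence does.  */
int
mod8 (int x)
{
  unsigned ux = (unsigned) x;
  int pos = (int) (ux & 7);          /* remainder when x >= 0 */
  int neg = (int) ((0u - ux) & 7);   /* magnitude of the remainder when x < 0 */
  return x > 0 ? pos : -neg;
}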
On Wed, Aug 5, 2015 at 3:16 AM, Richard Biener wrote:
> On Wed, 5 Aug 2015, Andreas Schwab wrote:
>
>> Richard Biener writes:
>>
>> > * gimple-fold.c (gimple_fold_stmt_to_constant_1): Canonicalize
>> > bool compares on RHS.
>> > * match.pd: Add X ==/!= !X is false/true pattern.
>>
>>
On Tue, Sep 15, 2015 at 6:58 AM, Richard Biener
wrote:
> On Thu, Sep 3, 2015 at 5:32 PM, Bill Schmidt
> wrote:
>> On Thu, 2015-09-03 at 23:26 +0800, Andrew Pinski wrote:
>>> On Thu, Sep 3, 2015 at 11:20 PM, Bill Schmidt
>>> wrote:
>>> > Hi,
>>
; instructions
>
> Tested the series for aarch64-none-linux-gnu with native bootstrap and
> make check. Also tested for aarch64-none-elf with cross-compiled
> check-gcc on an ARMv8.1 emulator with +lse enabled by default.
Are you going to add some builtins for MIN/MAX support to
On Tue, Jul 28, 2015 at 6:12 AM, Jiong Wang wrote:
>
> The instruction sequences for preparing argument for TLS descriptor
> runtime resolver and the later function call to resolver can actually be
> hoisted out of the loop.
>
> Currently we can't because we have exposed the hard register X0 as
>
On Fri, Sep 25, 2015 at 11:40 PM, Andrew Pinski wrote:
> On Tue, Jul 28, 2015 at 6:12 AM, Jiong Wang wrote:
>>
>> The instruction sequences for preparing argument for TLS descriptor
>> runtime resolver and the later function call to resolver can actually be
>&
ar * env_var = (char*)
>> malloc(sizeof("COI_DMA_CHANNEL_COUNT=2" + 1));
>> +char * env_var = (char*)
>> malloc(sizeof("COI_DMA_CHANNEL_COUNT=2"));
>> sprintf(env_var, "COI_DMA_CHANNEL_COUNT=2");
>>
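For context, a minimal program showing why the original expression allocated
too little (my illustration, not part of the patch):

#include <stdio.h>

int
main (void)
{
  /* The string literal decays to a pointer before the "+ 1", so sizeof
     yields sizeof (char *) (typically 8), not the size of the string.  */
  printf ("%zu\n", sizeof ("COI_DMA_CHANNEL_COUNT=2" + 1));
  /* sizeof of the array itself: 23 characters plus the terminating NUL.  */
  printf ("%zu\n", sizeof ("COI_DMA_CHANNEL_COUNT=2"));   /* prints 24 */
  return 0;
}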
On Mon, Sep 28, 2015 at 4:52 PM, Evandro Menezes wrote:
> In some micro-architectures the insns to load or store pairs of vector
> registers are implemented rather differently from those affecting lanes in
> vector registers. Then, it's important that such insns be described
> likewise differentl
On Sat, Oct 3, 2015 at 9:44 AM, Sandra Loosemore
wrote:
> On 10/03/2015 06:47 AM, Jonathan Wakely wrote:
>>
>> https://gcc.gnu.org/onlinedocs/gcc/Template-Instantiation.html
>> currently says that using -frepo "is your best option for application
>> code written for the Borland model, as it just w
or this?). Does this look reasonable?
I thought if profile is not present, then branch probabilities go
back to the original heuristics?
Which option is really causing the performance degradation here?
Also I think your patch is very incomplete as someone could use
-frename-registers with -fp
QEMU exits with
> EINTR error code (that might be expected, AFAIK QEMU is not very good with
> threads). So, I wonder, if I should disable LSan for AArch64 now?
We should just disable ASAN and TSAN for AARCH64 until 48-bit VA is
supported. Since the majority of the distros are going to be u
On Wed, Oct 14, 2015 at 11:38 AM, Renato Golin wrote:
> On 14 October 2015 at 19:21, Evgenii Stepanov
> wrote:
>> Wait. As Jakub correctly pointed out in the other thread, there is no
>> obvious reason why there could not be a single shadow offset value
>> that would work for all 3 possible VMA
On Wed, Oct 14, 2015 at 12:15 PM, Renato Golin wrote:
> On 14 October 2015 at 20:00, Andrew Pinski wrote:
>> Then until that happens I think we should disable asan and tsan for
>> AARCH64 for GCC.
>
> I can't comment on that, but we'll continue running the tests
On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes wrote:
> In the existing targets, it seems that it's always faster to zero up a DF
> register with "movi %d0, #0" instead of "fmov %d0, xzr".
I think for ThunderX 1, this change will not make a difference. So I
am neutral on this change.
Thanks,
On Tue, Oct 20, 2015 at 7:10 AM, Evandro Menezes wrote:
> Some micro-architectures may favor one of sign or zero extension over the
> other in the base plus extended register offset addressing mode.
Yes, I was going to create the same patch, as ThunderX is one of those
micro-architectures.
Thanks,
On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski wrote:
> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes
> wrote:
>> In the existing targets, it seems that it's always faster to zero up a DF
>> register with "movi %d0, #0" instead of "fmov %d0, xzr".
&
On Tue, Oct 20, 2015 at 7:59 AM, Andrew Pinski wrote:
> On Tue, Oct 20, 2015 at 7:51 AM, Andrew Pinski wrote:
>> On Tue, Oct 20, 2015 at 7:40 AM, Evandro Menezes
>> wrote:
>>> In the existing targets, it seems that it's always faster to zero up a DF
>>> r
On Thu, Oct 22, 2015 at 3:47 AM, Segher Boessenkool
wrote:
> Hi,
>
> On Wed, Oct 21, 2015 at 09:44:25AM -0700, Steve Ellcey wrote:
>> (jump_insn 16 15 17 2 (set (pc)
>> (if_then_else (ne (subreg:SI (reg:DI 207) 4)
>> (subreg:SI (reg:DI 196 [ *last_3(D)+-4 ]) 4))
>>
On Thu, Oct 22, 2015 at 3:32 AM, wrote:
>
>
>> On Oct 22, 2015, at 12:44 AM, Steve Ellcey wrote:
>>
>>
>> A bug was reported against the GCC MIPS64 compiler that involves a bad
>> combine
>> and this patch fixes the bug.
>>
>> When using '-fexpensive-optimizations -march=mips64r2 -mabi=64' GCC
On Sun, Oct 25, 2015 at 7:51 PM, Alan Lawrence wrote:
> On 23 October 2015 at 16:20, Alan Lawrence wrote:
>> diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
>> b/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
>> index ab54a48..b012d78 100644
>> --- a/gcc/testsuite/gcc.dg/vect/bb-slp-7.c
>> +++ b/gcc/tes
On Mon, Oct 26, 2015 at 4:39 PM, Eric Botcazou wrote:
> Hi,
>
> this patch extends the simplification formerly done in fold_widened_comparison
> and now in match.pd to all integral types instead of just integer types, and
> comes with an Ada testcase for the enumeral type case.
>
> The patch intro
On Wed, Oct 28, 2015 at 6:36 PM, James Greenhalgh
wrote:
> On Tue, Oct 27, 2015 at 06:12:48PM -0500, Evandro Menezes wrote:
>> This patch adds the scheduling and cost models for Exynos M1.
>>
>> Though it's a rather large patch, much of it is the DFA model for the
>> pipeline. Still, I'd apprecia
On Fri, Oct 30, 2015 at 6:06 PM, Kumar, Venkataramanan
wrote:
> Hi Richard,
>
> I am trying to "if covert the store" in the below test case and later help it
> to get vectorized under -Ofast -ftree-loop-if-convert-stores -fno-common
>
> #define LEN 4096
> __attribute__((aligned(32))) float array
On Fri, Oct 30, 2015 at 7:48 PM, wrote:
> From: Trevor Saunders
>
> gcc got rid of this target macro in 2003, so it seems safe to assume the
> alternate path works fine on all targets.
This is ok.
>
> libobjc/ChangeLog:
>
> 2015-10-30 Trevor Saunders
>
> PR libobjc/24775
>
On Fri, Oct 30, 2015 at 7:48 PM, wrote:
> From: Trevor Saunders
>
> Given the layering violation that using ROUND_TYPE_ALIGN in target libs
> is, and the hacks needed to make it work just copying the relevant code
> into encoding.c seems to make sense as an incremental improvement. The
> epiph
On Fri, Oct 30, 2015 at 7:48 PM, wrote:
> From: Trevor Saunders
>
> Similar to ROUND_TYPE_ALIGN it seems to make sense to copy the
> information in the target macros to libobjc as an incremental step. It's
> worth noting a large portion of the definitions of this macro only exist
> to work aroun
On Fri, Oct 30, 2015 at 9:11 PM, Bernd Schmidt wrote:
> On 10/30/2015 01:47 PM, Richard Biener wrote:
>>
>> On Fri, Oct 30, 2015 at 1:28 PM, Bernd Schmidt
>> wrote:
it's not target independent code. Are you suggesting to add a config/
to libobjc? IMHO for a not really mantai
On Sat, Nov 14, 2015 at 1:36 AM, Wilco Dijkstra wrote:
>> Evandro Menezes wrote:
>> Hi, Wilco.
>>
>> It looks good to me, but FCMP is quite different from FCCMP on Exynos M1,
>> so it'd be helpful to have distinct types for them. Say, "fcmp{s,d}"
>> and "fccmp{s,d}". Would it be acceptable to add
On Mon, Nov 16, 2015 at 8:31 AM, Matthew Wahab
wrote:
> Hello,
>
> The command line options for target selection allow ARMv8.1 extensions
> to be individually enabled/disabled. They also allow the extensions to
> be enabled with -march=armv8-a. This doesn't reflect the ARMv8.1
> architecture which
Just to make it easier to see which cores belong to which company and
to make the order clearer, add a comment in front of the cores sections.
OK?
Thanks,
Andrew Pinski
* config/aarch64/aarch64-cores.def: Add a comment before each set
of cores.
---
gcc/config/aarch64/aarch64-cores.def | 9
The reason why thunderxt88pass1 is separate from thunderx is because
thunderx is changed to be an ARMv8.1 arch core while thunderxt88pass1
is still an ARMv8 arch core.
I tested each of these patches separately.
Ok for the trunk even though I missed out on stage 1?
Thanks,
Andrew
Andrew Pins
num is only 12 bits
long. So it would be nice if someone could test -mcpu=native on a big.LITTLE
system to make sure it still works.
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
* config/aarch64/aarch64-cores.def: Rewrite so IMP and PART are i
? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Tested -mcpu=native on both T88 pass 1 and T88 pass 2 to make sure it is
detecting the two separately.
Thanks,
Andrew Pinski
* config/aarch64/aarch64-cores.def: Add -1 as the variant to all of the cores.
(thunderxt88pass1): New core
This moves the #undef from the header files to the .def files like was done
for builtins.def (https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00662.html).
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
* config/aarch64/aarch64-arches.def
just to make sure
the parsing is done correctly as I don't have access to one off hand.
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
* config/aarch64/driver-aarch64.c (host_detect_local_cpu):
Rewrite handling of part num to handle the case
On Thu, Nov 19, 2015 at 4:08 AM, Kyrill Tkachov wrote:
> Hi Andrew,
>
> On 17/11/15 22:10, Andrew Pinski wrote:
>>
>> To Add support for -mcpu=thunderxt88pass1, I needed to fix up a few
>> things in the support for -mcpu=native. First was I wanted to do the same
>&g
On Mon, Nov 23, 2015 at 9:53 PM, H.J. Lu wrote:
> On Mon, Nov 23, 2015 at 7:22 PM, Patrick Palka wrote:
>> On Mon, Nov 23, 2015 at 3:53 PM, H.J. Lu wrote:
>>> On Mon, Nov 23, 2015 at 1:57 AM, Richard Biener
>>> wrote:
On Sat, Nov 21, 2015 at 12:46 AM, H.J. Lu wrote:
> On Fri, Nov 20,
On Wed, Nov 25, 2015 at 2:31 AM, James Greenhalgh
wrote:
> On Tue, Nov 17, 2015 at 02:10:35PM -0800, Andrew Pinski wrote:
>>
>> Because the imp and parts are really integers rather than strings, this patch
>> moves the comparisons to be integer. Also allows saving around i
On Mon, Feb 17, 2020 at 7:56 AM Richard Earnshaw (lists)
wrote:
>
> On 17/02/2020 15:42, Richard Sandiford wrote:
> > "Richard Earnshaw (lists)" writes:
> >> On 14/02/2020 10:41, Andrew Pinski wrote:
> >>> On Fri, Feb 14, 2020 at 2:12 AM Richard Earnsha
reduce many byte stores and
> > half stores
> > to improve performance for this type of case. There is already an
> > 033t.esra running before,
> > and not sure whether SRA should replace such kind of bitfield operations.
> > Adding a store-merging pass is so simple and
n\]*" $text "" text
> +regsub -all "(^|\n)collect2(\.exe)?: error: ld returned \[^\n\]*" $text
> "" text
If you touch that line,
you may as well also touch this line too:
> regsub -all "(^|\n)collect: re(compiling|linking)\[^\n\]*" $
On Mon, Mar 2, 2020 at 1:40 AM Richard Biener
wrote:
>
> On Mon, Mar 2, 2020 at 9:07 AM bin.cheng wrote:
> >
> > Hi,
> > This is a simple fix for PR93674. It adds cand carefully for enumeral type
> > iv_use in
> > case of -fstrict-enums, it also avoids computing, replacing iv_use with the
> >
e this is memory corruption. Neither patch should have changed
> anything in the C++ frontend.
It sounds like some GC issue. The patch would have changed a few
things related to the front-end though. Mainly the decl UIDs do
increase due to the new builtins. Note most likely Deli's patch did
the same too.
Thanks,
Andrew Pinski
>
> Cheers,
> Wilco
>
align = LOCAL_DECL_ALIGNMENT (var);
> +
> + SET_DECL_ALIGN (var, align);
I think this is wrong if the user has set the alignment already.
You need to check DECL_USER_ALIGN.
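Roughly, the guard being asked for would look like this (a sketch, not the
actual patch):

/* Only override the alignment when the user did not request one explicitly
   with __attribute__((aligned (...))).  */
if (!DECL_USER_ALIGN (var))
  SET_DECL_ALIGN (var, LOCAL_DECL_ALIGNMENT (var));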
Thanks,
Andrew Pinski
> + }
> +}
> +}
> +
> +/* Adjust alignment for glob
On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song wrote:
>
> On 2019-09-24, Martin Liška wrote:
> >On 9/19/19 10:33 AM, Martin Liška wrote:
> >> - One needs modified binutils and that would probably require a
> >> configure detection. The only way
> >> which I see is based on ld --version. I'm plan
On Thu, Oct 3, 2019 at 12:46 AM Fangrui Song wrote:
>
>
> On 2019-10-03, Andrew Pinski wrote:
> >On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song wrote:
> >>
> >> On 2019-09-24, Martin Liška wrote:
> >> >On 9/19/19 10:33 AM, Martin Liška wrote:
> >
Hi all,
While working on implementing lowering of bit-field accesses in
gimple, I ran into an ICE which was not covered by the current
testsuite.
Committed these two new testcases as obvious.
Thanks,
Andrew Pinski
ChangeLog:
* gcc.c-torture/compile/20191015-1.c: New test.
* gcc.c-torture
On Sun, Nov 17, 2019 at 3:35 PM Richard Sandiford
wrote:
>
> (It's 23:35 local time, so it's still just about stage 1. :-))
>
> While working on SVE, I've noticed several cases in which we fail
> to combine instructions because the combined form would need to be
> placed earlier in the instruction
On Fri, Aug 18, 2017 at 12:17 PM Andrew Pinski wrote:
>
> Like https://gcc.gnu.org/ml/gcc-patches/2010-09/msg00060.html for
> PowerPC, we should do something similar for aarch64. This pattern
> does show up in SPEC CPU 2006 in astar but I did not look into
> performance improveme
ntioned, the problem only shows up with
--enable-maintainer-mode which nobody uses, as the requirements for
automake/autoconf are different throughout a combined tree.
Thanks,
Andrew Pinski
>
On Tue, Nov 26, 2019 at 2:30 PM Richard Sandiford
wrote:
>
> Andrew Pinski writes:
> > On Fri, Aug 18, 2017 at 12:17 PM Andrew Pinski wrote:
> >>
> >> Like https://gcc.gnu.org/ml/gcc-patches/2010-09/msg00060.html for
> >> PowerPC, we should do somet
On Tue, Dec 5, 2017 at 11:27 AM Mike Stump wrote:
>
> On Dec 5, 2017, at 11:11 AM, Thomas Preudhomme
> wrote:
> >
> > On 05/12/17 17:54, Andrew Pinski wrote:
> >> On Tue, Dec 5, 2017 at 9:50 AM, Thomas Preudhomme
> >> wrote:
> >>> Hi,
> &g
On Thu, Aug 16, 2018 at 9:29 PM Omar Sandoval wrote:
>
> Hi,
>
> This fixes the issue that it is impossible to distinguish a zero-length array
> type from a flexible array type given the DWARF produced by GCC (which I
> reported here [1]). We do so by adding a DW_AT_count attribute with a value of
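For reference, the two C constructs being distinguished (a minimal
illustration, not from the patch):

struct zero_len { int n; int a[0]; };   /* GNU zero-length array      */
struct flexible { int n; int a[]; };    /* C99 flexible array member  */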
>>>> In spec2k6/hmmer, when building fast_algorithms.c with below command
>>>> line:
>>>> ./gcc -Ofast -S fast_algorithms.c -o fast_algorithms.S -fdump-tree-all
>>>> -fdump-tree-lsplit
>>>> The lsplit dump contains:
>>>> [12.75%]:
>>>> _124 = _197 + 1;
>>>> _123 = _124 + -1;
>>>> _115 = MIN_EXPR <_197, _124>;
>>>> Which is generated here.
>>>
>>>
>>> That means we miss a pattern in match.pd to handle this case.
>>
>> I see. I will withdraw this patch and look in that direction.
>
>
> For _123, we have
>
> /* (A +- CST1) +- CST2 -> A + CST3
> or
> /* Associate (p +p off1) +p off2 as (p +p (off1 + off2)). */
>
>
> For _115, we have
>
> /* min (a, a + CST) -> a where CST is positive. */
> /* min (a, a + CST) -> a + CST where CST is negative. */
> (simplify
>  (min:c @0 (plus@2 @0 INTEGER_CST@1))
>  (if (TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (@0)))
>   (if (tree_int_cst_sgn (@1) > 0)
>    @0
>    @2)))
>
> What is the type of all those SSA_NAMEs?
https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01352.html
which added the min/max patterns. I forgot to have Naveen mention that I
saw this while looking into loop splitting, which is why I was adding them.
Thanks,
Andrew Pinski
>
> --
> Marc Glisse
On Wed, Jun 7, 2017 at 10:16 AM, James Greenhalgh
wrote:
> On Fri, Dec 30, 2016 at 10:05:26PM -0800, Andrew Pinski wrote:
>> Hi,
>> Currently for the following function:
>> int f(int a, int b)
>> {
>> return a + (b <<7);
>> }
>>
>&
)
] UNSPECV_ATOMIC_CMPSW))
])
"/home/apinski/src/local5/gcc/gcc/testsuite/gcc.target/aarch64/atomic_cmp_exchange_zero_reg_1.c":8
-1
(nil))
during RTL pass: vregs
Note also your new testcase is broken even for defaulting to +lse as
it is not going to ma
Here is the updated patch based on the new infrastructure which is now included.
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions
and tested again on SPEC CPU 2006 on ThunderX T88 with the speed up
mentioned before.
Thanks,
Andrew Pinski
ChangeLog:
* config/aarch64/aarch64
On Tue, Jun 20, 2017 at 6:50 AM, James Greenhalgh
wrote:
>
> Hi,
>
> While GCC doesn't need to know anything about the RcPc extension for code
> generation, we do need to add the extension flag to the string we pass
> to the assembler when we're compiling for a CPU which implements the RcPc
> exte
On Mon, Jun 19, 2017 at 2:00 PM, Andrew Pinski wrote:
> On Wed, Jun 7, 2017 at 10:16 AM, James Greenhalgh
> wrote:
>> On Fri, Dec 30, 2016 at 10:05:26PM -0800, Andrew Pinski wrote:
>>> Hi,
>>> Currently for the following function:
>>> int f(int
On Thu, Jun 22, 2017 at 10:02 AM, Mike Stump wrote:
> On Jun 22, 2017, at 8:32 AM, Jeff Law wrote:
>>
>> Sure. I'll do something with 20031023-1.c to ensure it or an equivalent
>> is compiled with -fstack-check. That isn't totally unexpected. I
>> would have also been receptive to adding -fst
int.
Thanks,
Andrew Pinski
ChangeLog:
* config/aarch64/aarch64-cost-tables.h (thunderx2t99_extra_costs):
Increment Arith_shift, Arith_shift_reg, Log_shift, Log_shift_reg and
Extend_arith by 1.
Index: gcc/config/aarch64/aarch64-cost-tables.h
0 ? -1 : 1) into ABS(X).
Transform X * (X < 0.0 ? -1.0 : 1.0) into ABS(X).
Transform X * (X <= 0.0 ? -1.0 : 1.0) into ABS(X).
The floating-point ones only happen when not honoring sNaNs and not
honoring signed zeros.
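Source-level examples of these patterns (the function names are mine):

int
iabs_like (int x)
{
  return x * (x < 0 ? -1 : 1);        /* folds to ABS_EXPR <x> */
}

double
dabs_like (double x)
{
  /* Only folded when sNaNs and signed zeros are not honored,
     e.g. under -ffast-math.  */
  return x * (x < 0.0 ? -1.0 : 1.0);
}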
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* match.pd ( X * (X >/>=/
Forgot the patch
On Fri, Jun 23, 2017 at 8:59 PM, Andrew Pinski wrote:
> Hi,
> I saw this on llvm's review site (https://reviews.llvm.org/D34579)
> and I thought why not add it to GCC. I expanded more than what was
> done on the LLVM patch.
>
> I added the following opt
On Sat, Jun 24, 2017 at 5:34 AM, Marc Glisse wrote:
> Hello,
>
> I remember wanting to add this when the undefined-overflow case was
> introduced a while ago.
>
> It turns out the tree where I wrote this wasn't clean. Since the rest is
> details, I am including it in this patch, hope it is ok.
Yo
On Fri, Jun 23, 2017 at 11:50 PM, Marc Glisse wrote:
> On Fri, 23 Jun 2017, Andrew Pinski wrote:
>
>> Hi,
>> I saw this on llvm's review site (https://reviews.llvm.org/D34579)
>> and I thought why not add it to GCC. I expanded more than what was
>> done o
On Sat, Jun 24, 2017 at 12:47 PM, Marc Glisse wrote:
> On Sat, 24 Jun 2017, Andrew Pinski wrote:
>
>>> * if X is NaN, we may get a qNaN with the wrong sign bit. We probably
>>> don't
>>> care much though...
>>
>>
>> Ok, I changed it to whe
copysign (1.0, -X) into -abs(X).
Transform copysign (-1.0, X) into copysign (1.0, X).
The last one is there so that if someone decides to write -1.0 instead of
1.0 in the code we still get the optimization.
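An illustrative C equivalent of the last transform (the function name is mine):

double
f (double y)
{
  /* The sign of the first argument of copysign is irrelevant, so a
     literal -1.0 can be canonicalized to 1.0.  */
  return __builtin_copysign (-1.0, y);   /* same value as copysign (1.0, y) */
}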
OK? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
wrote:
> Hi All,
>
> this patch implements an optimization rewriting
>
> x * copysign (1.0, y) and
> x * copysign (-1.0, y)
This reminds me:
copysign(-1.0, y) can be just optimized to:
copysign(1.0, y)
I did that in my patch here:
https://gcc.gnu
On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse wrote:
> +(for cmp (gt ge lt le)
> + outp (convert convert negate negate)
> + outn (negate negate convert convert)
> + /* Transform (X > 0.0 ? 1.0 : -1.0) into copysign(1, X). */
> + /* Transform (X >= 0.0 ? 1.0 : -1.0) into copysign(1, X). */
>
On Sun, Jun 25, 2017 at 12:08 PM, Yuri Gribov wrote:
> Hi all,
>
> Libgcc unwinder currently does not do any verification of pointers
> which it chases on stack. In practice this not so rarely causes
> segfaults when unwinding on corrupted stacks (e.g. when when trying to
> print diagnostic on
> f
On Sun, Jun 25, 2017 at 11:18 AM, Andrew Pinski wrote:
> On Sun, Jun 25, 2017 at 1:28 AM, Marc Glisse wrote:
>> +(for cmp (gt ge lt le)
>> + outp (convert convert negate negate)
>> + outn (negate negate convert convert)
>> + /* Transform (X > 0.0 ?
On Tue, Jun 6, 2017 at 3:56 AM, Renlin Li wrote:
> Hi all,
>
> In this patch, a new integer register operand modifier 'r' is added. This
> will use the
> proper register name according to the mode of the corresponding operand.
>
> 'w' register for scalar integer mode smaller than DImode
> 'x' register
On Sat, Jun 24, 2017 at 4:53 PM, Andrew Pinski wrote:
> On Mon, Jun 12, 2017 at 12:56 AM, Tamar Christina
> wrote:
>> Hi All,
>>
>> this patch implements an optimization rewriting
>>
>> x * copysign (1.0, y) and
>> x * copysign (-1.0, y)
>
>
&g
On Tue, Jun 27, 2017 at 8:27 AM, Renlin Li wrote:
> Hi Andrew,
>
> On 25/06/17 22:38, Andrew Pinski wrote:
>>
>> On Tue, Jun 6, 2017 at 3:56 AM, Renlin Li wrote:
>>>
>>> Hi all,
>>>
>>> In this patch, a new integer register operand modifie
t;> >> + (BUILT_IN_COPYSIGN { build_one_cst (type); } (outp @0)))
>>> >> +(if (types_match (type, long_double_type_node))
>>> >> + (BUILT_IN_COPYSIGNL { build_one_cst (type); } (outp @0))
>>> >>
>>
>>Hi,
>>
>>Out of curiosity is there any reason why this transformation can't be
>>more general?
>>
>>e.g. Transform (X > 0.0 ? CST : -CST) into copysign(CST, X).
>
> That's also possible, yes.
I will be implementing that later today.
Thanks,
Andrew Pinski
>
>>we would at the very least avoid a csel or a branch then.
>>
>>Regards,
>>Tamar
>
e
> Renlin Li
> Bin Cheng
>
> * config/aarch64/aarch64-simd.md (vec_cmp): New pattern.
> (vec_cmp): New pattern.
> (vec_cmpu): New pattern.
> (vcond_mask_): New pattern.
LTGT support is missing and can be generated via __built
? Bootstrapped and tested on aarch64-linux-gnu with no regressions.
Thanks,
Andrew Pinski
ChangeLog:
* tree-if-conv.c (predicate_scalar_phi): Update new_stmt if fold_stmt
returned true.
testsuite/ChangeLog:
* gcc.dg/torture/pr81245.c: New testcase.
Index: testsuite/gcc.dg/torture/pr81245.c
On Fri, Jun 30, 2017 at 1:20 AM, Richard Biener
wrote:
> On Thu, Jun 29, 2017 at 10:12 PM, Andrew Pinski wrote:
>> Hi,
>> As described in the bug, tree-if-conv is calling update_stmt on an
>> old stmt which might have been removed from the IR already
>> (transforming
On Tue, Nov 14, 2017 at 6:00 PM, Luis Machado wrote:
> Disabling software prefetching and switching the autoprefetcher to weak
> improves
> CPU2017 rate and speed benchmarks for both int and fp sets on Falkor.
>
> SPECrate 2017 fp is up 0.38%
> SPECspeed 2017 fp is up 0.54%
> SPECrate 2017 int is