[PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables

2023-10-02 Thread Tamar Christina
Hi All, This is extracted out of the patch series to support early break vectorization in order to simplify the review of that patch series. The goal of this one is to separate out the refactoring from the new functionality. This first patch separates out the vectorizer's definition of an exit t

[PATCH 2/3]middle-end: updated niters analysis to handle multiple exits.

2023-10-02 Thread Tamar Christina
Hi All, This second part updates niters analysis to be able to analyze any number of exits. If we have multiple exits we determine the main exit by finding the first counting IV. The change allows the vectorizer to pass analysis for multiple loops, but we later gracefully reject them. It does h

[PATCH 3/3]middle-end: maintain LCSSA throughout loop peeling

2023-10-02 Thread Tamar Christina
Hi All, This final patch updates peeling to maintain LCSSA all the way through. It's significantly easier to maintain it during peeling while we still know where all new edges connect rather than touching it up later as is currently being done. This allows us to remove many of the helper functio

[PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-02 Thread Tamar Christina
Hi All, I recently committed a patch that uses a nested std::pair in the second argument. It temporarily adds a second ranking variable for sorting and then later drops it. This hits the newly added assert in vec.h. This assert made some relaxation for std::pair but doesn't allow this case thr
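
Editorial sketch (not the code from the patch): the recursive check named in the subject can be illustrated with a small trait that accepts a std::pair whenever both members are themselves acceptable, so nested pairs pass too. The namespace `sketch` and the exact shape of the trait are assumptions made for illustration; the real definition in vec.h may differ.

```cpp
#include <string>
#include <type_traits>
#include <utility>

namespace sketch {

// Base case: defer to the standard trait.
template <typename T>
struct is_trivially_copyable_or_pair : std::is_trivially_copyable<T> {};

// Pair case: recurse into both members, so a pair nested inside a pair
// is accepted as long as every leaf type is trivially copyable.
template <typename T, typename U>
struct is_trivially_copyable_or_pair<std::pair<T, U>>
  : std::integral_constant<bool,
                           is_trivially_copyable_or_pair<T>::value
                           && is_trivially_copyable_or_pair<U>::value> {};

}  // namespace sketch

// A nested pair of scalars passes the recursive check.
static_assert(sketch::is_trivially_copyable_or_pair<
                std::pair<unsigned, std::pair<int, double>>>::value,
              "nested pair of trivially copyable leaves should pass");

// A pair holding a non-trivially-copyable leaf is still rejected.
static_assert(!sketch::is_trivially_copyable_or_pair<
                std::pair<int, std::string>>::value,
              "std::string is not trivially copyable");
```

The recursion matters because std::pair itself is not guaranteed to be trivially copyable, so a non-recursive special case for pairs only covers one level of nesting.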

RE: [PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-02 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Monday, October 2, 2023 2:21 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; jwak...@redhat.com > Subject: Re: [PATCH]middle-end: Recursively check > is_trivially_copyable_or_pair in vec.h > > On

RE: [PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-03 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Monday, October 2, 2023 2:21 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; jwak...@redhat.com > Subject: Re: [PATCH]middle-end: Recursively check > is_trivially_copyable_or_pair in vec.h > > On

RE: [PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-03 Thread Tamar Christina
> -Original Message- > From: Jakub Jelinek > Sent: Tuesday, October 3, 2023 12:02 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; jwak...@redhat.com > Subject: Re: [PATCH]middle-end: Recursively check > is_trivially_copyable_or_pair in vec.h > >

RE: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-04 Thread Tamar Christina
Hi Robin, > -Original Message- > From: Robin Dapp > Sent: Wednesday, October 4, 2023 8:54 AM > To: Tamar Christina ; gcc-patches patc...@gcc.gnu.org>; Richard Biener > Cc: rdapp@gmail.com > Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar

RE: [PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-05 Thread Tamar Christina
> On Tue, Oct 03, 2023 at 11:41:01AM +0000, Tamar Christina wrote: > > > We have stablesort method instead of qsort but that would require > > > consistent ordering in the vector (std::sort doesn't ensure stable > > > sorting either). > > > > >

RE: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar reduction.

2023-10-05 Thread Tamar Christina
Hi Robin, > -Original Message- > From: Robin Dapp > Sent: Thursday, October 5, 2023 3:06 PM > To: Tamar Christina ; gcc-patches patc...@gcc.gnu.org>; Richard Biener > Cc: rdapp@gmail.com > Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar

RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-10-05 Thread Tamar Christina
> > b17e1136600a 100644 > > --- a/gcc/match.pd > > +++ b/gcc/match.pd > > @@ -9476,3 +9476,57 @@ and, > > } > > (if (full_perm_p) > > (vec_perm (op@3 @0 @1) @3 @2)) > > + > > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X). */ > > + > > +(simplify > > + (negate (abs @

RE: [PATCH]middle-end match.pd: optimize fneg (fabs (x)) to x | (1 << signbit(x)) [PR109154]

2023-10-05 Thread Tamar Christina
> I suppose the idea is that -abs(x) might be easier to optimize with other > patterns (consider a - copysign(x,...), optimizing to a + abs(x)). > > For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less > canonical than copysign. > > > Should I try removing this? > > I'd say

RE: [PATCH]AArch64: Use SVE unpredicated LOGICAL expressions when Advanced SIMD inefficient [PR109154]

2023-10-05 Thread Tamar Christina
> >> > >> The WIP SME patches add a %Z modifier for 'z' register prefixes, > >> similarly to b/h/s/d for scalar FP. With that I think the alternative can > >> be: > >> > >> [w , 0 , ; * , sve ] \t%Z0., %Z0., #%2 > >> > >> although it would be nice to keep the hex constant. > > > > My

RE: [PATCH]AArch64 Add special patterns for creating DI scalar and vector constant 1 << 63 [PR109154]

2023-10-05 Thread Tamar Christina
Hi, > The lowpart_subreg should simplify this back into CONST0_RTX (mode), > making it no different from: > > emit_move_insn (target, CONST0_RTX (mode)); > > If the intention is to share zeros between modes (sounds good!), then I think > the subreg needs to be on the lhs instead. > > > +

[PATCH]middle-end ifcvt: Allow any const IFN in conditional blocks

2023-10-05 Thread Tamar Christina
Hi All, When ifcvt was initially added masking was not a thing and as such it was rather conservative in what it supported. For builtins it only allowed C99 builtin functions which it knew it can fold away. These days the vectorizer is able to deal with needing to mask IFNs itself. vectorizable_

[PATCH]AArch64 Handle copysign (x, -1) expansion efficiently

2023-10-05 Thread Tamar Christina
Hi All, copysign (x, -1) is effectively fneg (abs (x)) which on AArch64 can be most efficiently done by doing an OR of the signbit. The middle-end will optimize fneg (abs (x)) now to copysign as the canonical form and so this optimizes the expansion. If the target has an inclusive-OR that takes
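
Editorial sketch: the identity stated above, copysign (x, -1) == fneg (abs (x)) == x with its sign bit OR-ed in, can be checked on the host with plain bit manipulation. The helper names `neg_abs_via_or` and `matches_library_forms` are hypothetical, and the code assumes IEEE 754 binary64 doubles; it does not reflect how the AArch64 expander is written.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559
              && sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes IEEE 754 binary64 doubles");

// Hypothetical helper: compute -fabs(x) by setting the sign bit with an OR.
inline double neg_abs_via_or(double x) {
  std::uint64_t bits;
  std::memcpy(&bits, &x, sizeof bits);   // well-defined bit cast
  bits |= std::uint64_t{1} << 63;        // bit 63 is the sign bit
  std::memcpy(&x, &bits, sizeof x);
  return x;
}

// Hypothetical self-check: the OR form agrees with both library spellings.
// NaN inputs are excluded because NaN never compares equal to anything.
inline bool matches_library_forms(double x) {
  assert(!std::isnan(x));
  const double r = neg_abs_via_or(x);
  return r == -std::fabs(x) && r == std::copysign(x, -1.0) && std::signbit(r);
}
```

For example, `neg_abs_via_or(3.5)` and `neg_abs_via_or(-3.5)` both yield -3.5, and a positive zero input comes back as negative zero.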

[PATCH]middle-end ifcvt: Add support for conditional copysign

2023-10-05 Thread Tamar Christina
Hi All, This adds a masked variant of copysign. Nothing very exciting just the general machinery to define and use a new masked IFN. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Note: This patch is part of a testseries and tests for it are added in the AArch64 patch that adds

[PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-05 Thread Tamar Christina
Hi All, This adds an implementation for masked copysign along with an optimized pattern for masked copysign (x, -1). Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: PR tree-optimization/109154 * config/aarch64/aarch64

RE: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-05 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, October 5, 2023 8:29 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: Re: [PATCH]AArch64 Add SVE implementation for c

RE: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-05 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Thursday, October 5, 2023 9:26 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: Re: [PATCH]AArch64 Add SVE implementation for c

RE: [PATCH]middle-end: Recursively check is_trivially_copyable_or_pair in vec.h

2023-10-05 Thread Tamar Christina
> > On Thu, Oct 05, 2023 at 02:01:40PM +, Tamar Christina wrote: > > gcc/ChangeLog: > > > > * tree-if-conv.cc (INCLUDE_ALGORITHM): Remove. > > (typedef struct ifcvt_arg_entry): New. > > (cmp_arg_entry): New. > > (gen

RE: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Saturday, October 7, 2023 10:58 AM > To: Richard Biener > Cc: Tamar Christina ; gcc-patches@gcc.gnu.org; > nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > > Subject: Re: [PATCH]AArch64

RE: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-09 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Monday, October 9, 2023 10:45 AM > To: Tamar Christina > Cc: Richard Sandiford ; gcc- > patc...@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: Re: [PATCH]AArch64

RE: [PATCH]AArch64 Add SVE implementation for cond_copysign.

2023-10-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, October 9, 2023 10:56 AM > To: Tamar Christina > Cc: Richard Biener ; gcc-patches@gcc.gnu.org; > nd ; Richard Earnshaw ; > Marcus Shawcroft ; Kyrylo Tkachov > > Subject: Re: [PATCH]AArch64

RE: [PATCH 1/3]middle-end: Refactor vectorizer loop conditionals and separate out IV to new variables

2023-10-11 Thread Tamar Christina
> > @@ -2664,7 +2679,7 @@ slpeel_update_phi_nodes_for_loops > (loop_vec_info loop_vinfo, > > for correct vectorization of live stmts. */ > >if (loop == first) > > { > > - basic_block orig_exit = single_exit (second)->dest; > > + basic_block orig_exit = second_loop_e->dest;

RE: [PATCH 2/3]middle-end: updated niters analysis to handle multiple exits.

2023-10-11 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Tuesday, October 10, 2023 12:14 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: Re: [PATCH 2/3]middle-end: updated niters analysis to handle > multiple exits. > >

RE: [PATCH 3/3]middle-end: maintain LCSSA throughout loop peeling

2023-10-11 Thread Tamar Christina
> > + auto loop_exits = get_loop_exit_edges (loop); > > + auto_vec doms; > > + > >if (at_exit) /* Add the loop copy at exit. */ > > { > > - if (scalar_loop != loop) > > + if (scalar_loop != loop && new_exit->dest != exit_dest) > > { > > - gphi_iterator gsi; > > ne

[PATCH 5/6]AArch64: Fix Armv9-a warnings that get emitted whenever a ACLE header is used.

2023-10-12 Thread Tamar Christina
Hi All, At the moment, trying to use -march=armv9-a with any ACLE header such as arm_neon.h results in rows and rows of warnings saying: : warning: "__ARM_ARCH" redefined : note: this is the location of the previous definition This is obviously not useful and happens because the header was defin

RE: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-12 Thread Tamar Christina
as it does today. This will make it possible to still debug these utilities easily as is done today. Cheers, Tamar > -Original Message- > From: Robin Dapp > Sent: Thursday, October 12, 2023 9:45 PM > To: gcc-patches > Cc: rdapp@gmail.com; jeffreyalaw ; Tama

RE: [PATCH] genemit: Split insn-emit.cc into ten files.

2023-10-13 Thread Tamar Christina
> -Original Message- > From: Robin Dapp > Sent: Friday, October 13, 2023 4:15 PM > To: gcc-patches > Cc: rdapp@gmail.com; jeffreyalaw ; Tamar > Christina ; rjie...@linux.alibaba.com > Subject: Re: [PATCH] genemit: Split insn-emit.cc into ten files. > >

[PATCH]middle-end: don't create LC-SSA PHI variables for PHI nodes who dominate loop

2023-10-19 Thread Tamar Christina
Hi All, As the testcase shows, when a PHI node dominates the loop there is no new definition inside the loop. As such there would be no PHI nodes to update. When we maintain LCSSA form we create an intermediate node in between the two loops to thread along the value. However later on when we u

[committed][aarch64][gcc][patch] Fix -Wpedantic issue with testcase.

2018-07-06 Thread Tamar Christina
Hi All, This fixes a -Wpedantic error with the testcase because of extra ; left after the functions. Regtested single test on aarch64-none-elf and no issues. Committed under the GCC obvious rule. Thanks, Tamar gcc/testsuite/ 2018-07-06 Tamar Christina * gcc.target/aarch64

[committed][gcc][patch] Require sse for testcase on i686.

2018-07-06 Thread Tamar Christina
Hi All, This fixes an ABI warning generated on i686-pc-linux-gnu when using `vector_size` with no sse enabled explicitly. Regtested single test on x86_64-pc-linux-gnu with -m32 and no issues. Committed under the GCC obvious rule. Thanks, Tamar gcc/testsuite/ 2018-07-06 Tamar Christina

RE: [committed][aarch64][gcc][patch] Fix -Wpedantic issue with testcase.

2018-07-06 Thread Tamar Christina
-pedantic-errors" > > So looks like a botched test run then. My bad.. > > Ramana > > > > > >> Ramana > >> > > >> > Thanks, > >> > Tamar > >> > > >> > gcc/testsuite/ > >> > 2018-07-06 Tamar Christina > >> > > >> > * gcc.target/aarch64/struct_cpy.c: Remove ;. > >> > > >> > --

RE: [PATCH][GCC][ARM] Fix can_change_mode_class for big-endian

2018-07-06 Thread Tamar Christina
Hi Christoph, > 90d62e699bce9594879be2e3016c9b36c7e064c8..703632240822e762a90657096 > > >> 4b949c783df56f3 100644 > > >> --- a/gcc/config/arm/arm.c > > >> +++ b/gcc/config/arm/arm.c > > >> @@ -31508,8 +31508,8 @@ arm_can_change_mode_class > (machine_mode from, machine_mode to, > > >>{ > > >>

Re: [committed][gcc][patch] Require sse for testcase on i686.

2018-07-07 Thread Tamar Christina
Hi Jakub, > > On Fri, Jul 06, 2018 at 11:46:43AM +0100, Tamar Christina wrote: > > This fixes an ABI warning generated on i686-pc-linux-gnu when using > > `vector_size` with no sse enabled explicitly. > > > > Regtested single test on x86_64-pc-linux-gnu with -m32 an

[PATCH][GCC][AArch64][mid-end] Updated stack-clash implementation for AArch64. [patch (0/6)]

2018-07-11 Thread Tamar Christina
Hi All, The patch series will allow AArch64 to use 64k guard sizes correctly and improves the code quality. It also enables a reduction of the overhead in code size over the current GCC 8 implementation. Using 64k guard sizes results in a reduction in overhead compared to the 4k guard size. Th

[PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-clash is on [Patch (6/6)]

2018-07-11 Thread Tamar Christina
on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/testsuite/ 2018-07-11 Tamar Christina PR target/86486 gcc.dg/pr82788.c: Skip for AArch64. gcc.dg/guality/vla-

[PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-11 Thread Tamar Christina
. Target was tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Jeff Law Richard Sandiford Tamar Christina PR target/86486 * config/aarch64/aarch64.md (cmp, probe_stack_range): Add k (SP) constraint

[PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-11 Thread Tamar Christina
the user at configure time. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina PR target/86486 * configure.ac: Add

[PATCH][GCC][mid-end] Add a hook to support telling the mid-end when to probe the stack [patch (2/6)]

2018-07-11 Thread Tamar Christina
r needed. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina PR target/86486 * explow.c (anti_adjust_stack_and_probe_sta

[PATCH][GCC][front-end][opt-framework] Update options framework for parameters to properly handle and validate configure time params. [Patch (2/3)]

2018-07-11 Thread Tamar Christina
ithin the valid range. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina * params.h (struct param_info): Add config

[PATCH][GCC][AArch64] Validate and set default parameters for stack-clash. [Patch (3/3)]

2018-07-11 Thread Tamar Christina
. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Target was tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina * common/config/aarch64/aarch64-common.c (TARGET_OPTION_DEFAULT_PARAM

[PATCH][GCC][front-end][opt-framework] Allow back-ends to be able to do custom validations on params. [Patch (1/3)]

2018-07-11 Thread Tamar Christina
apped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina * params.c (set_param_value): Add index of parameter being vali

[PATCH][GCC][AArch64] Ensure that outgoing argument size is at least 8 bytes when alloca and stack-clash. [Patch (3/6)]

2018-07-11 Thread Tamar Christina
for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar Christina PR target/86486 * config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS, STACK_DYNAMIC_OFFSET): New. * config/aarch64/aarch64.c (aarch64_layout_frame): Update outgoing

[PATCH][GCC][AArch64] Set default values for stack-clash and do basic validation in back-end. [Patch (5/6)]

2018-07-11 Thread Tamar Christina
size is only 4KB or 64KB and also enforces that for aarch64 the stack-clash probing interval is equal to the guard size. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Target was tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-11 Tamar
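
Editorial sketch: the validation described in this snippet (guard size restricted to 4KB or 64KB, probing interval equal to the guard size) amounts to a two-part predicate. The function name `stack_clash_params_ok` and the use of byte counts as inputs are assumptions for illustration; the back-end may well store these parameters in a different form.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGuard4K = 4 * 1024;
constexpr std::uint64_t kGuard64K = 64 * 1024;

// Hypothetical predicate mirroring the two rules in the snippet:
//   1. the guard size is either 4KB or 64KB;
//   2. the probing interval equals the guard size.
inline bool stack_clash_params_ok(std::uint64_t guard_bytes,
                                  std::uint64_t probe_interval_bytes) {
  const bool guard_ok = guard_bytes == kGuard4K || guard_bytes == kGuard64K;
  return guard_ok && probe_interval_bytes == guard_bytes;
}

// Hypothetical usage: one accepted and two rejected combinations.
inline void check_examples() {
  assert(stack_clash_params_ok(kGuard64K, kGuard64K));
  assert(!stack_clash_params_ok(kGuard64K, kGuard4K));  // interval != guard
  assert(!stack_clash_params_ok(8 * 1024, 8 * 1024));   // unsupported guard
}
```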

Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-12 Thread Tamar Christina
Hi Jeff, Thanks for the review! The 07/11/2018 18:30, Jeff Law wrote: > On 07/11/2018 05:20 AM, Tamar Christina wrote: > > Hi All, > > > > This patch implements the use of the stack clash mitigation for aarch64. > > In Aarch64 we expect both the probing interval an

Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-12 Thread Tamar Christina
Hi Jeff, The 07/11/2018 20:21, Jeff Law wrote: > On 07/11/2018 05:22 AM, Tamar Christina wrote: > > Hi All, > > > > This patch defines a configure option to allow the setting of the default > > guard size via configure flags when building the target. > >

RE: [PATCH][GCC][AArch64] Ensure that outgoing argument size is at least 8 bytes when alloca and stack-clash. [Patch (3/6)]

2018-07-13 Thread Tamar Christina
Hi All, I'm sending an updated patch which updates a testcase that hits one of our corner cases. This is an assembler scan only update in a testcase. Regards, Tamar > -Original Message- > From: Tamar Christina > Sent: Wednesday, July 11, 2018 12:21 > To: gcc-patches

RE: [PATCH][GCC][AArch64] Validate and set default parameters for stack-clash. [Patch (3/3)]

2018-07-13 Thread Tamar Christina
Hi All, I am sending an updated patch which takes into account a case where the set parameter value would not be safe to call. No change in the cover letter. Regards, Tamar > -Original Message- > From: Tamar Christina > Sent: Wednesday, July 11, 2018 12:25 > To:

RE: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-13 Thread Tamar Christina
Hi All, I'm sending an updated patch which makes sure unwind tables are disabled always for tests that do sequence checks so they pass in all configurations. There's no change to the cover letter or implementation. Regards, Tamar > -Original Message- > From: Tamar C

Re: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-16 Thread Tamar Christina
The 07/13/2018 17:46, Jeff Law wrote: > On 07/12/2018 11:39 AM, Tamar Christina wrote: > >>> + > >>> + /* Round size to the nearest multiple of guard_size, and calculate the > >>> + residual as the difference between the origina
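
Editorial sketch: the quoted comment describes splitting an allocation into a part that is a multiple of the guard size plus a residual. The arithmetic is shown below under the assumption that the rounding is downwards, so that the residual is never negative; the names `ProbeSplit` and `split_allocation` are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

struct ProbeSplit {
  std::uint64_t rounded_size;  // largest multiple of guard_size <= size
  std::uint64_t residual;      // size - rounded_size, always < guard_size
};

// Hypothetical helper: split `size` into whole guard-sized chunks and
// the remainder that is left over.
inline ProbeSplit split_allocation(std::uint64_t size,
                                   std::uint64_t guard_size) {
  assert(guard_size != 0);
  const std::uint64_t rounded = size - size % guard_size;
  return {rounded, size - rounded};
}
```

With a 64KB guard, a 150000-byte allocation splits into 131072 bytes (two guard-sized chunks) and a residual of 18928 bytes.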

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-19 Thread Tamar Christina
Hi Jeff, > -Original Message- > From: Tamar Christina > Sent: Thursday, July 12, 2018 18:45 > To: Jeff Law > Cc: gcc-patches@gcc.gnu.org; nd ; > jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com; > nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-20 Thread Tamar Christina
Hi Jeff, > -Original Message- > From: Jeff Law > Sent: Thursday, July 19, 2018 18:32 > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; > jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com; > nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-20 Thread Tamar Christina
> > On 07/20/2018 05:03 AM, Tamar Christina wrote: > >> Understood. Thanks for verifying. I wonder if we could just bury > >> this entirely in the aarch64 config files and not expose the default into > params.def? > >> > > > > Burying it in con

[PATCH][GCC][Arm] Cleanup up reg to reg move in neon_mov.

2018-07-23 Thread Tamar Christina
can not handle this and simply returns "" and asserts. This pattern is essentially dead and I'm removing it for clarity. Regtested on armeb-none-eabi and no regressions. Bootstrapped on arm-none-linux-gnueabihf and no issues. Ok for trunk? Thanks, Tamar gcc/ 2018-07-23

[PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-23 Thread Tamar Christina
cution test Regtested on armeb-none-eabi and no regressions. Bootstrapped on arm-none-linux-gnueabihf and no issues. Ok for trunk? Thanks, Tamar gcc/ 2018-07-23 Tamar Christina PR target/84711 * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg. * conf

[PATCH][GCC][mid-end] Allow larger copies when not slow_unaligned_access and no padding.

2018-07-23 Thread Tamar Christina
and found no issues. OK for trunk? Thanks, Tamar gcc/ 2018-07-23 Tamar Christina * expr.c (copy_blkmode_to_reg): Perform larger copies when safe. -- diff --git a/gcc/expr.c b/gcc/expr.c index f665e187ebbbc7874ec88e84ca47ed991491c3e5..17b580aabf761491d8003ac74daa014bc252ea9f 100644

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-24 Thread tamar . christina
explicitly is they want to support this configure flag and values that users may have set. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-24 Tamar Christina

RE: [PATCH][GCC][AArch64] Set default values for stack-clash and do basic validation in back-end. [Patch (5/6)]

2018-07-24 Thread tamar . christina
tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-24 Tamar Christina PR target/86486 * config/aarch64/aarch64.c (aarch64_override_options_internal): Add validation for stack-clash parameters and set defaults. > -Original Mess

RE: [PATCH][GCC][AArch64] Cleanup the AArch64 testsuite when stack-clash is on [Patch (6/6)]

2018-07-24 Thread tamar . christina
on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/testsuite/ 2018-07-24 Tamar Christina PR target/86486 * gcc.dg/pr82788.c: Skip for AArch64. * gcc.dg/guality/vla-

RE: [PATCH][GCC][front-end][opt-framework] Update options framework for parameters to properly handle and validate configure time params. [Patch (2/3)]

2018-07-24 Thread tamar . christina
still keep the parameters within the valid range. Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no issues. Both targets were tested with stack clash on and off by default. Ok for trunk? Thanks, Tamar gcc/ 2018-07-24 Tamar Christina * params.c

Re: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-24 Thread Tamar Christina
Hi All, Here's an updated patch with documentation. Ok for trunk? Thanks, Tamar gcc/ 2018-07-24 Tamar Christina PR target/86486 * configure.ac: Add stack-clash-protection-guard-size. * doc/install.texi: Document it. * config.in (DEFAULT_STK_CLASH_GUARD

Re: [PATCH][GCC][mid-end] Allow larger copies when not slow_unaligned_access and no padding.

2018-07-24 Thread Tamar Christina
Hi Richard, Thanks for the review! The 07/23/2018 18:46, Richard Biener wrote: > On July 23, 2018 7:01:23 PM GMT+02:00, Tamar Christina > wrote: > >Hi All, > > > >This allows copy_blkmode_to_reg to perform larger copies when it is > >safe to do so by calculatin

RE: [PATCH][GCC][AArch64] Updated stack-clash implementation supporting 64k probes. [patch (1/6)]

2018-07-25 Thread Tamar Christina
Hi All, Attached is an updated patch that clarifies some of the comments in the patch and adds comments to the individual testcases as requested. Ok for trunk? Thanks, Tamar gcc/ 2018-07-25 Jeff Law Richard Sandiford Tamar Christina PR target/86486

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-25 Thread Tamar Christina
Hi Alexandre, Thanks for the review. Attached is the updated patch and new changelog below: Thanks, Tamar gcc/ 2018-07-25 Tamar Christina PR target/86486 * configure.ac: Add stack-clash-protection-guard-size. * doc/install.texi: Document it. * config.in

RE: [PATCH][GCC][AArch64] Ensure that outgoing argument size is at least 8 bytes when alloca and stack-clash. [Patch (3/6)]

2018-07-25 Thread Tamar Christina
Hi All, Attached an updated patch which documents what the test cases are expecting as requested. Ok for trunk? Thanks, Tamar gcc/ 2018-07-25 Tamar Christina PR target/86486 * config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS, STACK_DYNAMIC_OFFSET): New

RE: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-25 Thread Tamar Christina
> Regtested on armeb-none-eabi and no regressions. > > Bootstrapped on arm-none-linux-gnueabihf and no issues. > > > > > > Ok for trunk? > > > > Thanks, > > Tamar > > > > gcc/ > > 2018-07-23 Tamar Christina > > > > PR target/84711 > > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg. > > * config/arm/neon.md (movv4hf, movv8hf): Refactored to.. > > (mov): ..this and enable unconditionally. > > > > --

RE: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-26 Thread Tamar Christina
Hi Thomas, > -Original Message- > From: Thomas Preudhomme > Sent: Thursday, July 26, 2018 09:29 > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov > > Subject: Re: [PATCH]

RE: [PATCH][GCC][front-end][build-machinery][opt-framework] Allow setting of stack-clash via configure options. [Patch (4/6)]

2018-07-26 Thread Tamar Christina
Hi Alexandre, > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org > On Behalf Of Alexandre Oliva > Sent: Thursday, July 26, 2018 08:46 > To: Tamar Christina > Cc: Joseph Myers ; Jeff Law > ; gcc-patches@gcc.gnu.org; nd ; > bonz...@gnu.org; d...@redhat.

RE: [PATCH][GCC][Arm] Cleanup up reg to reg move in neon_mov.

2018-07-31 Thread Tamar Christina
Ping 😊 > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org > On Behalf Of Tamar Christina > Sent: Monday, July 23, 2018 17:52 > To: gcc-patches@gcc.gnu.org > Cc: nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov >

RE: [PATCH][GCC][Arm] Fix subreg crash in different way by enabling the FP16 pattern unconditionally.

2018-07-31 Thread Tamar Christina
Ping 😊 > -Original Message- > From: Thomas Preudhomme > Sent: Thursday, July 26, 2018 12:29 > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan > ; Richard Earnshaw > ; ni...@redhat.com; Kyrylo Tkachov > > Subject: Re: [PATCH][GC

RE: [PATCH][GCC][mid-end] Allow larger copies when not slow_unaligned_access and no padding.

2018-07-31 Thread Tamar Christina
Ping 😊 > -Original Message- > From: gcc-patches-ow...@gcc.gnu.org > On Behalf Of Tamar Christina > Sent: Tuesday, July 24, 2018 17:34 > To: Richard Biener > Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com; > i...@airs.com; amo...@gmail.com; berg...@vnet.ibm.com

Re: [PATCH][GCC][mid-end] Allow larger copies when not slow_unaligned_access and no padding.

2018-07-31 Thread Tamar Christina
Hi Richard, The 07/31/2018 11:21, Richard Biener wrote: > On Tue, 31 Jul 2018, Tamar Christina wrote: > > > Ping 😊 > > > > > -Original Message- > > > From: gcc-patches-ow...@gcc.gnu.org > > > On Behalf Of Tamar Christina > > >

[PATCH]middle-end: don't pass loop_vinfo to vect_set_loop_condition during prolog peeling [PR111866]

2023-10-20 Thread Tamar Christina
Hi All, During the refactoring I had passed loop_vinfo on to vect_set_loop_condition during prolog peeling. This parameter is unused in most cases except for in vect_set_loop_condition_partial_vectors where its behaviour depends on whether loop_vinfo is NULL or not. Apparently this code expect

[PATCH] middle-end: don't keep .MEM guard nodes for PHI nodes who dominate loop [PR111860]

2023-10-20 Thread Tamar Christina
Hi All, The previous patch tried to remove PHI nodes that dominated the first loop, however the correct fix is to only remove .MEM nodes. This patch thus makes the condition a bit stricter and only tries to remove MEM phi nodes. I couldn't figure out a way to easily determine if a particular PHI

RE: [PATCH 12/19]middle-end: implement loop peeling and IV updates for early break.

2023-10-23 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, July 14, 2023 2:35 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com > Subject: RE: [PATCH 12/19]middle-end: implement loop peeling and IV > updates for early break. >

[PATCH v6 0/21]middle-end: Support early break/return auto-vectorization

2023-11-05 Thread Tamar Christina
Hi All, This patch adds initial support for early break vectorization in GCC. The support is added for any target that implements a vector cbranch optab, this includes both fully masked and non-masked targets. Depending on the operation, the vectorizer may also require support for boolean mask re

[PATCH 1/21]middle-end testsuite: Add more pragma novector to new tests

2023-11-05 Thread Tamar Christina
Hi All, This adds pragma GCC novector to testcases that have shown up since the last regression run and due to this series detecting more. Is it ok that when it comes time to commit I can just update any new cases before committing? Since this seems a cat and mouse game.. Bootstrapped Regtested on

[PATCH 3/21]middle-end: Implement code motion and dependency analysis for early breaks

2023-11-05 Thread Tamar Christina
Hi All, When performing early break vectorization we need to be sure that the vector operations are safe to perform. A simple example is e.g. for (int i = 0; i < N; i++) { vect_b[i] = x + i; if (vect_a[i]*2 != x) break; vect_a[i] = x; } where the store to vect_b is not allowed
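
Editorial sketch: the loop quoted in this snippet can be run as ordinary scalar code to see which stores have happened by the time the break fires, which is the behaviour any vectorized version has to preserve. The wrapper `early_break_scalar`, the result struct, and the sample input are assumptions for illustration; only the loop body is taken from the snippet.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct EarlyBreakResult {
  std::vector<int> vect_a;
  std::vector<int> vect_b;
  std::size_t stopped_at;  // index where the break fired, or N if it never did
};

// Scalar reference for the loop in the snippet. vect_b starts zeroed so
// that untouched elements are easy to spot afterwards.
inline EarlyBreakResult early_break_scalar(std::vector<int> vect_a, int x) {
  std::vector<int> vect_b(vect_a.size(), 0);
  std::size_t i = 0;
  for (; i < vect_a.size(); i++) {
    vect_b[i] = x + static_cast<int>(i);  // store before the exit test
    if (vect_a[i] * 2 != x)
      break;                              // early exit
    vect_a[i] = x;                        // store after the exit test
  }
  return {vect_a, vect_b, i};
}

// Hypothetical usage: the break fires at index 2.
inline void check_example() {
  const EarlyBreakResult r = early_break_scalar({2, 2, 5, 2}, 4);
  assert(r.stopped_at == 2);
  assert(r.vect_b == std::vector<int>({4, 5, 6, 0}));  // b[3] never written
  assert(r.vect_a == std::vector<int>({4, 4, 5, 2}));  // a[2], a[3] untouched
}
```

In the breaking iteration the store to vect_b has already happened while the store to vect_a has not, and no element past the breaking index is touched at all.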

[PATCH 4/21]middle-end: update loop peeling code to maintain LCSSA form for early breaks

2023-11-05 Thread Tamar Christina
Hi All, This splits the part of the function that does peeling for loops at exits to a different function. In this new function we also peel for early breaks. Peeling for early breaks works by redirecting all early break exits to a single "early break" block and combine them and the normal exit

[PATCH 6/21]middle-end: support multiple exits in loop versioning

2023-11-05 Thread Tamar Christina
Hi All, This has loop versioning use the vectorizer's IV exit edge when it's available since single_exit (..) fails with multiple exits. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * tree-vect-loop-manip.cc (vect_loop_ver

[PATCH 5/21]middle-end: update vectorizer's control update to support picking an exit other than loop latch

2023-11-05 Thread Tamar Christina
Hi All, As requested, the vectorizer is now free to pick its own exit which can be different than what the loop CFG infrastructure uses. The vectorizer makes use of this to vectorize loops that it previously could not. But this means that loop control must be materialized in the block that needs

[PATCH 7/21]middle-end: update IV update code to support early breaks and arbitrary exits

2023-11-05 Thread Tamar Christina
Hi All, This changes the PHI node updates to support early breaks. It has to support both the case where the loop's exit matches the normal loop exit and one where the early exit is "inverted", i.e. it's an early exit edge. In the latter case we must always restart the loop for VF iterations. Fo

[PATCH 8/21]middle-end: update vectorizable_live_reduction with support for multiple exits and different exits

2023-11-05 Thread Tamar Christina
Hi All, This adds support to vectorizable_live_reduction to handle multiple exits by doing a search for which exit the live value should be materialized in. Additionally which value in the index we're after depends on whether the exit it's materialized in is an early exit or whether the loop's mai

[PATCH 10/21]middle-end: implement relevancy analysis support for control flow

2023-11-05 Thread Tamar Christina
Hi All, This updates relevancy analysis to support marking gcond's belonging to early breaks as relevant for vectorization. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * tree-vect-stmts.cc (vect_stmt_relevant_p, v

[PATCH 9/21]middle-end: implement vectorizable_early_exit for codegen of exit code

2023-11-05 Thread Tamar Christina
Hi All, This implements vectorizable_early_exit which is used as the codegen part of vectorizing a gcond. For the most part it shares the majority of the code with vectorizable_comparison with the addition that it needs to be able to reduce multiple resulting statements into a single one for use in the

[PATCH 11/21]middle-end: wire through peeling changes and dominator updates after guard edge split

2023-11-05 Thread Tamar Christina
Hi All, This wires through the final bits to support adding the guard block between the loop and epilog. For an "inverted loop", i.e. one where an early exit was chosen as the main exit, we can never skip the scalar loop since we know we still have side effects to perform. For those cases we

[PATCH 12/21]middle-end: Add remaining changes to peeling and vectorizer to support early breaks

2023-11-05 Thread Tamar Christina
Hi All, This finishes wiring that didn't fit in any of the other patches. Essentially just adding related changes so peeling for early break works. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * tree-vect-loop-manip.cc (ve

[PATCH 13/21]middle-end: Update loop form analysis to support early break

2023-11-05 Thread Tamar Christina
Hi All, This sets LOOP_VINFO_EARLY_BREAKS and does some misc changes so the other patches are self contained. Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. Ok for master? Thanks, Tamar gcc/ChangeLog: * tree-vect-loop.cc (vect_analyze_loop_form): Analyse all exits.

[PATCH 14/21]middle-end: Change loop analysis from looking at number of BB to actual cfg

2023-11-05 Thread Tamar Christina
Hi All, The vectorizer at the moment uses a num_bb check to check for control flow. This rejects a number of loops for no reason. This patch changes it to check the destination of the exits instead. This also allows early break to work by also dropping the single_exit check. Bootstrapp

[PATCH 15/21]middle-end: [RFC] conditionally support forcing final edge for debugging

2023-11-05 Thread Tamar Christina
Hi All, What do people think about having the ability to force only the latch connected exit as the exit as a param? I.e. what's in the patch but as a param. I found this useful when debugging large example failures as it tells me where I should be looking. No hard requirement but just figured I

[PATCH 16/21]middle-end testsuite: un-xfail TSVC loops that check for exit control flow vectorization

2023-11-05 Thread Tamar Christina
Hi All, I didn't want these to get lost in the noise of updates. The following three tests now correctly work for targets that have an implementation of cbranch for vectors so XFAILs are conditionally removed gated on vect_early_break support. Bootstrapped Regtested on aarch64-none-linux-gnu and

[PATCH 17/21]AArch64: Add implementation for vector cbranch for Advanced SIMD

2023-11-05 Thread Tamar Christina
Hi All, This adds an implementation for conditional branch optab for AArch64. For e.g. void f1 () { for (int i = 0; i < N; i++) { b[i] += a[i]; if (a[i] > 0) break; } } For 128-bit vectors we generate: cmgt v1.4s, v1.4s, #0 umaxp v1.4s, v1.4s,

[PATCH 18/21]AArch64: Add optimization for vector != cbranch fed into compare with 0 for Advanced SIMD

2023-11-05 Thread Tamar Christina
Hi All, Advanced SIMD lacks a cmpeq for vectors, and unlike compare to 0 we can't rewrite to a cmtst. This operation is however fairly common, especially now that we support early break vectorization. As such this adds a pattern to recognize the negated any comparison and transform it to an all.

[PATCH 21/21]Arm: Add MVE cbranch implementation

2023-11-05 Thread Tamar Christina
Hi All, This adds an implementation for conditional branch optab for MVE. Unfortunately MVE has rather limited operations on VPT.P0, we are missing the ability to do P0 comparisons and logical OR on P0. For that reason we can only support cbranch with 0, as for comparing to a 0 predicate we don'

[PATCH 19/21]AArch64: Add optimization for vector cbranch combining SVE and Advanced SIMD

2023-11-05 Thread Tamar Christina
Hi All, Advanced SIMD lacks flag setting vector comparisons which SVE adds. Since machines with SVE also support Advanced SIMD we can use the SVE comparisons to perform the operation in cases where SVE codegen is allowed, but the vectorizer has decided to generate Advanced SIMD because of loop

[PATCH 20/21]Arm: Add Advanced SIMD cbranch implementation

2023-11-05 Thread Tamar Christina
Hi All, This adds an implementation for conditional branch optab for AArch32. For e.g. void f1 () { for (int i = 0; i < N; i++) { b[i] += a[i]; if (a[i] > 0) break; } } For 128-bit vectors we generate: vcgt.s32 q8, q9, #0 vpmax.u32 d7,

[PATCH v3 1/2]middle-end: expand copysign handling from lockstep to nested iters

2023-11-06 Thread Tamar Christina
Hi All, Various optimizations in match.pd only happened on COPYSIGN in lock step which means they exclude IFN_COPYSIGN. COPYSIGN however is restricted to only the C99 builtins and so doesn't work for vectors. The patch expands these optimizations to work as nested iters. This is needed for the

[PATCH v3 2/2]middle-end match.pd: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154]

2023-11-06 Thread Tamar Christina
Hi All, This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more canonical and allows a target to expand this sequence efficiently. Such sequences are common in scientific code working with gradients. There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x)) w
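The equivalence this patch relies on can be checked at the source level with the standard C math functions. The sketch below is not the match.pd pattern itself. The helper names `neg_abs` and `neg_abs_copysign` are made up for illustration, and it only covers scalar `double`.

```c
#include <math.h>

/* Form the patch rewrites from: negate the absolute value. */
double neg_abs(double x)
{
    return -fabs(x);
}

/* Form the patch rewrites to: the magnitude of x with the sign of -1. */
double neg_abs_copysign(double x)
{
    return copysign(x, -1.0);
}
```

Both helpers return the magnitude of x with the sign bit set. That includes zero inputs, where both produce negative zero.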
