Hi All,
This is extracted out of the patch series to support early break vectorization
in order to simplify the review of that patch series.
The goal of this one is to separate out the refactoring from the new
functionality.
This first patch separates out the vectorizer's definition of an exit t
Hi All,
This second part updates niters analysis to be able to analyze any number of
exits. If we have multiple exits we determine the main exit by finding the
first counting IV.
The change allows the vectorizer to pass analysis for multiple loops, but we
later gracefully reject them. It does h
Hi All,
This final patch updates peeling to maintain LCSSA all the way through.
It's significantly easier to maintain it during peeling while we still know
where all new edges connect rather than touching it up later as is currently
being done.
This allows us to remove many of the helper functio
Hi All,
I recently committed a patch that uses a nested std::pair in the second
argument.
It temporarily adds a second ranking variable for sorting and then later drops
it.
This hits the newly added assert in vec.h. This assert made some relaxation for
std::pair but doesn't allow this case thr
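As a rough illustration of the recursive check named in the subject of the replies below, a pair-aware trait can recurse into both members of a std::pair, so that a pair nested inside a pair is accepted as long as every leaf type is trivially copyable. The names `trivially_copyable_or_pair` and `non_trivial` are hypothetical; this is a sketch of the idea under those assumptions, not the definition in GCC's vec.h.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical trait (not the vec.h definition): the base case defers to
// the standard trivially-copyable check.
template <typename T>
struct trivially_copyable_or_pair : std::is_trivially_copyable<T> {};

// Pair case: accept the pair when both members pass the same check.
// Because the check is applied recursively, nested pairs are handled too.
template <typename T, typename U>
struct trivially_copyable_or_pair<std::pair<T, U>>
  : std::integral_constant<bool,
                           trivially_copyable_or_pair<T>::value
                           && trivially_copyable_or_pair<U>::value> {};

// A type with a user-provided copy constructor is not trivially copyable.
struct non_trivial
{
  non_trivial (const non_trivial &) {}
};

// A nested pair in the second member is accepted.
static_assert (trivially_copyable_or_pair<
                 std::pair<int, std::pair<int, double>>>::value,
               "nested pair of trivial types should be accepted");

// A pair with a non-trivial leaf is still rejected.
static_assert (!trivially_copyable_or_pair<
                 std::pair<int, non_trivial>>::value,
               "pair with a non-trivial member should be rejected");
```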
> -Original Message-
> From: Jakub Jelinek
> Sent: Monday, October 2, 2023 2:21 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; jwak...@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
>
> On
> -Original Message-
> From: Jakub Jelinek
> Sent: Tuesday, October 3, 2023 12:02 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; jwak...@redhat.com
> Subject: Re: [PATCH]middle-end: Recursively check
> is_trivially_copyable_or_pair in vec.h
>
>
Hi Robin,
> -Original Message-
> From: Robin Dapp
> Sent: Wednesday, October 4, 2023 8:54 AM
> To: Tamar Christina ; gcc-patches patc...@gcc.gnu.org>; Richard Biener
> Cc: rdapp@gmail.com
> Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar
> On Tue, Oct 03, 2023 at 11:41:01AM +0000, Tamar Christina wrote:
> > > We have stablesort method instead of qsort but that would require
> > > consistent ordering in the vector (std::sort doesn't ensure stable
> > > sorting either).
> > >
> >
Hi Robin,
> -Original Message-
> From: Robin Dapp
> Sent: Thursday, October 5, 2023 3:06 PM
> To: Tamar Christina ; gcc-patches patc...@gcc.gnu.org>; Richard Biener
> Cc: rdapp@gmail.com
> Subject: Re: [PATCH] ifcvt/vect: Emit COND_ADD for conditional scalar
> > b17e1136600a 100644
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -9476,3 +9476,57 @@ and,
> > }
> > (if (full_perm_p)
> > (vec_perm (op@3 @0 @1) @3 @2))
> > +
> > +/* Transform fneg (fabs (X)) -> X | 1 << signbit (X). */
> > +
> > +(simplify
> > + (negate (abs @
> I suppose the idea is that -abs(x) might be easier to optimize with other
> patterns (consider a - copysign(x,...), optimizing to a + abs(x)).
>
> For abs vs copysign it's a canonicalization, but (negate (abs @0)) is less
> canonical than copysign.
>
> > Should I try removing this?
>
> I'd say
> >>
> >> The WIP SME patches add a %Z modifier for 'z' register prefixes,
> >> similarly to b/h/s/d for scalar FP. With that I think the alternative can
> >> be:
> >>
> >> [w , 0 , ; * , sve ] \t%Z0., %Z0., #%2
> >>
> >> although it would be nice to keep the hex constant.
> >
> > My
Hi,
> The lowpart_subreg should simplify this back into CONST0_RTX (mode),
> making it no different from:
>
> emit_move_insn (target, CONST0_RTX (mode));
>
> If the intention is to share zeros between modes (sounds good!), then I think
> the subreg needs to be on the lhs instead.
>
> > +
Hi All,
When ifcvt was initially added masking was not a thing and as such it was
rather conservative in what it supported.
For builtins it only allowed C99 builtin functions which it knew it can fold
away.
These days the vectorizer is able to deal with needing to mask IFNs itself.
vectorizable_
Hi All,
copysign (x, -1) is effectively fneg (abs (x)) which on AArch64 can be
most efficiently done by doing an OR of the signbit.
The middle-end will optimize fneg (abs (x)) now to copysign as the
canonical form and so this optimizes the expansion.
If the target has an inclusive-OR that takes
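The equivalence described above can be checked in scalar code. The sketch below assumes IEEE 754 binary32 floats (sign in bit 31); the helper names `or_signbit`, `forms_agree` and `check_forms` are made up for illustration and say nothing about how the AArch64 expander is written.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Set the sign bit of a float through its integer representation.
// Assumes IEEE 754 binary32, where bit 31 holds the sign.
inline float or_signbit (float x)
{
  std::uint32_t bits;
  std::memcpy (&bits, &x, sizeof bits);
  bits |= std::uint32_t (1) << 31;
  std::memcpy (&x, &bits, sizeof x);
  return x;
}

// True when copysign (x, -1), -fabs (x) and the OR of the sign bit
// produce the same bit pattern for x.
inline bool forms_agree (float x)
{
  float a = std::copysign (x, -1.0f);
  float b = -std::fabs (x);
  float c = or_signbit (x);
  return std::memcmp (&a, &b, sizeof a) == 0
         && std::memcmp (&a, &c, sizeof a) == 0;
}

// Spot checks on a few ordinary values, including both zeros.
inline void check_forms ()
{
  assert (forms_agree (1.5f));
  assert (forms_agree (-2.0f));
  assert (forms_agree (0.0f));
  assert (forms_agree (-0.0f));
}
```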
Hi All,
This adds a masked variant of copysign. Nothing very exciting just the
general machinery to define and use a new masked IFN.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Note: This patch is part of a testseries and tests for it are added in the
AArch64 patch that adds
Hi All,
This adds an implementation for masked copysign along with an optimized
pattern for masked copysign (x, -1).
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
PR tree-optimization/109154
* config/aarch64/aarch64
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, October 5, 2023 8:29 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov
> Subject: Re: [PATCH]AArch64 Add SVE implementation for c
> -Original Message-
> From: Richard Sandiford
> Sent: Thursday, October 5, 2023 9:26 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov
> Subject: Re: [PATCH]AArch64 Add SVE implementation for c
>
> On Thu, Oct 05, 2023 at 02:01:40PM +, Tamar Christina wrote:
> > gcc/ChangeLog:
> >
> > * tree-if-conv.cc (INCLUDE_ALGORITHM): Remove.
> > (typedef struct ifcvt_arg_entry): New.
> > (cmp_arg_entry): New.
> > (gen
> -Original Message-
> From: Richard Sandiford
> Sent: Saturday, October 7, 2023 10:58 AM
> To: Richard Biener
> Cc: Tamar Christina ; gcc-patches@gcc.gnu.org;
> nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
>
> Subject: Re: [PATCH]AArch64
> -Original Message-
> From: Richard Biener
> Sent: Monday, October 9, 2023 10:45 AM
> To: Tamar Christina
> Cc: Richard Sandiford ; gcc-
> patc...@gcc.gnu.org; nd ; Richard Earnshaw
> ; Marcus Shawcroft
> ; Kyrylo Tkachov
> Subject: Re: [PATCH]AArch64
> -Original Message-
> From: Richard Sandiford
> Sent: Monday, October 9, 2023 10:56 AM
> To: Tamar Christina
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org;
> nd ; Richard Earnshaw ;
> Marcus Shawcroft ; Kyrylo Tkachov
>
> Subject: Re: [PATCH]AArch64
> > @@ -2664,7 +2679,7 @@ slpeel_update_phi_nodes_for_loops
> (loop_vec_info loop_vinfo,
> > for correct vectorization of live stmts. */
> >if (loop == first)
> > {
> > - basic_block orig_exit = single_exit (second)->dest;
> > + basic_block orig_exit = second_loop_e->dest;
> -Original Message-
> From: Richard Biener
> Sent: Tuesday, October 10, 2023 12:14 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: Re: [PATCH 2/3]middle-end: updated niters analysis to handle
> multiple exits.
>
>
> > + auto loop_exits = get_loop_exit_edges (loop);
> > + auto_vec doms;
> > +
> >if (at_exit) /* Add the loop copy at exit. */
> > {
> > - if (scalar_loop != loop)
> > + if (scalar_loop != loop && new_exit->dest != exit_dest)
> > {
> > - gphi_iterator gsi;
> > ne
Hi All,
At the moment, trying to use -march=armv9-a with any ACLE header such as
arm_neon.h results in rows and rows of warnings saying:
: warning: "__ARM_ARCH" redefined
: note: this is the location of the previous definition
This is obviously not useful and happens because the header was defin
as it does today.
This will make it possible to still debug these utilities easily as is done
today.
Cheers,
Tamar
> -Original Message-
> From: Robin Dapp
> Sent: Thursday, October 12, 2023 9:45 PM
> To: gcc-patches
> Cc: rdapp@gmail.com; jeffreyalaw ; Tama
> -Original Message-
> From: Robin Dapp
> Sent: Friday, October 13, 2023 4:15 PM
> To: gcc-patches
> Cc: rdapp@gmail.com; jeffreyalaw ; Tamar
> Christina ; rjie...@linux.alibaba.com
> Subject: Re: [PATCH] genemit: Split insn-emit.cc into ten files.
>
>
Hi All,
As the testcase shows, when a PHI node dominates the loop there is no new
definition inside the loop. As such there would be no PHI nodes to update.
When we maintain LCSSA form we create an intermediate node in between the two
loops to thread along the value. However later on when we u
Hi All,
This fixes a -Wpedantic error with the testcase because of extra ; left after
the functions.
Regtested single test on aarch64-none-elf and no issues.
Committed under the GCC obvious rule.
Thanks,
Tamar
gcc/testsuite/
2018-07-06 Tamar Christina
* gcc.target/aarch64
Hi All,
This fixes an ABI warning generated on i686-pc-linux-gnu when using
`vector_size` with no sse enabled explicitly.
Regtested single test on x86_64-pc-linux-gnu with -m32 and no issues.
Committed under the GCC obvious rule.
Thanks,
Tamar
gcc/testsuite/
2018-07-06 Tamar Christina
-pedantic-errors"
>
> So looks like a botched test run then. My bad..
>
> Ramana
>
>
> >
> >> Ramana
> >> >
> >> > Thanks,
> >> > Tamar
> >> >
> >> > gcc/testsuite/
> >> > 2018-07-06 Tamar Christina
> >> >
> >> > * gcc.target/aarch64/struct_cpy.c: Remove ;.
> >> >
> >> > --
Hi Christoph,
> 90d62e699bce9594879be2e3016c9b36c7e064c8..703632240822e762a90657096
> > >> 4b949c783df56f3 100644
> > >> --- a/gcc/config/arm/arm.c
> > >> +++ b/gcc/config/arm/arm.c
> > >> @@ -31508,8 +31508,8 @@ arm_can_change_mode_class
> (machine_mode from, machine_mode to,
> > >>{
> > >>
Hi Jakub,
>
> On Fri, Jul 06, 2018 at 11:46:43AM +0100, Tamar Christina wrote:
> > This fixes an ABI warning generated on i686-pc-linux-gnu when using
> > `vector_size` with no sse enabled explicitly.
> >
> > Regtested single test on x86_64-pc-linux-gnu with -m32 an
Hi All,
The patch series will allow AArch64 to use 64k guard sizes correctly and
improves the code quality.
It also enables a reduction of the overhead in code size over the current GCC 8
implementation.
Using 64k guard sizes results in a reduction in overhead compared to the 4k
guard size.
Th
on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/testsuite/
2018-07-11 Tamar Christina
PR target/86486
gcc.dg/pr82788.c: Skip for AArch64.
gcc.dg/guality/vla-
.
Target was tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Jeff Law
Richard Sandiford
Tamar Christina
PR target/86486
* config/aarch64/aarch64.md (cmp,
probe_stack_range): Add k (SP) constraint
the user at configure
time.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
PR target/86486
* configure.ac: Add
r needed.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
PR target/86486
* explow.c (anti_adjust_stack_and_probe_sta
ithin the valid range.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
* params.h (struct param_info): Add config
.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Target was tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
* common/config/aarch64/aarch64-common.c (TARGET_OPTION_DEFAULT_PARAM
apped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
* params.c (set_param_value):
Add index of parameter being vali
for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar Christina
PR target/86486
* config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS,
STACK_DYNAMIC_OFFSET): New.
* config/aarch64/aarch64.c (aarch64_layout_frame):
Update outgoing
size is only 4KB or
64KB and also enforces that for aarch64 the stack-clash probing interval is
equal to the guard size.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Target was tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-11 Tamar
Hi Jeff,
Thanks for the review!
The 07/11/2018 18:30, Jeff Law wrote:
> On 07/11/2018 05:20 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch implements the use of the stack clash mitigation for aarch64.
> > In Aarch64 we expect both the probing interval an
Hi Jeff,
The 07/11/2018 20:21, Jeff Law wrote:
> On 07/11/2018 05:22 AM, Tamar Christina wrote:
> > Hi All,
> >
> > This patch defines a configure option to allow the setting of the default
> > guard size via configure flags when building the target.
> >
> >
Hi All,
I'm sending an updated patch which updates a testcase that hits one of our
corner cases.
This is an assembler scan only update in a testcase.
Regards,
Tamar
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, July 11, 2018 12:21
> To: gcc-patches
Hi All,
I am sending an updated patch which takes into account a
case where the set parameter value would not be safe to call.
No change in the cover letter.
Regards,
Tamar
> -Original Message-
> From: Tamar Christina
> Sent: Wednesday, July 11, 2018 12:25
> To:
Hi All,
I'm sending an updated patch which makes sure unwind tables are disabled always
for tests that do sequence checks so they pass in all configurations.
There's no change to the cover letter or implementation.
Regards,
Tamar
> -Original Message-
> From: Tamar C
The 07/13/2018 17:46, Jeff Law wrote:
> On 07/12/2018 11:39 AM, Tamar Christina wrote:
> >>> +
> >>> + /* Round size to the nearest multiple of guard_size, and calculate the
> >>> + residual as the difference between the origina
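The split described in the quoted comment can be sketched as below. This assumes the guard size is a power of two and that rounding is downwards, which the truncated comment does not spell out; `probe_split` and `split_allocation` are hypothetical names, not functions from the patch.

```cpp
#include <cstdint>

// Result of splitting an allocation: the part that is a whole number of
// guard-sized pages, and what is left over.
struct probe_split
{
  std::int64_t rounded_size;
  std::int64_t residual;
};

// Assumes guard_size is a power of two, so masking with -guard_size
// rounds size down to a multiple of guard_size.
constexpr probe_split split_allocation (std::int64_t size,
                                        std::int64_t guard_size)
{
  return { size & -guard_size, size - (size & -guard_size) };
}

// 70000 bytes with a 64KB guard: one full guard page plus 4464 bytes.
static_assert (split_allocation (70000, 65536).rounded_size == 65536, "");
static_assert (split_allocation (70000, 65536).residual == 4464, "");
```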
Hi Jeff,
> -Original Message-
> From: Tamar Christina
> Sent: Thursday, July 12, 2018 18:45
> To: Jeff Law
> Cc: gcc-patches@gcc.gnu.org; nd ;
> jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com;
> nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...
Hi Jeff,
> -Original Message-
> From: Jeff Law
> Sent: Thursday, July 19, 2018 18:32
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ;
> jos...@codesourcery.com; bonz...@gnu.org; d...@redhat.com;
> nero...@gcc.gnu.org; aol...@redhat.com; ralf.wildenh...
>
> On 07/20/2018 05:03 AM, Tamar Christina wrote:
> >> Understood. Thanks for verifying. I wonder if we could just bury
> >> this entirely in the aarch64 config files and not expose the default into
> params.def?
> >>
> >
> > Burying it in con
can not handle this and simply returns ""
and asserts.
This pattern is essentially dead and I'm removing it for clarity.
Regtested on armeb-none-eabi and no regressions.
Bootstrapped on arm-none-linux-gnueabihf and no issues.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-23
cution test
Regtested on armeb-none-eabi and no regressions.
Bootstrapped on arm-none-linux-gnueabihf and no issues.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-23 Tamar Christina
PR target/84711
* config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
* conf
and found no issues.
OK for trunk?
Thanks,
Tamar
gcc/
2018-07-23 Tamar Christina
* expr.c (copy_blkmode_to_reg): Perform larger copies when safe.
--
diff --git a/gcc/expr.c b/gcc/expr.c
index f665e187ebbbc7874ec88e84ca47ed991491c3e5..17b580aabf761491d8003ac74daa014bc252ea9f 100644
explicitly if they want to support this configure flag and values that
users may have set.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-24 Tamar Christina
tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-24 Tamar Christina
PR target/86486
* config/aarch64/aarch64.c (aarch64_override_options_internal):
Add validation for stack-clash parameters and set defaults.
> -Original Mess
on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/testsuite/
2018-07-24 Tamar Christina
PR target/86486
* gcc.dg/pr82788.c: Skip for AArch64.
* gcc.dg/guality/vla-
still keep the parameters within
the valid range.
Bootstrapped Regtested on aarch64-none-linux-gnu, x86_64-pc-linux-gnu and no
issues.
Both targets were tested with stack clash on and off by default.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-24 Tamar Christina
* params.c
Hi All,
Here's an updated patch with documentation.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-24 Tamar Christina
PR target/86486
* configure.ac: Add stack-clash-protection-guard-size.
* doc/install.texi: Document it.
* config.in (DEFAULT_STK_CLASH_GUARD
Hi Richard,
Thanks for the review!
The 07/23/2018 18:46, Richard Biener wrote:
> On July 23, 2018 7:01:23 PM GMT+02:00, Tamar Christina
> wrote:
> >Hi All,
> >
> >This allows copy_blkmode_to_reg to perform larger copies when it is
> >safe to do so by calculatin
Hi All,
Attached is an updated patch that clarifies some of the comments in the patch
and adds comments to the individual testcases
as requested.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-25 Jeff Law
Richard Sandiford
Tamar Christina
PR target/86486
Hi Alexandre,
Thanks for the review. Attached is the updated patch and new changelog below:
Thanks,
Tamar
gcc/
2018-07-25 Tamar Christina
PR target/86486
* configure.ac: Add stack-clash-protection-guard-size.
* doc/install.texi: Document it.
* config.in
Hi All,
Attached an updated patch which documents what the test cases are expecting as
requested.
Ok for trunk?
Thanks,
Tamar
gcc/
2018-07-25 Tamar Christina
PR target/86486
* config/aarch64/aarch64.h (STACK_CLASH_OUTGOING_ARGS,
STACK_DYNAMIC_OFFSET): New
> Regtested on armeb-none-eabi and no regressions.
> > Bootstrapped on arm-none-linux-gnueabihf and no issues.
> >
> >
> > Ok for trunk?
> >
> > Thanks,
> > Tamar
> >
> > gcc/
> > 2018-07-23 Tamar Christina
> >
> > PR target/84711
> > * config/arm/arm.c (arm_can_change_mode_class): Disallow subreg.
> > * config/arm/neon.md (movv4hf, movv8hf): Refactored to..
> > (mov): ..this and enable unconditionally.
> >
> > --
Hi Thomas,
> -Original Message-
> From: Thomas Preudhomme
> Sent: Thursday, July 26, 2018 09:29
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
>
> Subject: Re: [PATCH]
Hi Alexandre,
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org
> On Behalf Of Alexandre Oliva
> Sent: Thursday, July 26, 2018 08:46
> To: Tamar Christina
> Cc: Joseph Myers ; Jeff Law
> ; gcc-patches@gcc.gnu.org; nd ;
> bonz...@gnu.org; d...@redhat.
Ping 😊
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org
> On Behalf Of Tamar Christina
> Sent: Monday, July 23, 2018 17:52
> To: gcc-patches@gcc.gnu.org
> Cc: nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
>
Ping 😊
> -Original Message-
> From: Thomas Preudhomme
> Sent: Thursday, July 26, 2018 12:29
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; Ramana Radhakrishnan
> ; Richard Earnshaw
> ; ni...@redhat.com; Kyrylo Tkachov
>
> Subject: Re: [PATCH][GC
Ping 😊
> -Original Message-
> From: gcc-patches-ow...@gcc.gnu.org
> On Behalf Of Tamar Christina
> Sent: Tuesday, July 24, 2018 17:34
> To: Richard Biener
> Cc: gcc-patches@gcc.gnu.org; nd ; l...@redhat.com;
> i...@airs.com; amo...@gmail.com; berg...@vnet.ibm.com
Hi Richard,
The 07/31/2018 11:21, Richard Biener wrote:
> On Tue, 31 Jul 2018, Tamar Christina wrote:
>
> > Ping 😊
> >
> > > -Original Message-
> > > From: gcc-patches-ow...@gcc.gnu.org
> > > On Behalf Of Tamar Christina
> > >
Hi All,
During the refactoring I had passed loop_vinfo on to vect_set_loop_condition
during prolog peeling. This parameter is unused in most cases except in
vect_set_loop_condition_partial_vectors where its behaviour depends on whether
loop_vinfo is NULL or not. Apparently this code expect
Hi All,
The previous patch tried to remove PHI nodes that dominated the first loop,
however the correct fix is to only remove .MEM nodes.
This patch thus makes the condition a bit stricter and only tries to remove
MEM phi nodes.
I couldn't figure out a way to easily determine if a particular PHI
> -Original Message-
> From: Richard Biener
> Sent: Friday, July 14, 2023 2:35 PM
> To: Tamar Christina
> Cc: gcc-patches@gcc.gnu.org; nd ; j...@ventanamicro.com
> Subject: RE: [PATCH 12/19]middle-end: implement loop peeling and IV
> updates for early break.
>
Hi All,
This patch adds initial support for early break vectorization in GCC.
The support is added for any target that implements a vector cbranch optab,
this includes both fully masked and non-masked targets.
Depending on the operation, the vectorizer may also require support for boolean
mask re
Hi All,
This adds pragma GCC novector to testcases that have shown up
since the last regression run and due to this series detecting more.
Is it OK if, when it comes time to commit, I just update any
new cases before committing? This seems to be a cat and mouse game.
Bootstrapped Regtested on
Hi All,
When performing early break vectorization we need to be sure that the vector
operations are safe to perform. A simple example is e.g.
 for (int i = 0; i < N; i++)
 {
   vect_b[i] = x + i;
   if (vect_a[i]*2 != x)
     break;
   vect_a[i] = x;
 }
where the store to vect_b is not allowed
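One way to see the constraint in the loop above: a blocked version must not commit a block's stores until it knows that no iteration in the block takes the break. The sketch below is only an illustration under stated assumptions (equal-length arrays, a hypothetical block size `vf`, made-up function names); it is not how the vectorizer emits code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: the loop as written in the mail, starting at `start`.
inline void scalar_loop (std::vector<int> &a, std::vector<int> &b, int x,
                         std::size_t start = 0)
{
  for (std::size_t i = start; i < a.size (); i++)
    {
      b[i] = x + static_cast<int> (i);
      if (a[i] * 2 != x)
        break;
      a[i] = x;
    }
}

// Blocked version: process vf iterations at a time, but only when no
// iteration in the block would break.  Otherwise fall back to the scalar
// loop from the start of that block, so no store happens too early.
inline void chunked_loop (std::vector<int> &a, std::vector<int> &b, int x,
                          std::size_t vf)
{
  assert (vf > 0 && a.size () == b.size ());
  std::size_t n = a.size ();
  std::size_t i = 0;
  for (; i + vf <= n; i += vf)
    {
      // Check every lane's exit condition before any store in the block.
      bool any_break = false;
      for (std::size_t lane = 0; lane < vf; lane++)
        any_break |= (a[i + lane] * 2 != x);
      if (any_break)
        break;
      // No lane breaks, so all stores of the block are safe to perform.
      for (std::size_t lane = 0; lane < vf; lane++)
        {
          b[i + lane] = x + static_cast<int> (i + lane);
          a[i + lane] = x;
        }
    }
  // Remaining iterations, or the block containing the break.
  scalar_loop (a, b, x, i);
}
```

Running both functions on the same input should leave the arrays in the same state, including elements past the break, which neither version touches.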
Hi All,
This splits the part of the function that does peeling for loops at exits to
a different function. In this new function we also peel for early breaks.
Peeling for early breaks works by redirecting all early break exits to a
single "early break" block and combine them and the normal exit
Hi All,
This has loop versioning use the vectorizer's IV exit edge when it's available
since single_exit (..) fails with multiple exits.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop-manip.cc (vect_loop_ver
Hi All,
As requested, the vectorizer is now free to pick its own exit which can be
different from what the loop CFG infrastructure uses. The vectorizer makes use
of this to vectorize loops that it previously could not.
But this means that loop control must be materialized in the block that needs
Hi All,
This changes the PHI node updates to support early breaks.
It has to support both the case where the loop's exit matches the normal loop
exit and one where the early exit is "inverted", i.e. it's an early exit edge.
In the latter case we must always restart the loop for VF iterations. Fo
Hi All,
This adds support to vectorizable_live_reduction to handle multiple exits by
doing a search for which exit the live value should be materialized in.
Additionally which value in the index we're after depends on whether the exit
it's materialized in is an early exit or whether the loop's mai
Hi All,
This updates relevancy analysis to support marking gcond's belonging to early
breaks as relevant for vectorization.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-stmts.cc (vect_stmt_relevant_p,
v
Hi All,
This implements vectorizable_early_exit which is used as the codegen part of
vectorizing a gcond.
For the most part it shares the majority of the code with
vectorizable_comparison with the addition that it needs to be able to reduce
multiple resulting statements into a single one for use in the
Hi All,
This wires through the final bits to support adding the guard block between
the loop and epilog.
For an "inverted loop", i.e. one where an early exit was chosen as the main
exit, we can never skip the scalar loop since we know we have side effects
to still perform. For those cases we
Hi All,
This finishes wiring that didn't fit in any of the other patches.
Essentially just adding related changes so peeling for early break works.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop-manip.cc (ve
Hi All,
This sets LOOP_VINFO_EARLY_BREAKS and does some misc changes so the other
patches are self contained.
Bootstrapped Regtested on aarch64-none-linux-gnu and no issues.
Ok for master?
Thanks,
Tamar
gcc/ChangeLog:
* tree-vect-loop.cc (vect_analyze_loop_form): Analyse all exits.
Hi All,
The vectorizer at the moment uses a num_bb check to check for control flow.
This rejects a number of loops for no reason. Instead this patch changes it
to check the destination of the exits.
This also allows early break to work by also dropping the single_exit check.
Bootstrapp
Hi All,
What do people think about having the ability to force only the latch connected
exit as the exit as a param? I.e. what's in the patch but as a param.
I found this useful when debugging large example failures as it tells me where
I should be looking. No hard requirement but just figured I
Hi All,
I didn't want these to get lost in the noise of updates.
The following three tests now correctly work for targets that have an
implementation of cbranch for vectors so XFAILs are conditionally removed gated
on vect_early_break support.
Bootstrapped Regtested on aarch64-none-linux-gnu and
Hi All,
This adds an implementation for conditional branch optab for AArch64.
For e.g.
void f1 ()
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] > 0)
        break;
    }
}
For 128-bit vectors we generate:
cmgt v1.4s, v1.4s, #0
umaxp v1.4s, v1.4s,
Hi All,
Advanced SIMD lacks a cmpeq for vectors, and unlike compare to 0 we can't
rewrite to a cmtst.
This operation is however fairly common, especially now that we support early
break vectorization.
As such this adds a pattern to recognize the negated any comparison and
transform it to an all.
Hi All,
This adds an implementation for conditional branch optab for MVE.
Unfortunately MVE has rather limited operations on VPT.P0, we are missing the
ability to do P0 comparisons and logical OR on P0.
For that reason we can only support cbranch with 0, as for comparing to a 0
predicate we don'
Hi All,
Advanced SIMD lacks flag setting vector comparisons which SVE adds. Since machines
with SVE also support Advanced SIMD we can use the SVE comparisons to perform the
operation in cases where SVE codegen is allowed, but the vectorizer has decided
to generate Advanced SIMD because of loop
Hi All,
This adds an implementation for conditional branch optab for AArch32.
For e.g.
void f1 ()
{
  for (int i = 0; i < N; i++)
    {
      b[i] += a[i];
      if (a[i] > 0)
        break;
    }
}
For 128-bit vectors we generate:
vcgt.s32 q8, q9, #0
vpmax.u32 d7,
Hi All,
Various optimizations in match.pd only happened on COPYSIGN in lock step
which means they exclude IFN_COPYSIGN. COPYSIGN however is restricted to only
the C99 builtins and so doesn't work for vectors.
The patch expands these optimizations to work as nested iters.
This is needed for the
Hi All,
This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more
canonical and allows a target to expand this sequence efficiently. Such
sequences are common in scientific code working with gradients.
There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x))
w