On Linux/x86_64,
f9ca3fd1fe30f3ee6725bfe4a612e9a1234c11ac is the first bad commit
commit f9ca3fd1fe30f3ee6725bfe4a612e9a1234c11ac
Author: Levy Hsu
Date: Mon Sep 2 13:52:38 2024 +0800
i386: Support partial vectorized FMA for V2BF/V4BF
caused
FAIL: gcc.target/i386/avx10_2-partial-bf-vector
When optimize_memcpy was added in r7-5443-g7b45d0dfeb5f85,
a path was added such that a statement was turned into a non-throwing
statement and maybe_clean_or_replace_eh_stmt/gimple_purge_dead_eh_edges
would not be called for that statement.
This adds these calls to that path.
Bootstrapped and test
On Wed, Sep 4, 2024 at 2:44 PM Andrew Pinski wrote:
>
> On Wed, Sep 4, 2024 at 2:36 PM Marek Polacek wrote:
> >
> > On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote:
> > > The code in aarch64_lookup_shared_state_flags assumed that all C++11 attributes on the
> > > function type
> > > had a names
The code in aarch64_lookup_shared_state_flags assumed that all C++11 attributes
on the function type had a namespace associated with them. But with the addition
of reproducible/unsequenced, this is not true.
This fixes the issue by using is_attribute_namespace_p instead of manually
figuring out the namespac
On 9/4/24 4:07 PM, Palmer Dabbelt wrote:
These tests were checking that the output of the setCC instruction was bit
flipped, but it looks like they're really designed to test that
redundant sign extension elimination fires on conditionals from function
inputs. Jeff just posted a patch to clean
On 8/20/24 2:42 AM, Richard Sandiford wrote:
Vineet Gupta writes:
On 8/19/24 14:52, Richard Sandiford wrote:
2. On RISC-V sched1 is counterintuitively assuming HARD_FP is live due to the
weird interaction of DF infra (which always marks HARD_FP with
artificial def) and ira_no_alloc_regs.
On Wed, 04 Sep 2024 19:24:41 PDT (-0700), Kito Cheng wrote:
Just remember that adding a system-wide vector calling convention raises broad
compatibility issues we need to worry about: for example, the jump buf (for
setjmp/longjmp) will need to keep vector status, which it did not need to
keep before since all vectors are call
Just remember that adding a system-wide vector calling convention raises broad
compatibility issues we need to worry about: for example, the jump buf (for
setjmp/longjmp) will need to keep vector status, which it did not need to
keep before since all vectors are call-clobbered by default.
Also that may cause performance issue fo
> This won't apply as I've already updated those tests. I think verifying
> the number of SAT_ADDs is useful to ensure we don't regress as some of
> these tests detect > 1 SAT_ADD idiom.
I see, thanks Jeff. Then drop this patch.
Pan
-Original Message-
From: Jeff Law
Sent: Thursday,
On 9/4/24 8:01 PM, pan2...@intel.com wrote:
From: Pan Li
Some middle-end changes may affect the number of times .SAT_* appears. Thus,
refine the dump check for SAT_* from scan-times to scan, as
we only care whether .SAT_* exists or not. There will be
another PATCH to perform a similar refinement an
Hi all,
In avx512f-mask-type.h, SIZE must be defined for MASK_TYPE to be
defined correctly. Fix those testcases where SIZE is not defined
before the include of avx512f-mask-type.h.
Note that for convert intrins in AVX10.2, they will need more
modifications due to the current tests did not in
From: Pan Li
Some middle-end changes may affect the number of times .SAT_* appears. Thus,
refine the dump check for SAT_* from scan-times to scan, as
we only care whether .SAT_* exists or not. There will be
another PATCH to perform a similar refinement, and this PATCH only
fixes the failed test cases.
gcc
On Wed, Sep 4, 2024 at 9:32 AM Levy Hsu wrote:
>
> Hi
>
> This change adds BFmode support to the ix86_preferred_simd_mode function
> enhancing SIMD vectorization for BF16 operations. The update ensures
> optimized usage of SIMD capabilities improving performance and aligning
> vector sizes with pr
On Wed, Sep 4, 2024 at 10:53 AM Levy Hsu wrote:
>
> Hi
>
> This patch adds support for bf16 operations in V2BF and V4BF modes on i386,
> handling signbit, xorsign, copysign, abs, neg, and various logical operations.
>
> Bootstrapped and tested on x86-64-pc-linux-gnu.
> Ok for trunk?
Ok.
>
> gcc/Ch
On Wed, Sep 4, 2024 at 11:31 AM Levy Hsu wrote:
>
> Hi
>
> Bootstrapped and tested on x86-64-pc-linux-gnu.
> Ok for trunk?
Ok.
>
> This patch introduces support for vectorized FMA operations for bf16 types in
> V2BF and V4BF modes on the i386 architecture. New mode iterators and
> define_expand en
*_eq3_1 supports
nonimm_or_0_operand for op1 and op2, pass_combine would fail to lower
an avx512 comparison back to an avx2 one when op1/op2 is const0_rtx. It's
because the splitter only supports nonimmediate_operand.
Failed to match this instruction:
(set (reg/i:V16QI 20 xmm0)
(vec_merge:V16QI (con
Thanks for the explanation.
> On 2 Sep 2024, at 9:47 am, Andrew Pinski wrote:
>
> On Sun, Sep 1, 2024 at 4:27 PM Kugan Vivekanandarajah
> wrote:
>>
>> Hi Andrew.
>>
>>> On 28 Aug 2024, at 2:23 pm, Andrew Pinski wrote:
>>>
>>> Exter
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote:
Improve handling of constants where the high half can be constructed by
inverting the lower half.
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_build_integer): Detect constants
where the higher half is the lower half inverted.
g
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote:
Improve handling of large constants in riscv_build_integer, generate
better code for constants where the high half can be constructed
by shifting/shiftNadding the low half or if the halves differ by less
than 2k.
gcc/ChangeLog:
* config
On 9/2/24 2:01 PM, Raphael Moreira Zinsly wrote:
Improve handling of constants whose upper and lower 32-bit
halves are the same and have negative values.
e.g. for:
unsigned long f (void) { return 0xf0f0f0f0f0f0f0f0UL; }
Without the patch:
li a0,-252645376
addi a0,a0,240
li
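The property described in the message above can be illustrated with a small sketch. It is not the code in riscv_build_integer. The helper names below (`split_halves`, `sign_extend32`, `halves_equal_and_negative`) are hypothetical and exist only for illustration. The sketch checks that the example constant has identical 32-bit halves that are negative when read as signed values, and that the two immediates quoted in the "Without the patch" sequence add up to that sign-extended half.

```python
MASK32 = 0xFFFFFFFF


def split_halves(value):
    """Return the (upper, lower) 32-bit halves of a 64-bit constant."""
    return (value >> 32) & MASK32, value & MASK32


def sign_extend32(pattern):
    """Interpret a 32-bit pattern as a signed two's-complement integer."""
    return pattern - (1 << 32) if pattern & 0x80000000 else pattern


def halves_equal_and_negative(value):
    """True when both halves match and the half is negative as a signed value."""
    upper, lower = split_halves(value)
    return upper == lower and sign_extend32(lower) < 0


# Constant taken from the example function in the message above.
value = 0xF0F0F0F0F0F0F0F0
upper, lower = split_halves(value)

# Immediates quoted in the "Without the patch" sequence: li then addi.
low_from_li_addi = -252645376 + 240

print(hex(upper), hex(lower), halves_equal_and_negative(value))
print(low_from_li_addi == sign_extend32(lower))
```

A constant such as 0x0F0F0F0F0F0F0F0F also has equal halves, but the half is positive as a 32-bit signed value, so the predicate above rejects it.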
Thanks Richard for comments.
> I also think we may want to split out this CFG matching code out into
> a helper function
> in gimple-match-head.cc instead of repeating it fully for each pattern?
That makes sense to me, let me have a try in v2.
Pan
-Original Message-
From: Richard Biener
On 9/4/24 1:13 AM, Torbjorn SVENSSON wrote:
On 2024-09-03 20:23, Richard Biener wrote:
Am 03.09.2024 um 19:00 schrieb Tamar Christina
:
Hi All,
The meaning of the testcase was changed by passing it -fwrapv. The
reason for
the test failures on some platform was because the test was
On 9/4/24 2:26 PM, Palmer Dabbelt wrote:
Now that we've got the riscv_vector_cc attribute it's pretty much free
to add a system-wide ABI -- at least in terms of implementation. So
this just adds a new ABI command-line value that defaults to enabling
the vector calling convention, essentially
On 9/4/24 4:11 PM, Palmer Dabbelt wrote:
On Wed, 04 Sep 2024 13:47:58 PDT (-0700), jeffreya...@gmail.com wrote:
So I was looking at a performance regression in spec with Ventana's
internal tree. Ultimately the problem was a bad interaction with an
internal patch (REP_MODE_EXTENDED), fwprop
On 9/4/24 4:07 PM, Palmer Dabbelt wrote:
These tests were checking that the output of the setCC instruction was bit
flipped, but it looks like they're really designed to test that
redundant sign extension elimination fires on conditionals from function
inputs. Jeff just posted a patch to clean
On Wed, 04 Sep 2024 13:47:58 PDT (-0700), jeffreya...@gmail.com wrote:
>
> So I was looking at a performance regression in spec with Ventana's
> internal tree. Ultimately the problem was a bad interaction with an
> internal patch (REP_MODE_EXTENDED), fwprop and ext-dce. The details of
> that prob
These tests were checking that the output of the setCC instruction was bit
flipped, but it looks like they're really designed to test that
redundant sign extension elimination fires on conditionals from function
inputs. Jeff just posted a patch to clean this code up which trips up on
the arbitrary x
On Wed, Sep 4, 2024 at 2:36 PM Marek Polacek wrote:
>
> On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote:
> > The code in aarch64_lookup_shared_state_flags assumed that all C++11 attributes on the
> > function type had a namespace associated with them. But with the addition of
> > reproducib
On Wed, Sep 04, 2024 at 02:05:21PM -0700, Andrew Pinski wrote:
> The code in aarch64_lookup_shared_state_flags assumed that all C++11 attributes on the
> function type had a namespace associated with them. But with the addition of
> reproducible/unsequenced, this was no longer true.
> This is the simple f
On Wed, 04 Sep 2024 13:26:11 PDT (-0700), Palmer Dabbelt wrote:
Now that we've got the riscv_vector_cc attribute it's pretty much free
to add a system-wide ABI -- at least in terms of implementation. So
this just adds a new ABI command-line value that defaults to enabling
the vector calling conv
I'm gently pinging about the patch I submitted:
https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660177.html
This patch was created in response to Jason's comments here:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657739.html
I appreciate your time and consideration.
Thank you.
The code in aarch64_lookup_shared_state_flags assumed that all C++11 attributes
on the function type had a namespace associated with them. But with the addition
of reproducible/unsequenced, this was no longer true.
This is the simple fix to ignore attributes in the global namespace since we
are looking for o
So I was looking at a performance regression in spec with Ventana's
internal tree. Ultimately the problem was a bad interaction with an
internal patch (REP_MODE_EXTENDED), fwprop and ext-dce. The details of
that problem aren't particularly important.
Removal of the local patch went reason
On Sep 3, 2024, at 11:44 PM, Alexandre Oliva wrote:
>
> On Nov 9, 2023, Mike Stump wrote:
>
>> On Nov 8, 2023, at 8:29 AM, Alexandre Oliva wrote:
>>>
>>> On Nov 5, 2023, Mike Stump wrote:
>>>
that, otherwise, I'll approve this version.
>>>
>>> FWIW, this version is not usable as is.
Tested x86_64-linux. Pushed to gcc-14.
-- >8 --
For the backport, rejecting array types is only done in strict modes.
libstdc++-v3/ChangeLog:
PR libstdc++/116381
* include/std/variant (variant): Fix conditions for
static_assert to match the spec.
* testsuite/20_u
Am 2024-09-04 um 19:12 schrieb Jakub Jelinek:
On Wed, Sep 04, 2024 at 12:34:04PM -0400, Jason Merrill wrote:
So, one possibility would be to call save_expr unconditionally in
get_member_function_from_ptrfunc as well.
Or build a TARGET_EXPR (force_target_expr or similar).
Yes. I don't have a
Now that we've got the riscv_vector_cc attribute it's pretty much free
to add a system-wide ABI -- at least in terms of implementation. So
this just adds a new ABI command-line value that defaults to enabling
the vector calling convention, essentially the same as scattering the
attribute on every
Evening,
Arsen Arsenović writes:
> Hi,
>
> Arsen Arsenović writes:
>
>>> The && should not be left of the =; if the initializer needs to span
>>> m
Hi,
I'm writing to ask that someone with write access to the git repo apply
this patch, which provides the macro definition
`_MM_FROUND_TO_NEAREST_TIES_EVEN`.
Intrinsics such as `_mm512_add_round_ps` take a rounding mode argument to
specify the floating point rounding mode. This and si
> On 4 Sep 2024, at 17:21, Jason Merrill wrote:
>
> On 9/1/24 12:17 PM, Iain Sandoe wrote:
>> This came up in discussion of an earlier patch.
>> I'm in two minds as to whether it's a good idea or not - the underlying
>> issue being that libubsan does not yet (AFAICT) have the concept of a
>> c
On Wed, Sep 04, 2024 at 10:58:25AM -0400, Jason Merrill wrote:
> On 9/3/24 6:12 PM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?
>
> The change to return bool seems like unrelated cleanup; please push that
> separately on trunk only.
Done.
> > + /
On Mon, Aug 19, 2024 at 03:52:58PM +0100, Andrew Carlotti wrote:
> On Fri, Aug 16, 2024 at 07:17:24AM +, Kyrylo Tkachov wrote:
> >
> >
> > > On 15 Aug 2024, at 18:48, Andrew Carlotti wrote:
> > >
> > > On Thu, Aug
The recent path splitting changes from Andrew result in identifying more
saturation idioms instead of just identifying an overflow check. As a
result many of the tests in the RISC-V port started failing a scan check
on the .expand output.
As expected, identifying a saturation idiom is more
On Wed, Sep 04, 2024 at 01:22:47PM -0400, Jason Merrill wrote:
> > @@ -8985,6 +9003,13 @@ cp_finish_decl (tree decl, tree init, bo
> > if (var_definition_p)
> > abstract_virtuals_error (decl, type);
> > + if (decomp && !processing_template_decl)
> > + {
> > + need_decomp_init
Split out from
https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662261.html
which was tested on x86_64-pc-linux-gnu. I'm checking this in.
-- >8 --
This function could use some sprucing up.
gcc/cp/ChangeLog:
* pt.cc (coerce_template_template_parm): Return bool instead of int.
--
On 8/30/24 1:37 PM, Jakub Jelinek wrote:
On Wed, Aug 21, 2024 at 02:08:16PM -0400, Jason Merrill wrote:
I was concerned about the use of a single boolean to guard the destruction
of multiple objects, suspecting that it would break in obscure EH cases.
When I finally managed to construct a testca
On Wed, Sep 04, 2024 at 12:28:49PM -0400, Jason Merrill wrote:
> On 8/30/24 3:40 PM, Marek Polacek wrote:
> > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
> >
> > -- >8 --
> > Redeclaration such as
> >
> >void f(void);
> >consteval void f(void);
> >
> > is invalid. In a
On Wed, Sep 04, 2024 at 12:34:04PM -0400, Jason Merrill wrote:
> > So, one possibility would be to call save_expr unconditionally in
> > get_member_function_from_ptrfunc as well.
> >
> > Or build a TARGET_EXPR (force_target_expr or similar).
>
> Yes. I don't have a strong preference between the
On 9/4/24 8:08 AM, Xi Ruoyao wrote:
Hi Jeff,
On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote:
(define_insn_and_split "_shift_reverse"
[(set (match_operand:X 0 "register_operand" "=r")
(any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r")
@@ -2934,9 +2936,9 @@ (def
On 9/4/24 11:15 AM, Jakub Jelinek wrote:
On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote:
On 9/2/24 1:49 PM, Jakub Jelinek wrote:
Hi!
The following testcase is miscompiled, because
get_member_function_from_ptrfunc
emits something like
(((FUNCTION.__pfn & 1) != 0)
? ptr + FUNCT
Hello,
The attached patch implements P2592, adding std::hash specializations
for std::chrono classes.
One aspect I'm quite unhappy with is the hash combiner I've used. I'm
not sure if there's some longer-term goal for libstdc++ here -- would
you prefer to roll something à la Boost.HashCombin
On Wed, 4 Sep 2024, Evgeny Karpov wrote:
Monday, September 4, 2024
Martin Storsjö wrote:
compilation time
adrp x0, symbol + 256
9000 adrp x0, 0
As the symbol offset is 256, you will need to encode the offset "256" in
the instruction immediate field. Not "256 >> 12". This is the somewhat
On 8/30/24 3:40 PM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?
-- >8 --
Redeclaration such as
void f(void);
consteval void f(void);
is invalid. In a namespace scope, we detect the collision in
validate_constexpr_redeclaration, but not when one decl
On 8/31/24 12:37 PM, Iain Sandoe wrote:
tested on x86_64-darwin/linux powerpc64le-linux,
OK for trunk? alternate suggestions?
thanks,
Iain
--- 8< ---
In examining the coroutine testcases for unexpected diagnostic output
for 'Wall', I found a 'statement has no effect' warning for the promise
con
On 9/1/24 12:17 PM, Iain Sandoe wrote:
This came up in discussion of an earlier patch.
I'm in two minds as to whether it's a good idea or not - the underlying
issue being that libubsan does not yet (AFAICT) have the concept of a
coroutine, so that the diagnostics are not very specific and might
On 9/1/24 2:51 PM, Simon Martin wrote:
Hi Jason,
On 26 Aug 2024, at 19:23, Jason Merrill wrote:
On 8/25/24 12:37 PM, Simon Martin wrote:
On 24 Aug 2024, at 23:59, Simon Martin wrote:
On 24 Aug 2024, at 15:13, Jason Merrill wrote:
On 8/23/24 12:44 PM, Simon Martin wrote:
We currently emit
On Wed, Sep 4, 2024 at 8:18 AM Jason Merrill wrote:
>
> Tested x86_64-pc-linux-gnu. Any objections?
>
> -- 8< --
>
> Several PRs complain about -Wswitch warning about a case for a bitwise
> combination of enumerators. Clang has an attribute flag_enum to prevent
> this; let's adopt that approach
On Wed, 4 Sep 2024, Martin Storsjö wrote:
On Wed, 4 Sep 2024, Evgeny Karpov wrote:
Monday, September 4, 2024
Martin Storsjö wrote:
Let's consider the following example, when symbol is located at 3072.
1. Example without the fix
compilation time
adrp x0, (3072 + 256) & ~0xFFF // x0 =
On Wed, Sep 04, 2024 at 11:06:22AM -0400, Jason Merrill wrote:
> On 9/2/24 1:49 PM, Jakub Jelinek wrote:
> > Hi!
> >
> > The following testcase is miscompiled, because
> > get_member_function_from_ptrfunc
> > emits something like
> > (((FUNCTION.__pfn & 1) != 0)
> > ? ptr + FUNCTION.__delta + FU
On Fri, 2024-06-28 at 15:06 +0200, Thomas Schwinge wrote:
> Hi!
>
> As part of this:
>
> On 2013-07-26T11:04:33-0400, David Malcolm
> wrote:
> > This patch is the hand-written part of the conversion of passes
> > from
> > C structs to C++ classes.
>
> > --- a/gcc/passes.c
> > +++ b/gcc/passes.c
On 9/2/24 7:43 AM, Nathaniel Shead wrote:
Ping for https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659796.html
OK.
For clarity's sake, here's the full patch with the adjustment I
mentioned earlier:
-- >8 --
This patch goes through all .cc files in gcc/cp and adds in any
auto_diagnosti
On Wed, 4 Sep 2024, Evgeny Karpov wrote:
Monday, September 4, 2024
Martin Storsjö wrote:
Let's consider the following example, when symbol is located at 3072.
1. Example without the fix
compilation time
adrp x0, (3072 + 256) & ~0xFFF // x0 = 0
add x0, x0, (3072 + 256) & 0xFFF
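The arithmetic in the quoted "without the fix" example can be checked with a short sketch. It only reproduces the two mask operations shown in the adrp/add lines, and it assumes the 4 KiB page mask 0xFFF that those lines use. The function name `split_page` is hypothetical. Nothing here models what the assembler or linker does with the relocation.

```python
PAGE_MASK = 0xFFF  # low 12 bits, as used in the quoted adrp/add example


def split_page(addr):
    """Split an address into its page base and its offset within the page."""
    return addr & ~PAGE_MASK, addr & PAGE_MASK


# Values from the quoted example: the symbol is at 3072 and the offset is 256.
symbol, offset = 3072, 256
page, low = split_page(symbol + offset)

print(page, low)
```

Because 3072 + 256 = 3328 is still below 4096, the page part is 0 and the whole value lands in the low 12 bits, which matches the `x0 = 0` and `x0 = 3328` comments in the neighbouring copies of this example.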
On 9/2/24 1:49 PM, Jakub Jelinek wrote:
Hi!
The following testcase is miscompiled, because
get_member_function_from_ptrfunc
emits something like
(((FUNCTION.__pfn & 1) != 0)
? ptr + FUNCTION.__delta + FUNCTION.__pfn - 1
: FUNCTION.__pfn) (ptr + FUNCTION.__delta, ...)
or so, so FUNCTION tree
On Wed, Sep 04, 2024 at 08:15:25AM -0400, Jason Merrill wrote:
> Tested x86_64-pc-linux-gnu. Any objections?
Looks good except...
> +/* Attributes also recognized in the clang:: namespace. */
> +const struct attribute_spec c_common_clang_attributes[] = {
> + { "flag_enum", 0, 0, fal
Pushed as obvious.
-- >8 --
Fixed by r15-2540-g32e678b2ed7521. Add a testcase, as the original ones
do not cover this particular failure mode.
gcc/testsuite/ChangeLog:
PR c++/108620
* g++.dg/coroutines/pr108620.C: New test.
---
gcc/testsuite/g++.dg/coroutines/pr1
On 9/3/24 2:47 PM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?
OK.
-- >8 --
We ICE in nothrow_spec_p because it got a DEFERRED_NOEXCEPT.
This DEFERRED_NOEXCEPT was created in implicitly_declare_fn
when declaring
Foo& operator=(Foo&&) = default;
in
On 9/3/24 6:12 PM, Marek Polacek wrote:
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14?
The change to return bool seems like unrelated cleanup; please push that
separately on trunk only.
+ /* We can also have:
+
+ template typename X>
+ void
On 9/4/24 12:55, Jan Hubicka wrote:
On 9/3/24 15:07, Jan Hubicka wrote:
Hi,
We disable gathers for zen4. It seems that gather has improved a bit compared
to zen4 and Zen5 optimization manual suggests "Avoid GATHER instructions when
the indices are known ahead of time. Vector loads followed by
On Wed, 04 Sep 2024 04:10:52 PDT (-0700), rguent...@suse.de wrote:
The following adds SLP discovery for roots that are only live but
otherwise unused. These are usually inductions. This allows a
few more testcases to be handled fully with SLP, for example
gcc.dg/vect/no-scevccp-pr86725-1.c
Boo
Monday, September 4, 2024
Martin Storsjö wrote:
>> Let's consider the following example, when symbol is located at 3072.
>>
>> 1. Example without the fix
>> compilation time
>> adrp x0, (3072 + 256) & ~0xFFF // x0 = 0
>> add x0, x0, (3072 + 256) & 0xFFF // x0 = 3328
>>
>> linking t
The following enables single-lane loop SLP discovery for non-grouped stores
and adjusts vectorizable_store to properly handle those.
For gfortran.dg/vect/vect-8.f90 we vectorize one additional loop,
not running into the "not falling back to strided accesses" bail-out.
I have not investigated in
Hi Jeff,
On Mon, 2024-09-02 at 12:53 -0600, Jeff Law wrote:
> (define_insn_and_split "_shift_reverse"
> [(set (match_operand:X 0 "register_operand" "=r")
> (any_bitwise:X (ashift:X (match_operand:X 1 "register_operand" "r")
> @@ -2934,9 +2936,9 @@ (define_insn_and_split "_shift_reverse"
>
r14-9122-g67a29f99cc8138 disabled scheduling on a lot of testcases
for RISC-V for PR113249 but using dg-options. This makes
gfortran.dg/vect/vect-8.f90 UNRESOLVED as it relies on default
flags to enable vectorization.
The following uses dg-additional-options instead.
Tested on riscv64-linux with
>From 0130d3cb01fd9d5c1c997003245ed57bbdeb00a2 Mon Sep 17 00:00:00 2001
From: Aleksandar
Date: Fri, 23 Aug 2024 11:36:50 +0200
Subject: [PATCH] [Bug tree-optimization/109429] ivopts: fixed complexities
This patch addresses a bug introduced in commit f9f69dd by
correcting the complexity calculatio
Implement vsbcq vsbciq using the new MVE builtins framework.
We re-use most of the code introduced by the previous patches.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): Add
support for vsbciq and vsbcq.
(vadciq,
Factorize vadc/vsbc and vadci/vsbci so that they use the same
parameterized names.
2024-08-28 Christophe Lyon
gcc/
* config/arm/iterators.md (mve_insn): Add VADCIQ_M_S, VADCIQ_M_U,
VADCIQ_U, VADCIQ_S, VADCQ_M_S, VADCQ_M_U, VADCQ_S, VADCQ_U,
VSBCIQ_M_S, VSBCIQ_M_
Implement vadcq using the new MVE builtins framework.
We re-use most of the code introduced by the previous patch to support
vadciq: we just need to initialize carry from the input parameter.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (vadcq_vsbc):
Since we rewrote the implementation of vshlcq intrinsics, we no longer
need these expanders.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-builtins.cc
(arm_ternop_unone_none_unone_imm_qualifiers)
(-arm_ternop_none_none_unone_imm_qualifiers): Delete.
*
Implement vshlc using the new MVE builtins framework.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (class vshlc_impl): New.
(vshlc): New.
* config/arm/arm-mve-builtins-base.def (vshlcq): New.
* config/arm/arm-mve-builtins-base.h
Implement vdwdup and viwdup using the new MVE builtins framework.
In order to share more code with viddup_impl, the patch swaps operands
1 and 2 in @mve_v[id]wdupq_m_wb_u_insn, so that the parameter
order is similar to what @mve_v[id]dupq_m_wb_u_insn uses.
2024-08-28 Christophe Lyon
g
This patch adds the vshlc shape description.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-shapes.cc (vshlc): New.
* config/arm/arm-mve-builtins-shapes.h (vshlc): New.
---
gcc/config/arm/arm-mve-builtins-shapes.cc | 44 +++
gcc/confi
Testing v[id]wdup overloads with '1' as argument for uint32_t* does
not make sense: this patch adds a new 'uint32_t *a' parameter to foo2
in such tests.
The difference with v[id]dup tests (where we removed 'foo2') is that
in 'foo1' we test the overload with a variable 'wrap' parameter (b)
and we n
Factorize vdwdup and viwdup so that they use the same parameterized
names.
Like with vddup and vidup, we do not bother with the corresponding
expanders, as we stop using them in a subsequent patch.
The patch also adds the missing attributes to vdwdupq_wb_u_insn and
viwdupq_wb_u_insn patterns.
20
In several places we are looking for a type twice or half as large as
the type suffix: this patch introduces helper functions to avoid code
duplication. long_type_suffix is similar to the SVE counterpart, but
adds an 'expected_tclass' parameter. half_type_suffix is similar to
it, but does not exis
This patch adds the vidwdup shape description for vdwdup and viwdup.
It is very similar to viddup, but accounts for the additional 'wrap'
scalar parameter.
2024-08-21 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-shapes.cc (vidwdup): New.
* config/arm/arm-mve-buil
This patch adds the vadc_vsbc shape description.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-shapes.cc (vadc_vsbc): New.
* config/arm/arm-mve-builtins-shapes.h (vadc_vsbc): New.
---
gcc/config/arm/arm-mve-builtins-shapes.cc | 36 ++
Implement vadciq using the new MVE builtins framework.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (class vadc_vsbc_impl): New.
(vadciq): New.
* config/arm/arm-mve-builtins-base.def (vadciq): New.
* config/arm/arm-mve-builtins-b
This patch adds the viddup shape description for vidup and vddup.
This requires the addition of report_not_one_of and
function_checker::require_immediate_one_of to
gcc/config/arm/arm-mve-builtins.cc (they are copies of the aarch64 SVE
counterpart).
This patch also introduces MODE_wb.
2024-08-21
Like with vddup/vidup, we use code_for_mve_q_wb_u_insn, so we can drop
the expanders and their declarations as builtins, now useless.
2024-08-28 Christophe Lyon
gcc/
* config/arm/arm-builtins.cc
(arm_quinop_unone_unone_unone_unone_imm_pred_qualifiers): Delete.
*
Factorize vddup and vidup so that they use the same parameterized
names.
This patch updates only the (define_insn
"@mve_q_u_insn") patterns and does not bother with the
(define_expand "mve_vidupq_n_u") ones, because a subsequent
patch avoids using them.
2024-08-21 Christophe Lyon
gcc/
Implement vcvtaq vcvtmq vcvtnq vcvtpq using the new MVE builtins
framework.
2024-07-11 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (vcvtaq): New.
(vcvtmq): New.
(vcvtnq): New.
(vcvtpq): New.
* config/arm/arm-mve-builtins-base.def (
Implement vddup and vidup using the new MVE builtins framework.
We generate better code because we take advantage of the two outputs
produced by the v[id]dup instructions.
For instance, before:
ldr r3, [r0]
sub r2, r3, #8
str r2, [r0]
mov r2, r3
On Wed, 4 Sep 2024, Evgeny Karpov wrote:
Monday, September 2, 2024
Martin Storsjö wrote:
The only non-obvious thing, is that for IMAGE_REL_ARM64_PAGEBASE_REL21,
i.e. "adrp" instructions, the immediate that gets stored in the
instruction, is the byte offset to the symbol.
After linking, when
We use code_for_mve_q_u_insn, rather than the expanders used by the
previous implementation, so we can remove the expanders and their
declaration as builtins.
2024-08-21 Christophe Lyon
gcc/
* config/arm/arm_mve_builtins.def (vddupq_n_u, vidupq_n_u)
(vddupq_m_n_u, vidup
Implement vctp using the new MVE builtins framework.
2024-08-21 Christophe Lyon
gcc/ChangeLog:
* config/arm/arm-mve-builtins-base.cc (class vctpq_impl): New.
(vctp16q): New.
(vctp32q): New.
(vctp64q): New.
(vctp8q): New.
* config/arm/arm-mve-bui
As discussed in [1], it is better to use "su64" for immediates in
intrinsics signatures in order to provide better diagnostics
(erroneous constants are not truncated for instance). This patch thus
uses su64 instead of ss32 in binary_lshift_unsigned,
binary_rshift_narrow, binary_rshift_narrow_unsig
This patch brings no functional change but removes some code
duplication in arm-mve-builtins-functions.h and makes it easier to
read and maintain.
It introduces a new expand_unspec () member of
unspec_based_mve_function_base and makes a few classes inherit from it
instead of function_base.
This a
Implement vorn using the new MVE builtins framework.
2024-07-11 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (vornq): New.
* config/arm/arm-mve-builtins-base.def (vornq): New.
* config/arm/arm-mve-builtins-base.h (vornq): New.
* config/arm/
Testing v[id]dup overloads with '1' as argument for uint32_t* does not
make sense: instead of choosing the '_wb' overload, we choose the
'_n', but we already do that in the '_n' tests.
This patch removes all such bogus foo2 functions.
2024-08-28 Christophe Lyon
gcc/testsuite/
Implement vbicq using the new MVE builtins framework.
2024-07-11 Christophe Lyon
gcc/
* config/arm/arm-mve-builtins-base.cc (vbicq): New.
* config/arm/arm-mve-builtins-base.def (vbicq): New.
* config/arm/arm-mve-builtins-base.h (vbicq): New.
* config/arm