From: yulong
This patch adds the norelax function attribute that was discussed in
riscv-c-api-doc PR#94.
URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94
gcc/ChangeLog:
* config/riscv/riscv.cc (riscv_declare_function_name): Add new
attribute.
---
gcc/config/riscv/riscv.cc
Tested on x86_64-darwin, pushed to trunk, thanks
Iain
--- 8< ---
cdtor_record needs to have an unsigned entry for the position in order to
match with vec_safe_length.
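
A minimal sketch of why the signedness matters, assuming a simplified
record and std::vector in place of GCC's vec. The names record_sketch,
push_record and positions_in_range are hypothetical and are not the code
in config/darwin.cc. A length is unsigned, so keeping the position
unsigned avoids mixed signed/unsigned comparisons:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the record described above; the real
// cdtor_record has fields that are not shown in this message.
struct record_sketch {
  int priority;
  unsigned position;  // unsigned, so it compares cleanly with a length
};

// Append a record whose position is the current length of the vector.
inline void push_record(std::vector<record_sketch> &v, int priority) {
  unsigned len = static_cast<unsigned>(v.size());
  v.push_back({priority, len});
}

// Every stored position must be a valid index into the vector.
inline bool positions_in_range(const std::vector<record_sketch> &v) {
  for (const record_sketch &r : v)
    if (r.position >= v.size())
      return false;
  return true;
}
```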
gcc/ChangeLog:
* config/darwin.cc (cdtor_record): Make position unsigned.
Signed-off-by: Iain Sandoe
---
gcc/config/d
Changes since v1:
- Updated the error message to mention that arm_mve_types.h needs to be
included.
- Corrected some spelling errors in commit message.
As the warning for pure functions returning void is not related to this
patch, I'll leave it for you Christophe to look into. :)
Ok for trunk
On Thu, Nov 7, 2024 at 11:13 AM Tejas Belagod wrote:
>
> On 11/7/24 2:36 PM, Richard Biener wrote:
> > On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote:
> >>
> >> On 11/6/24 6:02 PM, Richard Biener wrote:
> >>> On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod
> >>> wrote:
>
> Ensure si
Thanks for doing this!
> On Nov 8, 2024, at 00:19, shiyul...@iscas.ac.cn wrote:
>
> From: yulong
>
> This patch adds the norelax function attribute that was discussed in
> riscv-c-api-doc PR#94.
> URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull/94
>
> gcc/ChangeLog:
>
> * config/
On Tue, 5 Nov 2024, Jennifer Schmitz wrote:
> We are working on a patch to improve the codegen for the following test case:
> uint64x2_t foo (uint64x2_t r) {
> uint32x4_t a = vreinterpretq_u32_u64 (r);
> uint32_t t;
> t = a[0]; a[0] = a[1]; a[1] = t;
> t = a[2]; a[2] = a[3]; a[3] =
On 06/11/2024 19:50, Torbjorn SVENSSON wrote:
>
>
> On 2024-11-06 19:06, Richard Earnshaw (lists) wrote:
>> On 06/11/2024 13:50, Torbjorn SVENSSON wrote:
>>>
>>>
>>> On 2024-11-06 14:04, Richard Earnshaw (lists) wrote:
On 06/11/2024 12:23, Torbjorn SVENSSON wrote:
>
>
> On 2024-1
On 11/7/24 2:15 AM, Richard Biener wrote:
ext-dce uses TV_NONE, that's not OK for a pass taking 33% compile-time.
The following adds a timevar to it for proper blaming.
Bootstrap running on x86_64-unknown-linux-gnu.
PR rtl-optimization/117467
* timevar.def (TV_EXT_DCE): New.
Ok for trunk and releases/gcc-14?
--
When building the test case with neon, the 'vst1.32' instruction is used
instead of 'strd'. Allow both variants to make the test pass.
gcc/testsuite/ChangeLog:
* gcc.target/arm/pr40457-2.c: Add vst1.32 as an allowed
instruction.
Signed-off-b
"Robin Dapp" writes:
>>> If the problem is tracking liveness, wouldn't it be better to
>>> iterate over the "then" block in reverse order? We would start
>>> with the liveness set for the join block and update as we move
>>> backwards through the "then" block. This liveness set would
>>> tell us
From: Robin Dapp
This adds zero else operands to masked loads and their intrinsics.
I needed to adjust more than initially thought because we rely on
combine for several instructions and a change in a "base" pattern
needs to propagate to all those.
gcc/ChangeLog:
* config/aarch64/aarch6
From: Robin Dapp
Hi,
changes from v3:
- Check if we support vec_cond_expr for the selected mode in case we
need to set the inactive elements to zero.
- Add another undef operand to gcn.
- Remove unnecessary changes in i386 patch.
Robin Dapp (8):
docs: Document maskload else operand and beh
From: Robin Dapp
This patch amends the documentation for masked loads (maskload,
vec_mask_load_lanes, and mask_gather_load as well as their len
counterparts) with an else operand.
gcc/ChangeLog:
* doc/md.texi: Document masked load else operand.
---
gcc/doc/md.texi | 63
On Thu, 7 Nov 2024, Tamar Christina wrote:
> > -----Original Message-----
> > From: Richard Biener
> > Sent: Wednesday, November 6, 2024 2:32 PM
> > To: gcc-patches@gcc.gnu.org
> > Cc: RISC-V CI ; Tamar Christina
> > ; Richard Sandiford
> > Subject: [PATCH 5/5] Allow multiple vectorized epilogs
From: Robin Dapp
This patch adds an else operand to vectorized masked load calls.
The current implementation adds else-value arguments to the respective
target-querying functions that are used to supply the vectorizer with the
proper else value.
We query the target for its supported else operand
From: Robin Dapp
gcc/ChangeLog:
* config/i386/sse.md (maskload):
Call maskload..._1.
(maskload_1): Rename.
---
gcc/config/i386/sse.md | 21 ++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.m
From: Robin Dapp
This patch adds else operands to masked loads. Currently the default
else operand predicate just accepts "undefined" (i.e. SCRATCH) values.
PR middle-end/115336
PR middle-end/116059
gcc/ChangeLog:
* config/riscv/autovec.md: Add else operand.
*
From: Robin Dapp
This patch adds else-operand handling to the internal functions.
gcc/ChangeLog:
* internal-fn.cc (add_mask_and_len_args): Rename...
(add_mask_else_and_len_args): ...to this and add else handling.
(expand_partial_load_optab_fn): Use adjusted function.
From: Robin Dapp
This patch adds an undefined else operand to the masked loads.
gcc/ChangeLog:
* config/gcn/predicates.md (maskload_else_operand): New
predicate.
* config/gcn/gcn-valu.md: Use new predicate.
---
gcc/config/gcn/gcn-valu.md | 23 +++
From: Robin Dapp
When predicating a load we implicitly assume that the else value is
zero. This matters in case the loaded value is padded (like e.g.
a Bool) and we must ensure that the padding bytes are zero on targets
that don't implicitly zero inactive elements.
A former version of this patc
On 07/11/2024 17:57, Robin Dapp wrote:
From: Robin Dapp
This patch adds an undefined else operand to the masked loads.
gcc/ChangeLog:
* config/gcn/predicates.md (maskload_else_operand): New
predicate.
* config/gcn/gcn-valu.md: Use new predicate.
---
gcc/config/gcn/gc
These maps will always be non-null in btf_finalize under normal
circumstances, but be safe and verify that before trying to empty them.
Tested on x86_64-linux-gnu and x86_64-linux-gnu host for bpf-unknown-none
target. Pushed as obvious.
gcc/
* btfout.cc (btf_finalize): Check that hash map
On 2024-11-07 16:33, Richard Earnshaw (lists) wrote:
On 06/11/2024 19:50, Torbjorn SVENSSON wrote:
On 2024-11-06 19:06, Richard Earnshaw (lists) wrote:
On 06/11/2024 13:50, Torbjorn SVENSSON wrote:
On 2024-11-06 14:04, Richard Earnshaw (lists) wrote:
On 06/11/2024 12:23, Torbjorn SVENS
The BPF-specific .BTF.ext section is always generated for BPF programs
if -gbtf is specified, and generating it requires BTF information and
assumes that the BTF info has already been generated.
Compiling non-C languages to BPF is not supported, nor is generating
CTF/BTF for non-C. But, compiling
On Fri, Nov 1, 2024 at 4:06 PM Andrew Pinski wrote:
>
> On Tue, Oct 29, 2024 at 10:10 AM Andrew Pinski wrote:
> >
> > On Tue, Oct 29, 2024 at 5:59 AM Richard Biener
> > wrote:
> > >
> > > On Tue, Oct 29, 2024 at 4:29 AM Andrew Pinski
> > > wrote:
> > > >
> > > > r0-126134-g5d2a9da9a7f7c1 added
Does this approach seem reasonable? I'm pretty sure that the way I've
handled the templating here is unideal but I'm not sure what a neat way
to do what I'm trying to do here would be; any comments are welcome.
-- >8 --
Currently, concept failures of standard type traits just report
'expression
On Thu, Nov 7, 2024 at 12:50 AM Richard Biener
wrote:
>
> On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski
> wrote:
> >
> > After the last patch, we also want to record `(A CMP B) != 0`
> > as `(A CMP B)` and `(A CMP B) == 0` as `(A CMP B)` with the
> > true/false edges swapped.
> >
> > This shows
Hi Faust.
Thanks for the patch. OK for master.
> The BPF-specific .BTF.ext section is always generated for BPF programs
> if -gbtf is specified, and generating it requires BTF information and
> assumes that the BTF info has already been generated.
>
> Compiling non-C languages to BPF is not sup
Hi All,
When the patch for PR114074 was applied we saw a good boost in exchange2.
This boost was partially caused by a simplification of the addressing modes.
With the patch applied IV opts saw the following form for the base addressing;
Base: (integer(kind=4) *) &block + ((sizetype) ((unsigne
On 11/7/24 8:07 AM, Tamar Christina wrote:
-----Original Message-----
From: Li, Pan2
Sent: Thursday, November 7, 2024 12:57 PM
To: Tamar Christina ; Richard Biener
Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
jeffreya...@gmail.com; rdapp@gmail.com
Subject: R
This is the followup as mentioned in
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667987.html .
We need to canonicalize the compares using tree_swap_operands_p instead
of checking CONSTANT_CLASS_P.
Bootstrapped and tested on x86_64-linux-gnu.
gcc/ChangeLog:
* tree-ssa-sccvn.cc
On Thu, 7 Nov 2024 at 20:35, Torbjörn SVENSSON
wrote:
>
> The generated assembler is:
>
> armv7-m:
> push {r4, lr}
> ldr r4, .L6
> ldr r4, [r4]
> lsls r4, r4, #29
> it mi
> addmi r2, r2, #1
> bl bar
> movs
On Wed, Nov 06, 2024 at 06:06:46PM +, Joseph Myers wrote:
> On Wed, 6 Nov 2024, Marek Polacek wrote:
>
> > On Wed, Nov 06, 2024 at 09:42:02AM -0500, Marek Polacek wrote:
> > > On reflection, I'm not so sure about these anymore:
> > >
> > > On Mon, Nov 04, 2024 at 06:26:47PM -0500, Marek Polac
subg (Subtract with Tag) is an Armv8.5-A memory tagging (MTE)
instruction. It can be used to subtract an immediate value scaled by
the tag granule from the address in the source register.
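
As a rough illustration of the semantics described above, here is a
simplified software model. It assumes a 16-byte tag granule, a logical
tag held in address bits [59:56], and no excluded tags, so the tag
arithmetic is plain modulo 16. The name subg_model and its parameters
are hypothetical; this is a sketch, not the architectural pseudocode.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGranuleBytes = 16;  // assumed MTE tag granule size
constexpr unsigned kTagShift = 56;           // assumed logical tag: bits [59:56]
constexpr std::uint64_t kTagMask = std::uint64_t{0xF} << kTagShift;

// Simplified model of: subg xd, xn, #(granules * 16), #tag_offset
// Assumption: no tags are excluded, so the tag simply wraps modulo 16.
inline std::uint64_t subg_model(std::uint64_t xn, unsigned granules,
                                unsigned tag_offset) {
  std::uint64_t old_tag = (xn & kTagMask) >> kTagShift;
  std::uint64_t new_tag = (old_tag + tag_offset) & 0xF;
  // Subtract the scaled immediate from the address.
  std::uint64_t result = xn - granules * kGranuleBytes;
  // Insert the (possibly adjusted) tag into the result.
  return (result & ~kTagMask) | (new_tag << kTagShift);
}
```

With a tag offset of 0 the tag is preserved and only the address moves
down by a whole number of granules.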
gcc/ChangeLog:
* config/aarch64/aarch64.md (subg): New definition.
---
gcc/config/aarch64/aarch64.m
Store Allocation Tags (st2g) is an Armv8.5-A memory tagging (MTE)
instruction. It stores an allocation tag to two tag granules of memory.
TBD:
- Not yet sure what the best way to generate st2g is; a
subsequent patch will emit them in one of the target hooks.
- the current define_in
Add new command line option -fsanitize=memtag with the following
new params:
--param memtag-instrument-stack [0,1] (default 1) to use MTE
insns for enabling dynamic checking of stack variables.
--param memtag-instrument-alloca [0,1] (default 1) to use MTE
insns for enabling dynamic checking of st
libstdc++-v3/ChangeLog:
* include/bits/stl_pair.h (__is_pair): Define for C++11 and
C++14 as well.
---
Tested powerpc64le-linux. Pushed to trunk.
libstdc++-v3/include/bits/stl_pair.h | 6 ++
1 file changed, 6 insertions(+)
diff --git a/libstdc++-v3/include/bits/stl_pair.h
b
Memory tagging is used for detecting memory safety bugs. On AArch64, the
memory tagging extension (MTE) helps in reducing the overheads of memory
tagging:
- CPU: MTE instructions for efficiently tagging and untagging memory.
- Memory: New memory type, Normal Tagged Memory, added to the Arm
Ar
Check for SANITIZER_MEMTAG in the gate function for pass_asan gimple
pass; enable it.
TBD:
- This commit was initially carved out in order to ensure each patch
works in isolation. Need to revisit and double check this.
gcc/ChangeLog:
* asan.cc (memtag_sanitize_p): Fix definition.
Currently, the data type of sanitizer flags is unsigned int, with
SANITIZE_SHADOW_CALL_STACK (1UL << 31) being highest individual
enumerator for enum sanitize_code. Use 'unsigned HOST_WIDE_INT' data
type to allow for more distinct instrumentation modes to be added when
needed.
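
A small sketch of the limit being described, assuming a 32-bit
'unsigned int' and a 64-bit 'unsigned HOST_WIDE_INT'. The aliases and
the kHypotheticalNewMode bit are illustrative only and do not exist in
GCC:

```cpp
#include <cassert>
#include <cstdint>

using flags32 = std::uint32_t;  // stands in for 'unsigned int'
using flags64 = std::uint64_t;  // stands in for 'unsigned HOST_WIDE_INT'

// Highest bit available in the 32-bit scheme (1 << 31).
constexpr flags64 kShadowCallStack = flags64{1} << 31;
// Hypothetical next mode: needs bit 32, which only the wider type can hold.
constexpr flags64 kHypotheticalNewMode = flags64{1} << 32;

// True if the flag set survives a round trip through the 32-bit type.
inline bool fits_in_32(flags64 f) {
  return static_cast<flags64>(static_cast<flags32>(f)) == f;
}
```

Once bit 31 is taken, any further mode is silently lost when the flags
are stored in the narrower type, which is the motivation for widening.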
FIXME:
1. Is using d_u
Add a new target hook TARGET_MEMTAG_TAG_MEMORY to tag (and untag)
memory. The default implementation is empty.
Hardware-assisted sanitizers on architectures providing instructions to
tag/untag memory can then make use of this target hook. On AArch64,
e.g., the MEMTAG sanitizer will use this hook
MEMTAG sanitizer, which is based on the HWASAN sanitizer, will invoke
the target-specific hooks to create a random tag, add tag to memory
address, and finally tag and untag memory.
Implement the target hooks to emit MTE instructions if MEMTAG sanitizer
is in effect. Continue to use the default ta
On 10/30/24 3:17 AM, Jakub Jelinek wrote:
Hi!
Since C++20 virtual methods can be constexpr, and if they are
constexpr evaluated, we choose tentative_decl_linkage for those,
defer their output and decide at_eof again.
On the following testcases we ICE though, because if
expand_or_defer_fn_1 decide
Add basic tests for MEMTAG sanitizer. MEMTAG sanitizer uses target
hooks to emit AArch64 specific MTE instructions.
Add new target-specific tests.
The currently generated code has quite a few limitations:
1. For basic-1.c testcase, currently we generate:
subg x0, x0, #16, #0
Hi,
Sending the current state of the work.
I would like to get feedback on whether this is generally the right
direction of adding the MEMTAG sanitizer in GCC. I have added some
TBD/FIXME notes to each commit log. These are some of the things I am
aware of and need to be resolved. Please let m
The conversions to key_type and value_type that are performed when
inserting into _Hashtable need to be fixed to do any required
conversions explicitly. The current code assumes that conversions from
the parameter to the key_type or value_type can be done implicitly,
which isn't necessarily true.
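
To illustrate the kind of argument this is about, here is a sketch with
a type that converts to the key type only explicitly. Wrapped and
make_key are hypothetical names and are not part of _Hashtable:

```cpp
#include <cassert>
#include <type_traits>

// An argument type whose conversion to the key type (int) is explicit.
struct Wrapped {
  int id;
  explicit operator int() const { return id; }
};

// An implicit conversion is not available...
static_assert(!std::is_convertible<Wrapped, int>::value,
              "Wrapped must not convert to int implicitly");

// ...so the conversion has to be spelled out, as with static_cast here.
template <typename Key, typename Arg>
Key make_key(const Arg &arg) {
  return static_cast<Key>(arg);
}
```

Code that writes `Key k = arg;` would fail to compile for Wrapped, while
the explicit form works for both implicit and explicit conversions.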
Clarify the effects if rehashing is needed. Document the __n_elt
parameter.
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h (_M_insert_unique_node): Improve
comment.
---
Pushed as obvious.
libstdc++-v3/include/bits/hashtable.h | 7 +--
1 file changed, 5 insertions(+), 2 d
"Yuta Mukai (Fujitsu)" writes:
> Thank you for pushing to trunk.
> Can I also ask for a backport to GCC14?
>
> I have attached the patch for GCC14.
> FP8 has been excluded from the list as it is not supported in GCC14.
>
> Bootstrapped/regtested on aarch64-unknown-linux-gnu.
LGTM, thanks. Pushed
On Thu, 7 Nov 2024 at 19:09, Torbjorn SVENSSON
wrote:
>
>
>
> On 2024-11-07 16:33, Richard Earnshaw (lists) wrote:
> > On 06/11/2024 19:50, Torbjorn SVENSSON wrote:
> >>
> >>
> >> On 2024-11-06 19:06, Richard Earnshaw (lists) wrote:
> >>> On 06/11/2024 13:50, Torbjorn SVENSSON wrote:
>
>
I realised that _M_merge_unique and _M_merge_multi call extract(iter)
which then has to call _M_get_previous_node to iterate through the
bucket to find the node before the one iter points to. Since the merge
function is already iterating over the entire container, we had the
previous node a moment
On Thu, 7 Nov 2024 at 18:33, Torbjorn SVENSSON
wrote:
>
>
>
> On 2024-11-07 11:40, Christophe Lyon wrote:
> > Hi Torbjörn,
> >
> > On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON
> > wrote:
> >>
> >> Ok for trunk and releases/gcc-14?
> >>
> >> --
> >>
> >> Test uses MVE, so add effective-target a
The generated assembler is:
armv7-m:
push {r4, lr}
ldr r4, .L6
ldr r4, [r4]
lsls r4, r4, #29
it mi
addmi r2, r2, #1
bl bar
movs r0, #0
pop {r4, pc}
armv8.1-m.main:
push {r3, r4, r5
> -----Original Message-----
> From: Richard Biener
> Sent: Wednesday, November 6, 2024 2:30 PM
> To: gcc-patches@gcc.gnu.org
> Cc: Richard Sandiford ; Tamar Christina
>
> Subject: [PATCH 3/5] Add LOOP_VINFO_MAIN_LOOP_INFO
>
> The following introduces LOOP_VINFO_MAIN_LOOP_INFO alongside
> LOOP_V
On 2024-11-07 11:40, Christophe Lyon wrote:
Hi Torbjörn,
On Thu, 31 Oct 2024 at 19:34, Torbjörn SVENSSON
wrote:
Ok for trunk and releases/gcc-14?
--
Test uses MVE, so add effective-target arm_fp requirement.
gcc/testsuite/ChangeLog:
* g++.target/arm/mve/general-c++/nomve_fp_1.
> I think it'd be better if I abstain from this. I probably disagree too
> much with the current structure and the way that the code is developing.
> I won't object if anyone else approves it though.
It's not that I'm happy with the current state either and I thought about
how to rewrite it more
> -----Original Message-----
> From: Li, Pan2
> Sent: Thursday, November 7, 2024 1:45 AM
> To: Tamar Christina ; Richard Biener
>
> Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com;
> jeffreya...@gmail.com; rdapp@gmail.com
> Subject: RE: [PATCH v2 01/10] Match: Simplify
On Thu, Nov 7, 2024 at 12:48 AM Richard Biener
wrote:
>
> On Thu, Nov 7, 2024 at 12:43 AM Andrew Pinski
> wrote:
> >
> > For `(a | b) == 0`, we can "assert" on the true edge that
> > both `a == 0` and `b == 0` but nothing on the false edge.
> > For `(a | b) != 0`, we can "assert" on the false ed
LGTM, thanks!, and I will defer this for a little bit to make the
c-api side stable :)
On Fri, Nov 8, 2024 at 12:19 AM wrote:
>
> From: yulong
>
> This patch adds the norelax function attribute that was discussed in
> riscv-c-api-doc PR#94.
> URL:https://github.com/riscv-non-isa/riscv-c-api-doc/pull
Changes since v1:
- Switch to arm_libc_fp_abi from arm_fp
@Christophe, can you test this patch in the linaro farm to ensure that
it does not fail again?
Ok for trunk and releases/gcc-14?
--
This fixes reported regression at
https://linaro.atlassian.net/browse/GNU-1407.
gcc/testsuite/ChangeLog
Thanks Tamar and Jeff for comments.
> I'm not sure it's that simple. It'll depend on the micro-architecture.
> So things like strength of the branch predictors, how fetch blocks are
> handled (can you have embedded not-taken branches, short-forward-branch
> optimizations, etc).
> After:
>
>
On Fri, Nov 8, 2024 at 1:58 AM Robin Dapp wrote:
>
> From: Robin Dapp
>
> gcc/ChangeLog:
>
> * config/i386/sse.md (maskload):
> Call maskload..._1.
> (maskload_1): Rename.
Ok for x86 part.
> ---
> gcc/config/i386/sse.md | 21 ++---
> 1 file changed, 18 ins
> Could you walk me through the failure in more detail? It sounds
> like can_duplicate_and_interleave_p eventually gets to the point of
> subdividing the original elements, instead of either combining consecutive
> elements (the best case), or leaving them as-is (the expected fallback
> for SVE).
Samuel Thibault writes:
> GNU/Mach currently uses a 0..63 range.
>
> gcc/ada/ChangeLog:
>
> * libgnat/system-gnu.ads: New file.
> * Makefile.rtl (x86-gnuhurd): Use libgnat/system-gnu.ads instead of
> libgnat/system-freebsd.ads.
>
> Signed-off-by: Samuel Thibault
> ---
OK witho
Samuel Thibault writes:
> This is essentially the same as the i386-pc-gnu section, the differences
> are the same as between freebsd i386 and freebsd x86_64.
>
> gcc/ada/ChangeLog:
>
> * Makefile.rtl: Add x86_64-pc-gnu section.
>
> Signed-off-by: Samuel Thibault
OK without the ChangeLog
This patch adds the load_ext_gather_offset shape description.
gcc/ChangeLog:
* config/arm/arm-mve-builtins-shapes.cc (struct load_ext_gather):
New.
(struct load_ext_gather_offset_def): New.
* config/arm/arm-mve-builtins-shapes.h (load_ext_gather_offset):
Ne
vstrq_impl derives from store_truncating and vldrq_impl derives from
load_extending which both implement call_properties.
No need to re-implement them in the derived classes.
gcc/ChangeLog:
* config/arm/arm-mve-builtins-base.cc (vstrq_impl): Remove
call_properties.
(vldrq
Soumya AR writes:
> Changes since v1:
>
> This revision makes use of the extended definition of aarch64_ptrue_reg to
> generate predicate registers with the appropriate set bits.
>
> Earlier, there was a suggestion to add support for half floats as well. I
> extended the patch to include HFs but G
On Thu, Nov 07, 2024 at 09:12:34AM +0100, Uros Bizjak wrote:
> On Thu, Nov 7, 2024 at 9:00 AM Jakub Jelinek wrote:
> >
> > On Thu, Nov 07, 2024 at 08:47:34AM +0100, Uros Bizjak wrote:
> > > Maybe we should always recognize "redzone", even for targets without
> > > it. This is the way we recognize
I see your point that the backend can leverage conditional moves to emit the
branch code.
> For instance see https://godbolt.org/z/fvrq3aq6K
> On ISAs with conditional operations the branch version gets ifconverted.
> On AArch64 we get:
> sat_add_u_1(unsigned int, unsigned int):
> adds w0
Wilco Dijkstra writes:
> The IRA combine_and_move pass runs if the scheduler is disabled and
> aggressively
> combines moves. The movsf/df patterns allow all FP immediates since they rely
> on a split pattern. However splits do not happen during IRA, so the result is
> extra literal loads. To
std::is_permutation is only used in <bits/hashtable.h>, not in
<bits/hashtable_policy.h>, so move the comment referring to it.
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h: Add is_permutation to comment.
* include/bits/hashtable_policy.h: Remove it from comment.
---
Pushed as obvious.
libstdc++-v3/include/bits/hasht
As there were no further remarks, I have now committed it as
r15-5017-ge52cfd4bc23de1 with minor changes:
* Referring to v6.0 not TR13 (same section numbers),
* fixed one item in the 5.2 to-do list:
'declare mapper with iterator and present modifiers' comes from Appendix B
and we had before a
On Thu, Nov 07, 2024 at 10:54:40AM +, Andrew Stubbs wrote:
> On 07/11/2024 00:37, haochen.jiang wrote:
> > d334f729e53867b838e867375b3f475ba793d96e is the first bad commit
> > commit d334f729e53867b838e867375b3f475ba793d96e
> > Author: Andrew Stubbs
> > Date: Wed Nov 6 12:26:08 2024 +
>
Tejas Belagod writes:
> This patch adds a test case to cover C/C++ operators on SVE ACLE types. This
> does not cover all types, but covers most representative types.
>
> gcc/testsuite:
>
> * gcc.target/aarch64/sve/acle/general/cops.c: New test.
> ---
> .../aarch64/sve/acle/general/cops.c
Tejas Belagod writes:
> When optimizing for NOPs in case of overlapping regs in VEC_SELECT
> expressions,
> validate subreg data before using simplify_subreg_regno. There is no real
> SUBREG rtx here, but a pseudo subreg call to check if subregs are possible.
>
> gcc/ChangeLog:
>
> * rtlan
Tejas Belagod writes:
> Hi,
>
> This patchset enables C/C++ operations on SVE ACLE types.
I've replied to some of the individual patches, but otherwise the
AArch64 parts look good to me.
Thanks,
Richard
On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote:
>
> On 11/6/24 6:02 PM, Richard Biener wrote:
> > On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod wrote:
> >>
> >> Ensure sizeless types don't end up trying to be canonicalised to
> >> BIT_FIELD_REFs.
> >
> > You mean variable-sized? But don't w
Implement vstr?q_scatter_base using the new MVE builtins framework.
We need to introduce a new iterator (MVE_4) to support the set needed
by vstr?q_scatter_base (V4SI V4SF V2DI).
gcc/ChangeLog:
* config/arm/arm-builtins.cc (arm_strsbs_qualifiers)
(arm_strsbu_qualifiers, arm_strsb
Implement vldr?q_gather_offset using the new MVE builtins framework.
The patch introduces a new attribute iterator (MVE_u_elem) to
accommodate the fact that ACLE's expected output description uses "uNN"
for all modes, except V8HF where it expects ".f16". Using "V_sz_elem"
would work, but would req
Implement vldr?q_gather_base using the new MVE builtins framework.
The patch updates two testcases rather than using different iterators
for predicated and non-predicated versions. According to ACLE:
vldrdq_gather_base_s64 is expected to generate VLDRD.64
vldrdq_gather_base_z_s64 is expected to ge
On Thu, Nov 7, 2024 at 2:49 AM Li, Pan2 wrote:
>
> Hi Richard,
>
> I would like to double confirm about the doc as I am not the native speaker.
> It may be referenced by all other developers and I am not sure if there is
> something misleading or fuzzy.
> Thanks a lot.
The docs look good to me -
Implement vstr?q_scatter_shifted_offset intrinsics using the MVE
builtins framework.
We use the same approach as the previous patch, and we now have four
sets of patterns:
- vector scatter stores with shifted offset (non-truncating)
- predicated vector scatter stores with shifted offset (non-trunc
Implement vstr?q_scatter_base_wb using the new MVE builtins framework.
The patch introduces a new 'b' type for signatures, which
represents the type of the 'base' argument of vstr?q_scatter_base_wb.
gcc/ChangeLog:
* config/arm/arm-builtins.cc (arm_strsbwbs_qualifiers)
(arm_strsbw
Implement vldr?q_gather_base_wb using the new MVE builtins framework.
gcc/ChangeLog:
* config/arm/arm-builtins.cc (arm_ldrgbwbxu_qualifiers)
(arm_ldrgbwbxu_z_qualifiers, arm_ldrgbwbs_qualifiers)
(arm_ldrgbwbu_qualifiers, arm_ldrgbwbs_z_qualifiers)
(arm_ldrgbwbu_z_q
This patch adds the store_scatter_base shape description.
gcc/ChangeLog:
* config/arm/arm-mve-builtins-shapes.cc (store_scatter_base): New.
* config/arm/arm-mve-builtins-shapes.h (store_scatter_base): New.
---
gcc/config/arm/arm-mve-builtins-shapes.cc | 49 +++
On Thu, Nov 07, 2024 at 11:31:17AM +, Andrew Stubbs wrote:
> Anyway, I think the attached patch should fix it. It passes on my
> configuration, but I don't have a Cascade Lake.
You could have tested with whatever you have (if it has AVX) as -march=
> OK?
Yes, thanks.
Jakub
I intended – but forgot – to actually attach the committed patch. Here
it is …
Tobias Burnus wrote:
As there were no further remarks, I have now committed it as
r15-5017-ge52cfd4bc23de1 with minor changes:
* Referring to v6.0 not TR13 (same section numbers),
* fixed one item in the 5.2 to-do l
Hi!
The following patch on top of the
https://gcc.gnu.org/pipermail/gcc-patches/2024-November/667949.html
patch adds rs6000 part of the support (the only other target I'm aware of
which clearly has red zone as well).
2024-11-07 Jakub Jelinek
* config/rs6000/rs6000.h (struct machine_fu
On Thu, Nov 7, 2024 at 5:50 AM H.J. Lu wrote:
>
> On Wed, Nov 6, 2024 at 6:01 PM Richard Biener
> wrote:
> >
> > On Wed, Nov 6, 2024 at 10:52 AM H.J. Lu wrote:
> > >
> > > On Wed, Nov 6, 2024 at 4:29 PM Richard Biener
> > > wrote:
> > > >
> > > > On Tue, Nov 5, 2024 at 10:50 PM H.J. Lu wrote:
And tweak grammar in a couple of comments.
libstdc++-v3/ChangeLog:
* include/bits/hashtable.h: Fix spelling in comment.
---
Pushed as obvious.
libstdc++-v3/include/bits/hashtable.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/libstdc++-v3/include/bits/hashtabl
Implement vldr?q_gather_shifted_offset using the new MVE builtins
framework.
gcc/ChangeLog:
* config/arm/arm-builtins.cc (arm_ldrgu_qualifiers)
(arm_ldrgs_qualifiers, arm_ldrgs_z_qualifiers)
(arm_ldrgu_z_qualifiers): Delete.
* config/arm/arm-mve-builtins-base.cc (v
This new helper returns true if the mode suffix goes after the
predicate suffix. This is true in most cases, so the base
implementations in nonoverloaded_base and overloaded_base return true.
For instance: vaddq_m_n_s32.
This will be useful in later patches to implement
vstr?q_scatter_offset_p (_
ext-dce uses TV_NONE, that's not OK for a pass taking 33% compile-time.
The following adds a timevar to it for proper blaming.
Bootstrap running on x86_64-unknown-linux-gnu.
PR rtl-optimization/117467
* timevar.def (TV_EXT_DCE): New.
* ext-dce.cc (pass_data_ext_dce): Use T
This patch adds support to check that an immediate is a multiple of a
given value in a given range.
This will be used for instance by scatter_base to check that offset is
in +/-4*[0..127].
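
A minimal sketch of such a check, assuming the range bounds are given in
bytes and are inclusive. in_range_multiple and
valid_scatter_base_offset are hypothetical helpers, not the GCC
functions named in this patch:

```cpp
#include <cassert>

// True if value lies in [lo, hi] and is a multiple of `multiple`.
// Signed bounds are accepted so that negative offsets can be checked.
inline bool in_range_multiple(long long value, long long lo, long long hi,
                              long long multiple) {
  if (multiple <= 0 || lo > hi)
    return false;
  return value >= lo && value <= hi && value % multiple == 0;
}

// The scatter_base example from the text: +/-4*[0..127],
// i.e. multiples of 4 between -508 and 508.
inline bool valid_scatter_base_offset(long long offset) {
  return in_range_multiple(offset, -4 * 127, 4 * 127, 4);
}
```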
Unlike require_immediate_range, require_immediate_range_multiple
accepts signed range bounds to handle the a
This patch adds the store_scatter_offset shape and uses a new helper
class (store_scatter), which will also be used by later patches.
gcc/ChangeLog:
* config/arm/arm-mve-builtins-shapes.cc (struct store_scatter): New.
(struct store_scatter_offset_def): New.
* config/arm/ar
Samuel Thibault writes:
> They are all the same on all BSD-like systems (including GNU/Hurd).
>
> gcc/ada/ChangeLog:
>
> * libgnarl/a-intnam__freebsd.ads: Rename to...
> * libgnarl/a-intnam__bsd.ads: ... new file.
> * libgnarl/a-intnam__dragonfly.ads: Remove file.
> * Make
On 11/7/24 2:36 PM, Richard Biener wrote:
On Thu, Nov 7, 2024 at 8:25 AM Tejas Belagod wrote:
On 11/6/24 6:02 PM, Richard Biener wrote:
On Wed, Nov 6, 2024 at 12:49 PM Tejas Belagod wrote:
Ensure sizeless types don't end up trying to be canonicalised to BIT_FIELD_REFs.
You mean variable-
Hi,
On Fri, 1 Nov 2024 at 22:10, Torbjörn SVENSSON
wrote:
>
> There is one more problem, that this patch does not address, and that is
> that there are warnings like below, but I do not know what's causing them.
>
> .../gcc/testsuite/gcc.target/arm/pr117408-1.c:8:9: warning: 'pure' attribute
>
Bootstrapped and lightly regtested on x86_64-pc-linux-gnu (so far just
dg.exp), OK for trunk if full regtest succeeds?
-- >8 --
Decomposition of lambda closure types is not allowed by
[dcl.struct.bind] p6, since members of a closure have no name.
r244909 made this an error, but missed the case w
> -----Original Message-----
> From: Richard Biener
> Sent: Wednesday, November 6, 2024 2:32 PM
> To: gcc-patches@gcc.gnu.org
> Cc: RISC-V CI ; Tamar Christina
> ; Richard Sandiford
> Subject: [PATCH 5/5] Allow multiple vectorized epilogs via --param
> vect-epilogues-
> nomask=N
>
> The followi