From: Pan Li
This patch combines vec_duplicate + vmul.vv into vmul.vx; see the
example code below. The related pattern depends on the cost of
vec_duplicate from GR2VR: late-combine performs the combination
if the cost of GR2VR is zero, and rejects the combination
if the
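As a rough illustration (not taken from the patch), a loop of the following shape multiplies each element by a loop-invariant scalar. When such a loop is auto-vectorized for RVV, the scalar may be broadcast with vec_duplicate and multiplied with vmul.vv, which is the sequence this combine would turn into vmul.vx. The function name `mul_by_scalar` is hypothetical, and whether the combine actually fires depends on the GR2VR cost.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical example, not from the patch: multiply every element of `in`
// by the loop-invariant scalar `x`. A vectorizer may broadcast `x` into a
// vector register (vec_duplicate) and use a vector-vector multiply; a
// vector-scalar multiply (vmul.vx) makes the broadcast unnecessary.
void mul_by_scalar(std::int16_t *out, const std::int16_t *in,
                   std::int16_t x, std::size_t n)
{
  // Precondition: both pointers are valid whenever there is work to do.
  assert(n == 0 || (out != nullptr && in != nullptr));
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<std::int16_t>(in[i] * x);
}
```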
On Wed, May 28, 2025 at 5:14 PM Tomasz Kaminski wrote:
>
>
> On Wed, May 28, 2025 at 4:53 PM Patrick Palka wrote:
>
>> On Wed, 28 May 2025, Tomasz Kamiński wrote:
>>
>> > This patch adjusts the passing of parameters for the move_only_function,
>> > copyable_function and function_ref. For types th
From: Pan Li
Add asm dump check tests for the vec_duplicate + vmul.vv combine to vmul.vx,
with GR2VR costs of 0, 1 and 2.
The following test suites pass with this patch:
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-4-i16.c: Add as
From: Pan Li
Add asm dump check tests for the vec_duplicate + vmul.vv combine to vmul.vx,
with GR2VR costs of 0, 2 and 15.
The following test suites pass with this patch:
* The rv64gcv full regression test.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/rvv/autovec/vx_vf/vx-1-i16.c: Add a
From: Pan Li
This patch introduces the combine of vec_dup + vmul.vv into vmul.vx,
based on the cost value of GR2VR. The late-combine takes place if
the cost of GR2VR is zero, and is rejected if the cost is non-zero, such as
1 or 15 in the tests. There are two cases for the combine:
Case 0:
| .
On Wed, May 28, 2025 at 4:53 PM Patrick Palka wrote:
> On Wed, 28 May 2025, Tomasz Kamiński wrote:
>
> > This patch adjusts the passing of parameters for the move_only_function,
> > copyable_function and function_ref. For types that are declared as being
> > passed by value in signature template
This patch adjusts the passing of parameters for the move_only_function,
copyable_function and function_ref. For types that are declared as being passed
by value in signature template argument, they are passed by value to the
invoker when they are small (at most two pointers), trivially move const
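A minimal sketch of the kind of selection described here, assuming the truncated condition continues as "trivially move constructible and trivially destructible" (an assumption, since the text is cut off). The names `pass_by_value_v` and `invoker_param_t` are hypothetical and are not the libstdc++ internals.

```cpp
#include <type_traits>

// Hypothetical trait: should a by-value signature parameter of type T be
// handed to the invoker by value? The size limit (two pointers) comes from
// the description; the triviality checks are an assumption.
template<typename T>
inline constexpr bool pass_by_value_v =
  sizeof(T) <= 2 * sizeof(void*)
  && std::is_trivially_move_constructible_v<T>
  && std::is_trivially_destructible_v<T>;

// Hypothetical alias: by value when cheap, otherwise by rvalue reference.
template<typename T>
using invoker_param_t = std::conditional_t<pass_by_value_v<T>, T, T&&>;

struct Small { void* a; void* b; };       // exactly two pointers, trivial
struct Big { void* p[3]; };               // larger than two pointers
struct NonTrivial { ~NonTrivial() {} };   // user-provided destructor

static_assert(std::is_same_v<invoker_param_t<Small>, Small>);
static_assert(std::is_same_v<invoker_param_t<Big>, Big&&>);
static_assert(std::is_same_v<invoker_param_t<NonTrivial>, NonTrivial&&>);
```

Passing small trivial types by value lets them travel in registers instead of through a reference to a temporary, which is presumably the motivation for the change.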
On Wed, 28 May 2025, Tomasz Kaminski wrote:
>
>
> On Wed, May 28, 2025 at 4:53 PM Patrick Palka wrote:
> On Wed, 28 May 2025, Tomasz Kamiński wrote:
>
> > This patch adjusts the passing of parameters for the move_only_function,
> > copyable_function and function_ref. For ty
Hi Tobias,
> you will notice that the PR is not recognized. The format as mentioned before
> is "PR component/number". Namely:
Thanks for the reminder! I'll use `-p` to double-check PR numbers going
forward.
> The second part is not what you are doing, you are actually changing the
> call from
Hi,
Could you please let us know if you have any comments
on the latest reply on this patch?
Kind regards,
Aleksandar Rakic
From: Aleksandar Rakic
Sent: Tuesday, April 22, 2025 9:00 PM
To: Jeff Law; gcc-patches@gcc.gnu.org
Cc: Djordje Todoro
This is just a rebase of the v1 patch, currently waiting on a conclusion
of the discussion here:
https://gcc.gnu.org/pipermail/gcc-patches/2025-April/682033.html
Tested as a series on aarch64-linux-gnu and x86_64-linux-gnu. OK for
trunk?
Thanks,
Alex
-- >8 --
This adjusts scale_profile_for_vec
On Mon, May 26, 2025 at 4:15 PM Luc Grosheintz wrote:
> Implement the parts of layout_left that depend on layout_right; and the
> parts of layout_right that don't depend on layout_stride.
>
> libstdc++-v3/ChangeLog:
>
> * include/std/mdspan (layout_right): New class.
> * src/c++23
>
>
> On Tue, May 20, 2025 at 6:32 PM Patrick Palka wrote:
> On Tue, 20 May 2025, Tomasz Kaminski wrote:
>
> > I think I do not have any more suggestions for cases to check, so the
> > impl LGTM.
>
> It's cool how many optimizations we came up with for this algorithm :)
>
>
Sorry for the slow reply, had a few days off.
Xi Ruoyao writes:
> If we see a promoted subreg and TRULY_NOOP_TRUNCATION says the
> truncation is not a noop, then all bits of the inner reg are live. We
> cannot reduce the live mask to that of the mode of the subreg.
>
> gcc/ChangeLog:
>
> P
Add the `+cmpbr` option to enable the FEAT_CMPBR architectural
extension.
gcc/ChangeLog:
* config/aarch64/aarch64-option-extensions.def (cmpbr): New
option.
* config/aarch64/aarch64.h (TARGET_CMPBR): New macro.
* doc/invoke.texi (cmpbr): New option.
---
gcc/config
The `far_branch` attribute only ever takes the values 0 or 1, so make it
a `no/yes` valued string attribute instead.
gcc/ChangeLog:
* config/aarch64/aarch64.md (far_branch): Replace 0/1 with
no/yes.
(aarch64_bcond): Handle rename.
(aarch64_cbz1): Likewise.
Add rules for lowering `cbranch4` to CBB/CBH/CB when
CMPBR extension is enabled.
gcc/ChangeLog:
* config/aarch64/aarch64.md (BRANCH_LEN_P_1Kib): New constant.
(BRANCH_LEN_N_1Kib): Likewise.
(cbranch4): Emit CMPBR instructions if possible.
(cbranch4): New expand rul
On Wed, May 28, 2025 at 08:11:05AM -0700, Jerry D wrote:
> The attached patch is simple and self explanatory in the git log entry.
>
> Regression tested on X86_64-linux-gnu.
>
> OK for trunk?
>
Yes, with one question.
> commit 845768cbead03f76265e491bcf5ea6de7020ff39
> Author: Jerry DeLisle
>
Make the formatting of the RTL templates in the rules for branch
instructions more consistent with each other.
gcc/ChangeLog:
* config/aarch64/aarch64.md (cbranch4): Reformat.
(cbranchcc4): Likewise.
(condjump): Likewise.
(*compare_condjump): Likewise.
(aar
* Added a commit to use HS/LO instead of CS/CC mnemonics.
* Rewrite the range checks for immediate RHSes in aarch64.cc: CBGE,
CBHS, CBLE and CBLS have different ranges of allowed immediates than
the other comparisons
Karl Meakin (10):
AArch64: place branch instruction rules together
AArch
Commit the test file `cmpbr.c` before rules for generating the new
instructions are added, so that the changes in codegen are more obvious
in the next commit.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Add `cmpbr` to the list of extensions.
* gcc.target/aarch64/cmpbr.c: N
Yes, Steve, I have it backward. I will fix it before commit.
On Wed, May 28, 2025, 10:15 AM Steve Kargl wrote:
> On Wed, May 28, 2025 at 08:11:05AM -0700, Jerry D wrote:
> > The attached patch is simple and self explanatory in the git log entry.
> >
> > Regression tested on X86_64-linux-gnu.
> >
The rules for conditional branches were spread throughout `aarch64.md`.
Group them together so it is easier to understand how `cbranch4`
is lowered to RTL.
gcc/ChangeLog:
* config/aarch64/aarch64.md (condjump): Move.
(*compare_condjump): Likewise.
(aarch64_cb1): Likewise.
The CB family of instructions does not support using the CS or CC
condition codes; instead the synonyms HS and LO must be used. GCC has
traditionally used the CS and CC names. To work around this while
avoiding test churn, add new `j` and `J` format specifiers and use them
when generating CB instru
Extract the hardcoded values for the minimum PC-relative displacements
into named constants and document them.
gcc/ChangeLog:
* config/aarch64/aarch64.md (BRANCH_LEN_P_128MiB): New constant.
(BRANCH_LEN_N_128MiB): Likewise.
(BRANCH_LEN_P_1MiB): Likewise.
(BRANCH_LE
Move the rules for CBZ/TBZ to be above the rules for
CBB/CBH/CB. We want them to have higher priority
because they can express larger displacements.
gcc/ChangeLog:
* config/aarch64/aarch64.md (aarch64_cbz1): Move
above rules for CBB/CBH/CB.
(*aarch64_tbz1): Likewise.
gcc/
Hi Jerry!
On 5/28/25 17:11, Jerry D wrote:
The attached patch is simple and self explanatory in the git log entry.
Regression tested on X86_64-linux-gnu.
OK for trunk?
This LGTM.
Thanks for the patch!
Harald
Regards,
Jerry
Give the `define_insn` rules used in lowering `cbranch4` to RTL
more descriptive and consistent names: from now on, each rule is named
after the AArch64 instruction that it generates. Also add comments to
document each rule.
gcc/ChangeLog:
* config/aarch64/aarch64.md (condjump): Rename to
Hi Tobias,
On 5/28/25 22:46, Tobias Burnus wrote:
Hi Harald,
Harald Anlauf wrote:
This breaks bootstrap here on openSUSE Leap 15.6 with mpfr-4.0.2:
../../gcc-trunk/gcc/fortran/simplify.cc: In function 'gfc_expr*
gfc_simplify_cospi(gfc_expr*)':
../../gcc-trunk/gcc/fortran/simplify.cc:2305:3:
Hi,
since uses of addss for purposes other than modelling FP addition/subtraction
should be gone now, this patch sets the addss cost back to 2.
Bootstrapped/regtested x86_64-linux, committed.
gcc/ChangeLog:
PR target/119298
* config/i386/x86-tune-costs.h (struct processor_costs): Set
Hi,
autofdo tests currently run only for x86. This patch makes them
run for aarch64 too. Verified that perf and create_gcov are running
as expected.
gcc/ChangeLog:
* config/aarch64/gcc-auto-profile: Make script executable.
gcc/testsuite/ChangeLog:
* lib/target-supports.exp: Enab
On Fri, May 23, 2025 at 10:12 PM Andrew Pinski wrote:
>
> This improves copy prop for aggregates by working over statements that don't
> modify the access just like how it is done for copying zeros.
> To speed up things, we should only have one loop back on the vuse instead of
> doing it twice
Tested x86_64-pc-linux-gnu, applying to trunk.
-- 8< --
Typically "does this class have a trivial destructor" is the wrong question
to ask; we rather want "can I destroy this class trivially", thus the
std::is_trivially_destructible standard trait. Let's provide a builtin for
it, and complain ab
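To illustrate the distinction using only the standard trait (the new builtin is not shown, since its name does not appear in this excerpt): a class whose destructor is not user-provided but is deleted or inaccessible cannot be destroyed at all, so `std::is_trivially_destructible` reports false for it. The type names below are hypothetical.

```cpp
#include <type_traits>

struct Plain { int x; };                  // implicit destructor, usable
struct UserDtor { ~UserDtor() {} };       // user-provided destructor
struct Deleted { ~Deleted() = delete; };  // not user-provided, but unusable
class Private { ~Private() = default; };  // not user-provided, inaccessible

// The trait answers "can I destroy this class trivially?".
static_assert(std::is_trivially_destructible_v<Plain>);
static_assert(!std::is_trivially_destructible_v<UserDtor>);
static_assert(!std::is_trivially_destructible_v<Deleted>);
static_assert(!std::is_trivially_destructible_v<Private>);
```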
The patch looks good to me.
Thanks,
Eugene
From: Kugan Vivekanandarajah
Sent: Sunday, May 25, 2025 9:48 PM
To: Andrew Pinski
Cc: Richard Sandiford; Andi Kleen; gcc-patches@gcc.gnu.org; Eugene Rozenfeld
Subject: [EXTERNAL] Re: [AUTOFDO][AARCH64] Add support for profilebootstrap
> On 26
Looks good.
Eugene
-----Original Message-----
From: Kugan Vivekanandarajah
Sent: Wednesday, May 28, 2025 3:59 PM
To: gcc-patches@gcc.gnu.org
Cc: Jan Hubicka; Eugene Rozenfeld
Subject: [EXTERNAL] [PATCH] [AUTOFDO] Enable autofdo tests for aarch64
Hi,
autofdo tests are now running only for x8
On Wed, May 21, 2025, at 8:59 PM, Pietro Monteiro wrote:
> Autoreconf -Wall complains about obsolete macros, so replace them according to
> the autoconf documentation[0].
>
> This patch doesn't fully fix all warnings because I focused on doing simple
> fixes and keeping the changes to the generated