[RFC][PATCH, aarch64] Implement 16-byte vector mode const0 store by TImode

2024-08-13 Thread HAO CHEN GUI
Hi, I submitted a patch to change the mode checking for CLEAR_BY_PIECES. https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660344.html It causes some regressions on aarch64. With the patch, V2x8QImode is used to do clear by pieces instead of TImode as vector mode is preferable and V2x8QImo

Re: [PATCH] [x86] Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

2024-08-13 Thread Uros Bizjak
On Wed, Aug 14, 2024 at 3:28 AM liuhongt wrote: > > It results in 2 failures for x86_64-pc-linux-gnu{\ > -march=cascadelake}; > > gcc: gcc.target/i386/extendditi3-1.c scan-assembler cqt?o > gcc: gcc.target/i386/pr113560.c scan-assembler-times \tmulq 1 > > For pr113560.c, now GCC generates mulx ins

RE: [PATCH v2] aarch64: Improve popcount for bytes [PR113042]

2024-08-13 Thread Andrew Pinski (QUIC)
> -Original Message- > From: Andrew Pinski (QUIC) > Sent: Monday, June 10, 2024 12:23 PM > To: gcc-patches@gcc.gnu.org > Cc: Andrew Pinski (QUIC) > Subject: [PATCH v2] aarch64: Improve popcount for bytes > [PR113042] > > For popcount for bytes, we don't need the reduction addition > afte

Re: v2.1 Draft for a lengthof paper

2024-08-13 Thread Jens Gustedt
Am 14. August 2024 01:27:33 MESZ schrieb Alejandro Colomar : > Hi Xavier, > > On Wed, Aug 14, 2024 at 12:38:53AM GMT, Xavier Del Campo Romero wrote: > > I have been overseeing these last emails - > > Ahhh, good to know; thanks! :) > > > thank you very much for your > > efforts, Alex! > > :-) >

[PATCHv3, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-08-13 Thread HAO CHEN GUI
Hi, This patch adds const0 move checking for CLEAR_BY_PIECES. The original vec_duplicate handles duplicates of non-constant inputs. But 0 is a constant. So even if a platform doesn't support vec_duplicate, it could still do clear by pieces if it supports const0 move by that mode. Compared to the

Re: [PATCH v5] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
On Wed, Aug 14, 2024 at 03:51:55AM +0100, Sam James wrote: > > LGTM. > > thanks, > sam It's committed, thanks. BR, Xianmiao

[PATCH] RISC-V: Fix factor in dwarf_poly_indeterminate_value [PR116305]

2024-08-13 Thread 曾治金
This patch is to fix the bug (BugId:116305) introduced by the commit bd93ef for risc-v target. The commit bd93ef changes the chunk_num from 1 to TARGET_MIN_VLEN/128 if TARGET_MIN_VLEN is larger than 128 in riscv_convert_vector_bits. So it changes the value of BYTES_PER_RISCV_VECTOR. For example, b

Re: v2.1 Draft for a lengthof paper

2024-08-13 Thread Jens Gustedt
Hi, Am 14. August 2024 00:38:53 MESZ schrieb Xavier Del Campo Romero : > I have been overseeing these last emails - thank you very much for your > efforts, Alex! I did not reply until now because I do not have prior > experience with gcc internals, so my feedback would probably have not > been t

Re: [PATCH] Fortran: fix minor frontend GMP leaks

2024-08-13 Thread Andre Vehreschild
Hi Harald, I had a hard time to figure why this is correct, when gfc_array_size() returned false, but now I get it. Ok to commit. - Andre On Tue, 13 Aug 2024 21:25:31 +0200 Harald Anlauf wrote: > Dear all, > > while running f951 under valgrind on testcase gfortran.dg/sizeof_6.f90 > I found two

Re: [PATCH] c++/coroutines: fix passing *this to promise type, again [PR116327]

2024-08-13 Thread Jason Merrill
On 8/13/24 7:52 PM, Patrick Palka wrote: On Tue, 13 Aug 2024, Jason Merrill wrote: On 8/12/24 10:01 PM, Patrick Palka wrote: Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14? -- >8 -- In r15-2210 we got rid of the unnecessary cast to lvalue reference when passing *this to the pr

RE: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-13 Thread Li, Pan2
> But you're shifting a REG, not a CONST_INT. I see, we can make a QImode REG to be moved to, and then zero_extend. Thanks Jeff for enlightening me, and will send v3 for this. Pan -Original Message- From: Jeff Law Sent: Wednesday, August 14, 2024 11:52 AM To: Li, Pan2 ; gcc-patches@gcc.

RE: [PATCH v2] RISC-V: Make sure high bits of usadd operands is clean for HI/QI [PR116278]

2024-08-13 Thread Li, Pan2
> How specifically is it avoided for SI? ISTM it should have the exact > same problem with a constant like 0x8000 in SImode on rv64 which is > going to be extended to 0x8000. HI and QI need some special handling for sum. For example, for HImode. 65535 + 2 = 65537, when compare

Re: [PATCH 1/3] Write CodeView information about local static variables

2024-08-13 Thread Jeff Law
On 8/12/24 6:24 PM, Mark Harmstone wrote: Outputs CodeView S_LDATA32 symbols, for static variables within functions, along with S_BLOCK32 and S_END for the beginning and end of lexical blocks. gcc/ * dwarf2codeview.cc (enum cv_sym_type): Add S_END and S_BLOCK32. (write_local_s

Re: [PATCH] Restrict pr116202-run-1.c test to riscv_v target

2024-08-13 Thread Jeff Law
On 8/12/24 2:31 PM, Mark Wielaard wrote: The testcase uses -march=rv64gcv and dg-do run, so should be restricted to a riscv_v target. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/pr116202-run-1.c (dg-do run): Add target riscv_v. OK jeff

Re: [PATCH] Fix maybe-uninitialized CodeView LF_INDEX warning

2024-08-13 Thread Jeff Law
On 8/12/24 4:45 PM, Mark Harmstone wrote: Initialize last_type to 0 to silence two spurious maybe-uninitialized warnings. We issue an LF_INDEX continuation subtype for any LF_FIELDLISTs that overflow, so LF_INDEXes will always have a subtype preceding them (and thus last_type will always be se

Re: [PATCH v2] RISC-V: Make sure high bits of usadd operands is clean for HI/QI [PR116278]

2024-08-13 Thread Jeff Law
On 8/12/24 8:09 PM, Li, Pan2 wrote: Isn't this wrong for SImode on rv64? It seems to me the right test is mode != word_mode? Assuming that works, it's OK for the trunk. Thanks Jeff, Simode version of test file doesn't have this issue. Thus, only HI and QI here. I will add a new test for SI

Re: [PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.

2024-08-13 Thread Jeff Law
On 8/12/24 10:12 AM, Xianmiao Qu wrote: The previous patch: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=d8a6945c6ea22efa4d5e42fe1922d2b27953c8cd aimed to eliminate redundant MOV instructions by removing calling emit_clobber in lower-subreg.cc's resolve_simple_move. First, I found that anoth

Re: [PING^4][PATCH v2] docs: Update function multiversioning documentation

2024-08-13 Thread Sandra Loosemore
On 8/13/24 11:18, Andrew Carlotti wrote: I'm still waiting for review for this patch. I've asked Richard Sandiford about it, and he'd like a docs maintainer to review the patch (so I've cc'd the rest of them now as well). I'm sorry, I've seen this patch go by before, but every time I've looked

Re: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-13 Thread Jeff Law
On 8/13/24 9:47 PM, Li, Pan2 wrote: +static rtx +riscv_gen_unsigned_xmode_reg (rtx x, machine_mode mode) +{ + if (!CONST_INT_P (x)) +return gen_lowpart (Xmode, x); + + rtx xmode_x = gen_reg_rtx (Xmode); + HOST_WIDE_INT cst = INTVAL (x); + + emit_move_insn (xmode_x, x); + + int xmode_b

RE: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-13 Thread Li, Pan2
>> +static rtx >> +riscv_gen_unsigned_xmode_reg (rtx x, machine_mode mode) >> +{ >> + if (!CONST_INT_P (x)) >> +return gen_lowpart (Xmode, x); >> + >> + rtx xmode_x = gen_reg_rtx (Xmode); >> + HOST_WIDE_INT cst = INTVAL (x); >> + >> + emit_move_insn (xmode_x, x); >> + >> + int xmode_bits =

Re: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-13 Thread Jeff Law
On 8/13/24 8:23 PM, Li, Pan2 wrote: This patch may require a rebase; will send v3 to resolve conflicts. Pan -Original Message- From: Li, Pan2 Sent: Sunday, August 4, 2024 7:48 PM To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rda

Re: [PATCH v5] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Sam James
Xianmiao Qu writes: > With the increase in the number of modes and patterns for some > backend architectures, the place_operands function becomes a > bottleneck in the speed of genoutput, and may even become a > bottleneck in the overall speed of building the GCC project. > This patch aims to a

RE: [PATCH v2] RISC-V: Support IMM for operand 0 of ussub pattern

2024-08-13 Thread Li, Pan2
This patch may require a rebase; will send v3 to resolve conflicts. Pan -Original Message- From: Li, Pan2 Sent: Sunday, August 4, 2024 7:48 PM To: gcc-patches@gcc.gnu.org Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; jeffreya...@gmail.com; rdapp@gmail.com; Li, Pan2 Subject: [PA

[PATCH v5] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
With the increase in the number of modes and patterns for some backend architectures, the place_operands function becomes a bottleneck in the speed of genoutput, and may even become a bottleneck in the overall speed of building the GCC project. This patch aims to accelerate the place_operands fun

Re: [PATCH v4] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
On Tue, Aug 13, 2024 at 06:57:40PM +0100, Sam James wrote: > > /* No instruction can have more operands than this. Sorry for this > > arbitrary limit, but what machine will have an instruction with > > @@ -112,6 +113,8 @@ static int next_operand_number = 1; > > struct operand_data > > { > >

[PATCH] PR target/89213 - Enhance V2DI/V2DF constant shifts

2024-08-13 Thread Michael Meissner
This patch fixes PR target/89213 to allow better code to be generated to do constant shifts of V2DI/V2DF vectors. Previously GCC would do constant shifts of vectors with 64-bit elements by using: XXSPLTIB 32,4 VEXTSB2D 0,0 VSRAD 2,2,0 I.e., the PowerPC does not have a VSP

Re: [PATCH/RFC] LRA: Don't emit move for substituted CONSTATNT_P operand [PR116170]

2024-08-13 Thread Kewen.Lin
on 2024/8/13 18:02, Richard Sandiford wrote: > "Kewen.Lin" writes: >> on 2024/8/12 21:02, Richard Sandiford wrote: >>> "Kewen.Lin" writes: Hi Richard, Thanks for the comments! on 2024/8/12 16:55, Richard Sandiford wrote: > "Kewen.Lin" writes: >> Hi, >> >>

[PATCH] aarch64: Improve vector constant generation using SVE INDEX instruction [PR113328]

2024-08-13 Thread Pengxuan Zheng
SVE's INDEX instruction can be used to populate vectors by values starting from "base" and incremented by "step" for each subsequent value. We can take advantage of it to generate vector constants if TARGET_SVE is available and the base and step values are within [-16, 15]. For example, with the f
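A minimal C sketch of the semantics described in this snippet, not the patch itself: element i of the generated vector is base + i * step, and the snippet states that both values must lie within [-16, 15]. The helper names `index_imm_ok` and `index_fill` are hypothetical and exist only for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if both values fall in the [-16, 15]
   range mentioned in the snippet.  */
static bool
index_imm_ok (int64_t base, int64_t step)
{
  return base >= -16 && base <= 15 && step >= -16 && step <= 15;
}

/* Hypothetical scalar model of INDEX: out[i] = base + i * step.  */
static void
index_fill (int32_t *out, size_t n, int32_t base, int32_t step)
{
  for (size_t i = 0; i < n; i++)
    out[i] = base + (int32_t) i * step;
}
```

Under this model, a constant vector such as { 1, 3, 5, 7 } corresponds to base 1 and step 2, and both values pass the range check.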

RE: [PATCH] i386: Fix some vex insns that prohibit egpr

2024-08-13 Thread Liu, Hongtao
> -Original Message- > From: Kong, Lingling > Sent: Wednesday, August 14, 2024 9:38 AM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; Jiang, Haochen > > Subject: [PATCH] i386: Fix some vex insns that prohibit egpr > > Although these vex insn have evex counterpart, but when it uses

[PATCH] i386: Fix some vex insns that prohibit egpr

2024-08-13 Thread Kong, Lingling
Although these vex insns have evex counterparts, they should not support APX EGPR when they use the displayed vex prefix, as with TARGET_AVXVNNI, TARGET_IFMA and TARGET_AVXNECONVERT. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ok for trunk? gcc/ChangeLog: * config/i386/sse.md (vp

Re: [PATCH 4/4] i386: Optimization for APX NDD is always zero-uppered for shift

2024-08-13 Thread Hongtao Liu
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote: > > gcc/ChangeLog: > > > PR target/113729 > >* config/i386/i386.md (*ashlqi3_1_zext): > >New define_insn. > >(*ashlhi3_1_zext): Ditto. > >(*qi3_1_zext): Ditto. > >

Re: [PATCH 3/4] i386: Optimization for APX NDD is always zero-uppered for logic

2024-08-13 Thread Hongtao Liu
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote: > > gcc/ChangeLog: > > >PR target/113729 > >* config/i386/i386.md (*andqi_1_zext): > >New define_insn. > >(*andhi_1_zext): Ditto. > >(*qi_1_zext): Ditto. > >

Re: [PATCH 2/4] i386: Optimization for APX NDD is always zero-uppered for sub/adc/sbb

2024-08-13 Thread Hongtao Liu
On Mon, Aug 12, 2024 at 3:12 PM kong lingling wrote: > > gcc/ChangeLog: > > > >PR target/113729 > >* config/i386/i386.md (*subqi_1_zext): New > >define_insn. > >(*subhi_1_zext): Ditto. > >(*addqi3_carry_zext): Ditto. >

Re: [PATCH 1/4] i386: Optimization for APX NDD is always zero-uppered for ADD

2024-08-13 Thread Hongtao Liu
On Mon, Aug 12, 2024 at 3:10 PM kong lingling wrote: > > For APX instruction with an NDD, the destination GPR will get the > instruction’s result in bits [OSIZE-1:0] and, if OSIZE < 64b, have its upper > bits [63:OSIZE] zeroed. Now supporting other NDD instructions. > > > Bootstrapped and regtes
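The zero-upper behaviour quoted in this snippet can be modelled in plain C. This is a sketch under stated assumptions, not code from the patch: `ndd_dest` and `add8_zext` are hypothetical names, and the model only restates that the result lands in bits [OSIZE-1:0] while bits [63:OSIZE] are zeroed.

```c
#include <stdint.h>

/* Hypothetical model of a 64-bit destination GPR after an NDD
   instruction with operand size OSIZE (8, 16, 32 or 64 bits): the
   result occupies bits [OSIZE-1:0] and bits [63:OSIZE] are zero.  */
static uint64_t
ndd_dest (uint64_t result, unsigned osize)
{
  if (osize >= 64)
    return result;
  return result & ((UINT64_C (1) << osize) - 1);
}

/* An 8-bit add whose destination is then read as 64 bits.  In this
   model no separate zero-extension step is needed afterwards.  */
static uint64_t
add8_zext (uint8_t a, uint8_t b)
{
  return ndd_dest ((uint64_t) a + b, 8);
}
```

This is why a narrow operation followed by a zero-extend can in principle be matched as a single instruction pattern.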

RE: [PATCH 1/4] i386: Optimization for APX NDD is always zero-uppered for ADD

2024-08-13 Thread Kong, Lingling
Hi, Gently ping. Thanks, Lingling From: kong lingling Sent: Monday, August 12, 2024 3:10 PM To: gcc-patches@gcc.gnu.org Cc: H. J. Lu ; Kong, Lingling ; Liu, Hongtao Subject: [PATCH 1/4] i386: Optimization for APX NDD is always zero-uppered for ADD For APX instruction with an NDD, the destin

[PATCH] [x86] Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

2024-08-13 Thread liuhongt
It results in 2 failures for x86_64-pc-linux-gnu{\ -march=cascadelake}; gcc: gcc.target/i386/extendditi3-1.c scan-assembler cqt?o gcc: gcc.target/i386/pr113560.c scan-assembler-times \tmulq 1 For pr113560.c, now GCC generates mulx instead of mulq with -march=cascadelake, which should be optimal,

Re: [PATCH] Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

2024-08-13 Thread Hongtao Liu
On Mon, Aug 12, 2024 at 10:10 PM liuhongt wrote: > > > Are there any assumptions that BB_HEAD must be a note or label? > > Maybe we should move ix86_align_loops into a separate pass and insert > > the pass just before pass_final. > The patch inserts .p2align after endbr pass, it can also fix the i

RE: [PATCH V2 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-08-13 Thread Liu, Hongtao
> -Original Message- > From: Victor Do Nascimento > Sent: Tuesday, August 13, 2024 8:42 PM > To: gcc-patches@gcc.gnu.org > Cc: tamar.christ...@arm.com; claz...@gmail.com; Liu, Hongtao > ; s...@gcc.gnu.org; bernds_...@t-online.de; > al...@redhat.com; Victor Do Nascimento > Subject: [PAT

Re: [PATCH] c++/coroutines: fix passing *this to promise type, again [PR116327]

2024-08-13 Thread Patrick Palka
On Tue, 13 Aug 2024, Jason Merrill wrote: > On 8/12/24 10:01 PM, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14? > > > > -- >8 -- > > > > In r15-2210 we got rid of the unnecessary cast to lvalue reference when > > passing *this to the promise type ctor, an

Re: v2.1 Draft for a lengthof paper

2024-08-13 Thread Alejandro Colomar
Hi Xavier, On Wed, Aug 14, 2024 at 12:38:53AM GMT, Xavier Del Campo Romero wrote: > I have been overseeing these last emails - Ahhh, good to know; thanks! :) > thank you very much for your > efforts, Alex! :-) > I did not reply until now because I do not have prior > experience with gcc inter

Re: v2.1 Draft for a lengthof paper

2024-08-13 Thread Xavier Del Campo Romero
I have been overseeing these last emails - thank you very much for your efforts, Alex! I did not reply until now because I do not have prior experience with gcc internals, so my feedback would probably have not been that useful. Those emails from 2020 were in fact discussing two completely differe

Re: Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Qing Zhao
> On Aug 13, 2024, at 17:05, Dimitar Dimitrov wrote: > > On Tue, Aug 13, 2024 at 07:34:09PM +0200, Hans-Peter Nilsson wrote: >>> From: Sam James >>> Date: Tue, 13 Aug 2024 18:17:29 +0100 >> >>> Hans-Peter Nilsson writes: >>> I stumbled on this being a regression for cris-elf as well;

Re: [PATCH] c++/coroutines: fix passing *this to promise type, again [PR116327]

2024-08-13 Thread Jason Merrill
On 8/12/24 10:01 PM, Patrick Palka wrote: Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14? -- >8 -- In r15-2210 we got rid of the unnecessary cast to lvalue reference when passing *this to the promise type ctor, and as a drive-by change we also simplified the code to use cp_build_

Re: [PATCH v2] c++: ICE with NSDMIs and fn arguments [PR116015]

2024-08-13 Thread Jason Merrill
On 8/12/24 7:21 PM, Marek Polacek wrote: On Fri, Aug 09, 2024 at 05:15:05PM -0400, Jason Merrill wrote: On 8/9/24 4:21 PM, Marek Polacek wrote: On Fri, Aug 09, 2024 at 12:58:34PM -0400, Jason Merrill wrote: On 8/8/24 1:37 PM, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu,

Re: Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Dimitar Dimitrov
On Tue, Aug 13, 2024 at 07:34:09PM +0200, Hans-Peter Nilsson wrote: > > From: Sam James > > Date: Tue, 13 Aug 2024 18:17:29 +0100 > > > Hans-Peter Nilsson writes: > > > > > I stumbled on this being a regression for cris-elf as well; > > > the patch expectedly fixes the test-case for CRIS as wel

Re: [committed][rtl-optimization/116244] Don't create bogus regs in alter_subreg

2024-08-13 Thread Jeff Law
On 8/13/24 3:19 AM, Richard Sandiford wrote: And the inconsistency was driving me bananas as my mental model is that (reg:DI N) covers N and N+1 and all that changes in the order based on endianness. ie, if we have (set (reg:DI 0) (...)) that changes d0/d1. But maybe that's just 20 year

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread Jakub Jelinek
On Tue, Aug 13, 2024 at 12:40:35PM -0700, H.J. Lu wrote: > > --- /dev/null > > +++ b/gcc/testsuite/gcc.target/i386/pr116353.c > > Does this test contain x86 specific code? Guess int32plus dependent at least (on 16-bit int *in++ << 16 would be out of bounds shift). Jakub

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread Sam James
"H.J. Lu" writes: > On Tue, Aug 13, 2024 at 4:57 AM Manolis Tsamis > wrote: >> >> Now that more operations are allowed for noce_convert_multiple_sets, we need >> to >> check noce_can_force_operand on the sequence before calling >> try_emit_cmove_seq. >> Otherwise an inappropriate argument may

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread Philipp Tomsich
Applied to master, thanks. --Philipp. On Tue, 13 Aug 2024 at 21:48, Jeff Law wrote: > > > On 8/13/24 5:57 AM, Manolis Tsamis wrote: > > Now that more operations are allowed for noce_convert_multiple_sets, we > need to > > check noce_can_force_operand on the sequence before calling > try_emit_cmo

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread Jeff Law
On 8/13/24 5:57 AM, Manolis Tsamis wrote: Now that more operations are allowed for noce_convert_multiple_sets, we need to check noce_can_force_operand on the sequence before calling try_emit_cmove_seq. Otherwise an inappropriate argument may be given to copy_to_mode_reg and result in an ICE.

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread H.J. Lu
On Tue, Aug 13, 2024 at 4:57 AM Manolis Tsamis wrote: > > Now that more operations are allowed for noce_convert_multiple_sets, we need > to > check noce_can_force_operand on the sequence before calling > try_emit_cmove_seq. > Otherwise an inappropriate argument may be given to copy_to_mode_reg a

Re: [PATCH 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-08-13 Thread Jeff Law
On 8/11/24 3:00 PM, Robin Dapp wrote: When predicating a load we implicitly assume that the else value is zero. In order to formalize this this patch queries the target for its supported else operand and uses that for the maskload call. Subsequently, if the else operand is nonzero, a cond_exp

[PATCH] Fortran: fix minor frontend GMP leaks

2024-08-13 Thread Harald Anlauf
Dear all, while running f951 under valgrind on testcase gfortran.dg/sizeof_6.f90 I found two minor memleaks with GMP variables that were not cleared. Regtested on x86_64-pc-linux-gnu. I intend to commit to mainline soon unless there are comments. (And no, this does not address the recent interm

Re: [PATCH] Fortran: reject array constructor value of abstract type [PR114308]

2024-08-13 Thread Harald Anlauf
Pushed after an OK by Steve in the PR as r15-2902-g9988d7e004796ab531df7bcda45788a7aa9276d7 Am 13.08.24 um 19:25 schrieb Harald Anlauf: Dear all, the attached patch checks whether the declared type of an array constructor value is abstract, which is forbidden by the standard. Steve found the r

Re: Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Sam James
Hans-Peter Nilsson writes: >> From: Sam James >> Date: Tue, 13 Aug 2024 18:17:29 +0100 > >> Hans-Peter Nilsson writes: >> >> > I stumbled on this being a regression for cris-elf as well; >> > the patch expectedly fixes the test-case for CRIS as well. >> > It's been a week since the patch was p

Re: [PATCH v4] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Sam James
Xianmiao Qu writes: > With the increase in the number of modes and patterns for some > backend architectures, the place_operands function becomes a > bottleneck in the speed of genoutput, and may even become a > bottleneck in the overall speed of building the GCC project. > This patch aims to a

[PATCH v4] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
With the increase in the number of modes and patterns for some backend architectures, the place_operands function becomes a bottleneck in the speed of genoutput, and may even become a bottleneck in the overall speed of building the GCC project. This patch aims to accelerate the place_operands fun

Re: [PATCH v3] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Richard Sandiford
Xianmiao Qu writes: > With the increase in the number of modes and patterns for some > backend architectures, the place_operands function becomes a > bottleneck in the speed of genoutput, and may even become a > bottleneck in the overall speed of building the GCC project. > This patch aims to ac

[PING] [PATCH v2] Support if conversion for switches

2024-08-13 Thread Andi Kleen
Andi Kleen writes: I wanted to ping this patch. I believe Richard ok'ed most of it earlier but need an ok for the changes resulting from his review too (but they were mostly only test suite and comment fixes apart from some minor tweaks) -Andi > The gimple-if-to-switch pass converts if statemen

Re: Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Hans-Peter Nilsson
> From: Sam James > Date: Tue, 13 Aug 2024 18:17:29 +0100 > Hans-Peter Nilsson writes: > > > I stumbled on this being a regression for cris-elf as well; > > the patch expectedly fixes the test-case for CRIS as well. > > It's been a week since the patch was posted and as I see no > > replies, I'

[PATCH] Fortran: reject array constructor value of abstract type [PR114308]

2024-08-13 Thread Harald Anlauf
Dear all, the attached patch checks whether the declared type of an array constructor value is abstract, which is forbidden by the standard. Steve found the relevant constraint in F2023, but it exists already in F2018. Regtested on x86_64-pc-linux-gnu. OK for mainline? Thanks, Harald From 9988

Re: Ping^4 [PATCH-2v4] Value Range: Add range op for builtin isfinite

2024-08-13 Thread Vineet Gupta
Hi Hao Gui, Can you commit this soon - some of the arch patches might be waiting on this. Thx, -Vineet On 8/5/24 07:59, Jeff Law wrote: > On 7/21/24 8:10 PM, HAO CHEN GUI wrote: >> Hi, >>Gently ping it. >> https://gcc.gnu.org/pipermail/gcc-patches/2024-May/653094.html > OK. Sorry for the de

[PATCH v3] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
With the increase in the number of modes and patterns for some backend architectures, the place_operands function becomes a bottleneck in the speed of genoutput, and may even become a bottleneck in the overall speed of building the GCC project. This patch aims to accelerate the place_operands fun

[PING^4][PATCH v2] docs: Update function multiversioning documentation

2024-08-13 Thread Andrew Carlotti
I'm still waiting for review for this patch. I've asked Richard Sandiford about it, and he'd like a docs maintainer to review the patch (so I've cc'd the rest of them now as well). On Wed, Jul 10, 2024 at 01:09:41PM +0100, Andrew Carlotti wrote: > > On Mon, Jun 10, 2024 at 05:08:21PM +0100, Andr

Re: Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Sam James
Hans-Peter Nilsson writes: > I stumbled on this being a regression for cris-elf as well; > the patch expectedly fixes the test-case for CRIS as well. > It's been a week since the patch was posted and as I see no > replies, I'm pinging this in behalf of Dimitar. I can't formally approve it but I

Re: [PATCH v2] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Sam James
Xianmiao Qu writes: > On Wed, Aug 14, 2024 at 01:01:35AM +0800, Xianmiao Qu wrote: >> static void scan_operands (class data *, rtx, int, int); >> -static int compare_operands (struct operand_data *, >> - struct operand_data *); >> static void place_operands (class data *

Re: [PATCH v2] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
On Wed, Aug 14, 2024 at 01:01:35AM +0800, Xianmiao Qu wrote: > static void scan_operands (class data *, rtx, int, int); > -static int compare_operands (struct operand_data *, > - struct operand_data *); > static void place_operands (class data *); Oh, there is a mistake

Re: rtl: Enable the use of rtx values with int and mode attributes

2024-08-13 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > Hi, > > The 'code' part of a 'define_code_attr' refers to the type of the key, > in other words, it uses a code_iterator to pick the value from their > (key "value") pair list. > Though it seems rtx_alloc_for_name requires a code_attribute to be used > when the

[PATCH v2] genoutput: Accelerate the place_operands function.

2024-08-13 Thread Xianmiao Qu
With the increase in the number of modes and patterns for some backend architectures, the place_operands function becomes a bottleneck in the speed of genoutput, and may even become a bottleneck in the overall speed of building the GCC project. This patch aims to accelerate the place_operands fun

Re: [PATCH, gfortran] libgfortran: implement fpu-macppc for Darwin, support IEEE arithmetic

2024-08-13 Thread FX Coudert
Hi, > I dropped a change to the test file, since you have fixed it appropriately, > and switched to Apple libm convention for flags, as you have suggested. > Please let me know if I should do anything further to improve it and make it > acceptable for a merge. The patch itself is OK. Please add

Re: [PATCH] ifcvt: Fix force_operand ICE due to noce_convert_multiple_sets [PR116353]

2024-08-13 Thread Sam James
Manolis Tsamis writes: > Now that more operations are allowed for noce_convert_multiple_sets, we need > to > check noce_can_force_operand on the sequence before calling > try_emit_cmove_seq. > Otherwise an inappropriate argument may be given to copy_to_mode_reg and > result > in an ICE. > > Fi

rtl: Enable the use of rtx values with int and mode attributes

2024-08-13 Thread Andre Vieira (lists)
Hi, The 'code' part of a 'define_code_attr' refers to the type of the key, in other words, it uses a code_iterator to pick the value from their (key "value") pair list. Though it seems rtx_alloc_for_name requires a code_attribute to be used when the 'value' needs to be a type. In other words,

Re: [nvptx] Pass -m32/-m64 to host_compiler if it has multilib support

2024-08-13 Thread Richard Biener
> Am 13.08.2024 um 17:48 schrieb Thomas Schwinge : > > Hi Prathamesh! > > On 2024-08-12T07:50:07+, Prathamesh Kulkarni > wrote: >>> From: Thomas Schwinge >>> Sent: Friday, August 9, 2024 12:55 AM > >>> On 2024-08-08T06:46:25-0700, Andrew Pinski wrote: On Thu, Aug 8, 2024 at 6:11

Re: [Fortran, Patch, PR102973, v1] Reset flag for parsing proc_ptrs in associate in error case

2024-08-13 Thread Harald Anlauf
Hi Andre, Am 13.08.24 um 15:15 schrieb Andre Vehreschild: Hi all, attached patch is the last one the meta-bug 87477 ASSOCIATE depends on. The resolution was already given in the PR, so I just beautified it and made patch for it. I tried to come up with a testcase as well as Harald has, but had

Re: [PATCH] testsuite: Avoid running neon test on Cortex-M55

2024-08-13 Thread Andre Vieira (lists)
I'm not a maintainer but I'd argue the entire test is bogus. The error reporting in this area seems to be somewhat fragile, if you compile it with '-march=armv7-a -mfloat-abi=soft', you also don't get the error this is testing for.  I'd argue this kind of user friendly error message should jus

Re: [PATCH v1 4/4] dwarf2: store the RA state in CFI row

2024-08-13 Thread Richard Sandiford
Matthieu Longo writes: > On AArch64, the RA state informs the unwinder whether the return address > is mangled and how, or not. This information is encoded in a boolean in > the CFI row. This binary approach prevents expressing more complex > configurations, as is the case with PAuth_LR int

Re: [PATCH v1 3/4] aarch64 testsuite: explain expectections for pr94515* tests

2024-08-13 Thread Richard Sandiford
Matthieu Longo writes: > gcc/testsuite/ChangeLog: > > * g++.target/aarch64/pr94515-1.C: Improve test documentation. > * g++.target/aarch64/pr94515-2.C: Same. The patch is OK as-is, since it's clearly a strict improvement over the status quo. But a suggestion below: > --- > gcc/

RE: [nvptx] Pass -m32/-m64 to host_compiler if it has multilib support

2024-08-13 Thread Thomas Schwinge
Hi Prathamesh! On 2024-08-12T07:50:07+, Prathamesh Kulkarni wrote: >> From: Thomas Schwinge >> Sent: Friday, August 9, 2024 12:55 AM >> On 2024-08-08T06:46:25-0700, Andrew Pinski wrote: >> > On Thu, Aug 8, 2024 at 6:11 AM Prathamesh Kulkarni >> > wrote: >> >> After differing NUM_POLY_INT_

[PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-13 Thread Qing Zhao
With the addition of the 'counted_by' attribute and its wide roll-out within the Linux kernel, a use case has been found that would be very nice to have for object allocators: being able to set the counted_by counter variable without knowing its name. For example, given: struct foo { ...

Re: [PATCH v1 1/4] Rename REG_CFA_TOGGLE_RA_MANGLE to REG_CFA_NEGATE_RA_STATE

2024-08-13 Thread Richard Sandiford
Matthieu Longo writes: > The current name REG_CFA_TOGGLE_RA_MANGLE is not representative of what > it really is, i.e. a register to represent several states, not only a > binary one. Same for dwarf2out_frame_debug_cfa_toggle_ra_mangle. > > gcc/ChangeLog: > > * combine-stack-adj.cc >

Re: [PATCH v1 2/4] dwarf2: add hooks for architecture-specific CFIs

2024-08-13 Thread Richard Sandiford
Matthieu Longo writes: > Architecture-specific CFI directives are currently declared and processed > among other architecture-independent CFI directives in gcc/dwarf2* files. > This approach creates confusion, specifically in the case of DWARF > instructions in the vendor space and using the same

Ping: [PATCH] testsuite: Fix struct size check [PR116155]

2024-08-13 Thread Hans-Peter Nilsson
I stumbled on this being a regression for cris-elf as well; the patch expectedly fixes the test-case for CRIS as well. It's been a week since the patch was posted and as I see no replies, I'm pinging this on behalf of Dimitar. > From: Dimitar Dimitrov > Date: Mon, 5 Aug 2024 21:29:35 +0300 > Th

v2.1 Draft for a lengthof paper

2024-08-13 Thread Alejandro Colomar
Hi, On Tue, Aug 13, 2024 at 01:34:58AM GMT, Alejandro Colomar wrote: > I want to send an updated version of n2529. The original author didn't > respond to my mail, so I'll take over. I've been preparing a GCC patch > set for adding the feature to GCC, and have informed Clang developers > about i

Re: [PATCH 05/15] arm: [MVE intrinsics] add vcvt shape

2024-08-13 Thread Richard Sandiford
"Andre Vieira (lists)" writes: > On 11/07/2024 22:42, Christophe Lyon wrote: >> + bool >> + check (function_checker &c) const override >> + { >> +if (c.mode_suffix_id == MODE_none) >> + return true; >> + >> +unsigned int bits = c.type_suffix (0).element_bits; >> +return c.requi

[Committed 2/3] RISC-V: Fix non-obvious comment typos

2024-08-13 Thread Patrick O'Neill
On 8/5/24 15:29, Patrick O'Neill wrote: This fixes the remainder of the typos I found when reading various parts of the RISC-V backend. gcc/ChangeLog: * config/riscv/riscv-v.cc (legitimize_move): extrac -> extract. (expand_vec_cmp_float): Remove duplicate vmnor.mm. * c

Re: [PATCH] c++/coroutines: fix passing *this to promise type, again [PR116327]

2024-08-13 Thread Iain Sandoe
Hi Patrick > On 13 Aug 2024, at 03:01, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk/14? > > -- >8 -- > > In r15-2210 we got rid of the unnecessary cast to lvalue reference when > passing *this to the promise type ctor, and as a drive-by change we also > s

[PATCH] PR tree-optimization/101390: Vectorize modulo operator

2024-08-13 Thread Jennifer Schmitz
This patch adds a new vectorization pattern that detects the modulo operation where the second operand is a variable. It replaces the statement by division, multiplication, and subtraction. The patch was bootstrapped and regtested on aarch64-linux-gnu, no regression. Ok for mainline? Signed-off-b
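The rewrite described above relies on the identity a % b == a - (a / b) * b, which holds in C for any non-zero b because division truncates toward zero. The following is a minimal scalar sketch of that identity, not the vectorizer pattern from the patch; the function names `mod_via_div` and `mod_loop` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Compute a % b using only division, multiplication and subtraction.
   The caller must ensure b != 0 (and avoid INT32_MIN / -1, which is
   undefined for '/' and '%' alike). */
int32_t mod_via_div(int32_t a, int32_t b)
{
    assert(b != 0);
    int32_t q = a / b;      /* quotient, truncated toward zero */
    return a - q * b;       /* remainder has the sign of a */
}

/* Element-wise modulo with a variable second operand, written in the
   lowered form: each iteration is a divide, a multiply and a subtract. */
void mod_loop(const int32_t *a, const int32_t *b, int32_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = mod_via_div(a[i], b[i]);
}
```

The lowered form is useful when a target provides vector division, multiplication and subtraction but no vector remainder instruction.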

[Fortran, Patch, PR102973, v1] Reset flag for parsing proc_ptrs in associate in error case

2024-08-13 Thread Andre Vehreschild
Hi all, the attached patch is the last one the meta-bug 87477 ASSOCIATE depends on. The resolution was already given in the PR, so I just beautified it and made a patch for it. I tried to come up with a testcase as well as Harald has, but had no luck with it. I see less harm in resetting the flag in the

[PATCH V2 10/10] autovectorizer: Test autovectorization of different dot-prod modes.

2024-08-13 Thread Victor Do Nascimento
From: Victor Do Nascimento Given the novel treatment of the dot product optab as a conversion, we are now able to target different relationships between output modes and input modes. This is made clearer by way of example. Previously, on AArch64, the following loop was vectorizable: uint32_t udo

Re: [PATCH, gfortran] libgfortran: implement fpu-macppc for Darwin, support IEEE arithmetic

2024-08-13 Thread Sergey Fedorov
On Mon, Aug 5, 2024 at 6:25 PM Sergey Fedorov wrote: > > On Thu, Jul 25, 2024 at 4:47 PM FX Coudert wrote: > >> Can you post an updated version of the patch, following the first round >> of review? >> >> FX > > If you got some time, please review this. I will likely be away from my PowerPC har

[PATCH V2 07/10] mips: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/mips/loongson-mmi.md (sdot_prodv4hi): Renamed to... (sdot_prodv2siv4hi): ...this. --

[PATCH V2 08/10] rs6000: Adjust altivec dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/rs6000/altivec.md (udot_prod): Renamed to... (udot_prodv4si): ...this. (sdot

[PATCH V2 05/10] i386: Fix dot_prod backend patterns for mmx and sse targets

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/i386/mmx.md (usdot_prodv8qi): Renamed to... (usdot_prodv2siv8qi): ...this. (

[PATCH V2 00/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-13 Thread Victor Do Nascimento
Changes in this revision: * Remove features that were classified as feature creep (Gimple folding and rewriting the aarch64/arm dotprod builtin initialization routines). These will be submitted separately later. * Add missing second mode to arm-backend pattern missed in original. * Add implementation f

[PATCH V2 04/10] arm: Fix arm backend-use of (u|s|us)dot_prod patterns

2024-08-13 Thread Victor Do Nascimento
gcc/ChangeLog: * config/arm/arm-builtins.cc (enum arm_builtins): Add new ARM_BUILTIN_* enum values: SDOTV8QI, SDOTV16QI, UDOTV8QI, UDOTV16QI, USDOTV8QI, USDOTV16QI. (arm_init_dotprod_builtins): New. (arm_init_builtins): Add call to `arm_init_dotprod_builtins

[PATCH V2 06/10] arc: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/arc/simdext.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. (u

[PATCH V2 09/10] c6x: Adjust dot-product backend patterns

2024-08-13 Thread Victor Do Nascimento
Following the migration of the dot_prod optab from a direct to a conversion-type optab, ensure all back-end patterns incorporate the second machine mode into pattern names. gcc/ChangeLog: * config/c6x/c6x.md (sdot_prodv2hi): Renamed to... (sdot_prodsiv2hi): ...this. --- gcc/confi

[PATCH V2 03/10] aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns

2024-08-13 Thread Victor Do Nascimento
Given recent changes to the dot_prod standard pattern name, this patch fixes the aarch64 back-end by implementing the following changes: 1. Add 2nd mode to all (u|s|us)dot_prod patterns in .md files. 2. Rewrite initialization and function expansion mechanism for simd builtins. 3. Fix all direct ca

[PATCH V2 01/10] optabs: Make all `*dot_prod_optab's modeled as conversions

2024-08-13 Thread Victor Do Nascimento
Given the specification in the GCC internals manual defines the {u|s}dot_prod standard name as taking "two signed elements of the same mode, adding them to a third operand of wider mode", there is currently ambiguity in the relationship between the mode of the first two arguments and that of the th
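The quoted definition can be read as the following scalar operation: narrow elements are multiplied pairwise and the products are added to an accumulator of a wider type. This is a minimal sketch of that meaning only, assuming 8-bit signed inputs and a 32-bit accumulator as one possible pairing; the function name `sdot_accumulate` is hypothetical and the code does not model any particular vector mode or back-end pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Signed dot product with accumulation: each pair of narrow (8-bit)
   elements is widened, multiplied, and added to the wider (32-bit)
   accumulator passed in as 'acc'. */
int32_t sdot_accumulate(const int8_t *a, const int8_t *b, size_t n, int32_t acc)
{
    assert(a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)a[i] * (int32_t)b[i];
    return acc;
}
```

Because the input and accumulator types differ, a name that records only one of them leaves the other implicit, which is the ambiguity the message points at.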
