[PATCH] i386: Enable V2BF/V4BF vec_cmp with AVX10.2 vcmppbf16

2024-09-09 Thread Levy Hsu
gcc/ChangeLog: * config/i386/i386.cc (ix86_get_mask_mode): Enable BFmode for targetm.vectorize.get_mask_mode with AVX10.2. * config/i386/mmx.md (vec_cmpqi): Implement vec_cmpv2bfqi and vec_cmpv4bfqi. gcc/testsuite/ChangeLog: * gcc.target/i386/part-vect-vec

[PATCH v2 2/2] RISC-V: Fix ICE due to inconsistency of RVV intrinsic list in lto and cc1.

2024-09-09 Thread Jin Ma
When we use flto, the function list of rvv will be generated twice, once in the cc1 phase and once in the lto phase. However, due to the different generation methods, the two lists are different. For example, when there is no zvfh or zvfhmin in arch, it is generated by calling function "riscv_prag

[PATCH v2 1/2] RISC-V: Fix ICE caused by early ggc_free on DECL for RVV intrinsics in LTO.

2024-09-09 Thread Jin Ma
gcc/ChangeLog: * config/riscv/riscv-vector-builtins.cc (function_builder::add_function): Check the final DECL to make sure it is valid. gcc/testsuite/ChangeLog: * gcc.target/riscv/rvv/base/bug-10.c: New test. --- gcc/config/riscv/riscv-vector-builtins.cc | 9

RFC model schedule tweak (was Re: sched1 pathology on RISC-V : PR/114729)

2024-09-09 Thread Vineet Gupta
On 8/27/24 18:10, Vineet Gupta wrote: > On 8/7/24 10:47, Richard Sandiford wrote: >> is probably not appropriate. We should probably just use the baseECC, >> as suggested by the first sentence in the comment. It looks like the hack: >> >> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc >> in

Re: [PATCH 2/2] phiopt: Move the common code between pass_phiopt and pass_cselim into a seperate function

2024-09-09 Thread Richard Biener
> On 10.09.2024 at 05:41, Andrew Pinski wrote: > > When r14-303-gb9fedabe381cce was done, it was missed that some of the common > parts could > be done in a template and a lambda could be used. This patch implements that. > This new > function can be used later on to implement a simple ifc

Re: [PATCH 1/2] phiopt: Use gimple_phi_result rather than PHI_RESULT [PR116643]

2024-09-09 Thread Richard Biener
> On 10.09.2024 at 05:41, Andrew Pinski wrote: > > This converts the uses of PHI_RESULT in phiopt to be gimple_phi_result > instead. Since there was already a mismatch of uses here, it > would be good to use preferred one (gimple_phi_result) instead. > > Bootstrapped and tested on x86_64-lin

[PATCH RFA] libstdc++: fix C header include guards

2024-09-09 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, OK for trunk? -- 8< -- Ever since the c_global and c_compatibility directories were added in r122533, the include guards have been oddly late in the files, with no comment about why that might be either in the commit message or the files themselves. I don't see any ju

[PATCH 1/2] phiopt: Use gimple_phi_result rather than PHI_RESULT [PR116643]

2024-09-09 Thread Andrew Pinski
This converts the uses of PHI_RESULT in phiopt to be gimple_phi_result instead. Since there was already a mismatch of uses here, it would be good to use preferred one (gimple_phi_result) instead. Bootstrapped and tested on x86_64-linux-gnu. PR tree-optimization/116643 gcc/ChangeLog:

[PATCH 2/2] phiopt: Move the common code between pass_phiopt and pass_cselim into a seperate function

2024-09-09 Thread Andrew Pinski
When r14-303-gb9fedabe381cce was done, it was missed that some of the common parts could be done in a template and a lambda could be used. This patch implements that. This new function can be used later on to implement a simple ifcvt pass. gcc/ChangeLog: * tree-ssa-phiopt.cc (execute_ov

Re: [PING^4] [PATCH] Add a bootstrap-native build config

2024-09-09 Thread Ramana Radhakrishnan
> On 9 Sep 2024, at 10:34 PM, Andi Kleen wrote: > > External email: Use caution opening links or attachments > > > Andi Kleen writes: > > Ping^4 > > Could someone please approve this (nearly trivial) patch? > > Thanks, > -Andi > >> Andi Kleen writes: >> >> Ping^3 >> >>> Andi Kleen wr

[pushed: r15-3556] diagnostics: introduce struct diagnostic_option_id

2024-09-09 Thread David Malcolm
Use a new struct diagnostic_option_id rather than just "int" when referring to command-line options controlling warnings in the diagnostic subsystem. No functional change intended, but better documents the meaning of the code. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed

[pushed: r15-3555] diagnostics: replace option_hooks with a diagnostic_option_manager class

2024-09-09 Thread David Malcolm
Introduce a diagnostic_option_manager class to help isolate the diagnostics subsystem from GCC's option handling. No functional change intended. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r15-3555-ga97448e92eb76a. gcc/ChangeLog: * diagnostic.cc (dia

[pushed: r15-3554] diagnostics: rename dc.printer to m_printer [PR116613]

2024-09-09 Thread David Malcolm
Rename diagnostic_context's "printer" field to "m_printer", for consistency with other fields, and to highlight places where we currently use this, to help assess feasibility of supporting multiple output sinks (PR other/116613). No functional change intended. Successfully bootstrapped & regrtest

[pushed: r15-3553] SARIF output: fix schema URL [§3.13.3, PR116603]

2024-09-09 Thread David Malcolm
We were using https://raw.githubusercontent.com/oasis-tcs/sarif-spec/master/Schemata/sarif-schema-2.1.0.json as the URL for the SARIF 2.1 schema, but this is now a 404. Update it to the URL listed in the spec (§3.13.3 "$schema property"), which is: https://docs.oasis-open.org/sarif/sarif/v2.

RE: [PATCH v1] Match: Support form 2 for scalar signed integer .SAT_ADD

2024-09-09 Thread Li, Pan2
Thanks Richard for comments. >> + The T and UT are type pair like T=int8_t, UT=uint8_t. */ >> +(match (signed_integer_sat_add @0 @1) >> + (cond^ (ge (bit_and:c (bit_xor:c @0 (nop_convert@2 (plus (nop_convert @0) >> + (nop_convert @1 >>

RE: [PATCH v2 1/2] Genmatch: Support control flow graph case 1 for phi on condition

2024-09-09 Thread Li, Pan2
Thanks Richard for comments. > Sorry to spoil this again, but can you instead create an interface like Need mind, let me update it. > gcond * > match_cond_with_phi (gphi *phi, tree *true_arg, tree *false_arg); > That would from a PHI node match up the controlling condition and > initialize {tru

RE: [PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-09 Thread Li, Pan2
> Any comments on this patch? I may need some time to go through all details (PS: Sorry I cannot approve patches, leave it to juzhe or kito). Thanks a lot for fixing this. Pan -Original Message- From: Jin Ma Sent: Monday, September 9, 2024 6:30 PM To: Li, Pan2 ; gcc-patches@gcc.gnu.or

Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-09 Thread Jakub Jelinek
On Mon, Sep 09, 2024 at 02:55:25PM -0600, Sandra Loosemore wrote: > On 9/9/24 05:01, Jakub Jelinek wrote: > > > > I think also testing the device={kind(any,any)} and device={kind("any",any)} > > and device={kind(any,"any"))} would be useful. > > Hmmm, it looks like GCC does not presently check fo

Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-09 Thread Sandra Loosemore
On 9/9/24 05:01, Jakub Jelinek wrote: I think also testing the device={kind(any,any)} and device={kind("any",any)} and device={kind(any,"any"))} would be useful. Hmmm, it looks like GCC does not presently check for the restriction "Each trait-property may only be specified once in a trait sel

Re: [PATCH] Make 'target-supports.exp' additions for nvptx target generally available

2024-09-09 Thread Mike Stump
Ok. Though, some of these files are so littered with target bits that essentially it doesn't make too much a difference. On Jul 18, 2024, at 4:44 AM, Thomas Schwinge wrote: > > OK to push (once testing completes) the attached > "Make 'target-supports.exp' additions for nvptx target generally a

[committed]: i386: Use offsettable address constraint for double-word memory operands

2024-09-09 Thread Uros Bizjak
Double-word memory operands are accessed as their high and low parts, so the memory location has to be offsettable. Use "o" constraint instead of "m" for double-word memory operands. gcc/ChangeLog: * config/i386/i386.md (*insvdi_lowpart_1): Use "o" constraint instead of "m" for double-wo

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Bill Wendling
On Sun, Sep 8, 2024 at 3:07 AM Martin Uecker wrote: > > On Sunday, 08.09.2024 at 02:09 -0700, Bill Wendling wrote: > > On Fri, Sep 6, 2024 at 10:50 PM Martin Uecker wrote: > > > > > > On Friday, 06.09.2024 at 13:59 -0700, Bill Wendling wrote: > > > > On Fri, Sep 6, 2024 at 12:32 PM Ma

[pushed: r15-3551] analyzer: fix "unused variable 'summary_cast_reg'" warning

2024-09-09 Thread David Malcolm
I missed this in r15-1108-g70f26314b62e2d. Successfully bootstrapped on x86_64-pc-linux-gnu. Pushed as r15-3551-g6e35b0e8572a71. gcc/analyzer/ChangeLog: * call-summary.cc (call_summary_replay::convert_region_from_summary_1): Drop unused local "summary_cast_reg" Signed-of

Re: [patch, fortran] Matmul and dot_product for unsigned

2024-09-09 Thread Thomas Koenig
On 09.09.24 at 20:01, Richard Biener wrote: But it will require some ugly m4 hackery... I'll take a look if I can make it work. > I meant you shouldn’t need new library entry points for unsigned > but simply call the signed ones (and switch the signed implementation > to use unsigned arithmet

Re: [patch, fortran] Matmul and dot_product for unsigned

2024-09-09 Thread Richard Biener
> On 09.09.2024 at 19:09, Thomas Koenig wrote: > > On 09.09.24 at 09:19, Richard Biener wrote: >> Is the library implementation in any way different from the signed >> one? Iff only >> multiplication and addition/subtraction are involved the unsigned >> implementation >> could implement b

Re: [patch, fortran] Matmul and dot_product for unsigned

2024-09-09 Thread Thomas Koenig
On 09.09.24 at 09:19, Richard Biener wrote: Is the library implementation in any way different from the signed one? Iff only multiplication and addition/subtraction are involved the unsigned implementation could implement both variants (the signed one would eventually cause undefinedness with r

[PING^4] [PATCH] Add a bootstrap-native build config

2024-09-09 Thread Andi Kleen
Andi Kleen writes: Ping^4 Could someone please approve this (nearly trivial) patch? Thanks, -Andi > Andi Kleen writes: > > Ping^3 > >> Andi Kleen writes: >> >> PING^2 for the patch. >> >> (not sure if there is any maintainer to cc here, this is generic build >> infrastructure) >> >>> Andi K

Re: [PATCH] c++: ICE with -Wtautological-compare in template [PR116534]

2024-09-09 Thread Marek Polacek
Ping. On Thu, Aug 29, 2024 at 12:23:35PM -0400, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14? > > -- >8 -- > Pre r14-4793, we'd call warn_tautological_cmp -> operand_equal_p > with operands wrapped in NON_DEPENDENT_EXPR, which works, since > o_e_p bails fo

Re: [PATCH] c++: mutable temps in rodata [PR116369]

2024-09-09 Thread Marek Polacek
Ping. On Thu, Aug 29, 2024 at 04:15:41PM -0400, Marek Polacek wrote: > Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk/14/13? > > -- >8 -- > Here we wrongly mark the reference temporary for g TREE_READONLY, > so it's put in .rodata and so we can't modify its subobject even > when the

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Qing Zhao
> On Sep 9, 2024, at 10:20, Jakub Jelinek wrote: > > On Mon, Sep 09, 2024 at 02:10:05PM +, Qing Zhao wrote: >> Okay, now after finishing reading all the discussion so far, I realized that >> we are back to the previous pointer approach: >> >> __builtin_get_counted_by (p->FAM) >> >> Works

RE: [nvptx] Pass -m32/-m64 to host_compiler if it has multilib support

2024-09-09 Thread Thomas Schwinge
Hi Prathamesh! On 2024-09-09T06:31:18+, Prathamesh Kulkarni wrote: >> -Original Message- >> From: Thomas Schwinge >> Sent: Friday, September 6, 2024 2:31 PM >> On 2024-08-16T15:36:29+, Prathamesh Kulkarni >> wrote: >> >> > Am 13.08.2024 um 17:48 schrieb Thomas Schwinge >> >> : >

[committed] hppa: Don't canonicalize operand order of scaled index addresses

2024-09-09 Thread John David Anglin
Tested on hppa64-hp-hpux11.11 and hppa-unknown-linux-gnu. Committed to trunk. Dave --- hppa: Don't canonicalize operand order of scaled index addresses pa_print_operand handles both operand orders for scaled index addresses, so it isn't necessary to canonicalize the order of operands. 2024-09-

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Jakub Jelinek
On Mon, Sep 09, 2024 at 02:10:05PM +, Qing Zhao wrote: > Okay, now after finishing reading all the discussion so far, I realized that > we are back to the previous pointer approach: > > __builtin_get_counted_by (p->FAM) > > Works as: > > If (has_counted_by (p->FAM)) > return &p->COUNT; >

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Qing Zhao
Okay, now after finishing reading all the discussion so far, I realized that we are back to the previous pointer approach: __builtin_get_counted_by (p->FAM) Works as: If (has_counted_by (p->FAM)) return &p->COUNT; else return (void *)0; Then the user will use it as: auto p = __builtin_get
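The pointer approach summarized above can be imitated in plain C without the builtin. The following is a minimal sketch only: `struct with_count`, `struct without_count`, `counted_by_ref_with`, `counted_by_ref_without`, and `set_count_if_tracked` are hypothetical stand-ins invented for illustration, and the real `__builtin_get_counted_by` would be resolved by the compiler from the `counted_by` attribute instead of being hand-written per type.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: `count` plays the role of the counted_by field,
   `fam` is the flexible array member. */
struct with_count {
    size_t count;
    int fam[];
};

/* Hypothetical type whose flexible array member has no counter. */
struct without_count {
    int tag;
    int fam[];
};

/* Hand-written stand-in for the "has counted_by" case:
   return the address of the counter. */
size_t *counted_by_ref_with(struct with_count *p)
{
    return &p->count;
}

/* Hand-written stand-in for the "no counted_by" case:
   return a null pointer. */
void *counted_by_ref_without(struct without_count *p)
{
    (void)p;
    return (void *)0;
}

/* Caller-side pattern: update the counter only when one exists. */
void set_count_if_tracked(size_t *ref, size_t n)
{
    if (ref)
        *ref = n;
}
```

The point of returning a pointer is that the same caller code works for both kinds of struct: the null check makes the counter update a no-op when no counter is attached.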

RE: Re-compute TYPE_MODE and DECL_MODE while streaming in for accelerator

2024-09-09 Thread Richard Biener
On Tue, 3 Sep 2024, Prathamesh Kulkarni wrote: > > > > -Original Message- > > From: Prathamesh Kulkarni > > Sent: Thursday, August 22, 2024 7:41 PM > > To: Richard Biener > > Cc: Richard Sandiford ; Thomas Schwinge > > ; gcc-patches@gcc.gnu.org > > Subject: RE: Re-compute TYPE_MODE and

Re: [Bug tree-optimization/109429] [PATCH] ivopts: fixed complexities

2024-09-09 Thread Richard Biener
On Wed, Sep 4, 2024 at 4:01 PM Aleksandar Rakic wrote: > > From 0130d3cb01fd9d5c1c997003245ed57bbdeb00a2 Mon Sep 17 00:00:00 2001 > From: Aleksandar > Date: Fri, 23 Aug 2024 11:36:50 +0200 > Subject: [PATCH] [Bug tree-optimization/109429] ivopts: fixed complexities > > This patch addresses a bug

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-09-09 Thread Qing Zhao
> On Sep 7, 2024, at 02:16, Martin Uecker wrote: > > On Saturday, 07.09.2024 at 00:12 +, Qing Zhao wrote: >> Now, if >> >> 1. __builtin_get_counted_by should return a LVALUE instead of a pointer >> (required by CLANG’s design) >> And >> 2. It’s better not to change the behavior of __b

Re: [PATCH] tree-optimization/116514 - handle pointer difference in bit-CCP

2024-09-09 Thread Richard Biener
On Wed, 28 Aug 2024, Richard Biener wrote: > When evaluating the difference of two aligned pointers in CCP we > fail to handle the EXACT_DIV_EXPR by the element size that occurs. > The testcase then also exercises modulo to test alignment but > modulo by a power-of-two isn't handled either. > > R

Re: [PATCH v4 2/2] arm: [MVE intrinsics] Improve vdupq_n implementation

2024-09-09 Thread Christophe Lyon
ping? On Tue, 30 Jul 2024 at 23:41, Christophe Lyon wrote: > > Hi, > > v4 of patch 2/2 fixes a small mistake in 3 testcases, by relaxing the > expected q0 as result register into q[0-9]+ to account for codegen > differences depending on if the test is compiled with > -mfloat-abi=softfp or -mfloat

Re: [PATCH v1] Vect: Support form 1 of vector signed integer .SAT_ADD

2024-09-09 Thread Richard Biener
On Fri, Aug 30, 2024 at 12:16 PM wrote: > > From: Pan Li > > This patch would like to support the vector signed ssadd pattern > for the RISC-V backend. Aka > > Form 1: > #define DEF_VEC_SAT_S_ADD_FMT_1(T, UT, MIN, MAX) \ > void __attribute__((noinline))

Re: [PATCH v2 1/2] Genmatch: Support control flow graph case 1 for phi on condition

2024-09-09 Thread Richard Biener
On Thu, Sep 5, 2024 at 2:01 PM wrote: > > From: Pan Li > > The gen_phi_on_cond can only support below control flow for cond > from day 1. Aka: > > +--+ > | def | > | ... | +-+ > | cond |-->| def | > +--+ | ... | >| +-+ >| | >v

Re: [PATCH v1] Match: Support form 2 for scalar signed integer .SAT_ADD

2024-09-09 Thread Richard Biener
On Tue, Sep 3, 2024 at 2:34 PM wrote: > > From: Pan Li > > This patch would like to support the form 2 of the scalar signed > integer .SAT_ADD. Aka below example: > > Form 2: > #define DEF_SAT_S_ADD_FMT_2(T, UT, MIN, MAX) \ > T __attribute__((noinline)) \ > sat_s_add_##T##_fmt
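The macro in the message above is cut off, so it is not reproduced here. As background, this is a minimal sketch of branch-based signed saturating addition for `int8_t`, written in the general shape such patterns take. The function name `sat_s_add_i8` is hypothetical, and the exact expression matched by the patch may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Signed saturating add for int8_t (hypothetical helper).
   The narrowing conversion back to int8_t is implementation-defined
   in ISO C; GCC and Clang wrap modulo 2^8, which is assumed here. */
int8_t sat_s_add_i8(int8_t x, int8_t y)
{
    /* Add in the unsigned type so that wrap-around is well defined. */
    int8_t sum = (int8_t)(uint8_t)((uint8_t)x + (uint8_t)y);

    /* Overflow is impossible when x and y have different signs, and
       it did not happen when the sum keeps the sign of x. */
    if ((x ^ y) < 0 || (sum ^ x) >= 0)
        return sum;

    /* Otherwise clamp towards the sign of the operands. */
    return x < 0 ? INT8_MIN : INT8_MAX;
}
```

For example, `sat_s_add_i8(127, 1)` clamps to `INT8_MAX` and `sat_s_add_i8(-128, -1)` clamps to `INT8_MIN`, while in-range sums are returned unchanged.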

Re: [PATCH v1 2/2] Match: Add int type fits check for form 2 of .SAT_SUB imm operand

2024-09-09 Thread Richard Biener
On Mon, Sep 2, 2024 at 7:53 AM wrote: > > From: Pan Li > > This patch would like to add strict check for imm operand of .SAT_SUB > matching. We have no type checking for imm operand in previous, which > may result in unexpected IL to be catched by .SAT_SUB pattern. > > We leverage the int_fits_t

[PATCH] RISC-V: Implement TARGET_CAN_INLINE_P

2024-09-09 Thread Yangyu Chen
Currently, we lack support for TARGET_CAN_INLINE_P on the RISC-V ISA. As a result, certain functions cannot be optimized with inlining when specific options, such as __attribute__((target("arch=+v"))), are used. This can lead to potential performance issues when building retargetable binaries for RISC-V. T

Re: [PATCH] middle-end: also optimized `popcount(a) <= 1` [PR90693]

2024-09-09 Thread Richard Biener
On Fri, Aug 30, 2024 at 2:09 AM Andrew Pinski wrote: > > This expands on optimizing `popcount(a) == 1` to also handle > `popcount(a) <= 1`. `<= 1` can be expanded as `(a & -a) == 0` > like what is done for `== 1` if we know that a was nonzero. > We have to do the optimization in 2 places due to if
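For reference, a minimal C sketch of the textbook bit trick behind tests of this kind. It uses `a & (a - 1)`, which clears the lowest set bit; this is the standard identity and is an assumption here, not the expansion taken from the patch. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* popcount(a) <= 1: clearing the lowest set bit leaves nothing.
   Also true for a == 0, since 0 & (0 - 1) is 0. */
bool at_most_one_bit(unsigned a)
{
    return (a & (a - 1)) == 0;
}

/* popcount(a) == 1: the same test plus an explicit nonzero check. */
bool exactly_one_bit(unsigned a)
{
    return a != 0 && (a & (a - 1)) == 0;
}
```

The two predicates differ only for `a == 0`, which is why knowing that `a` is nonzero lets one test stand in for the other.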

Re: [PATCH v3] match: Fix A || B not optimized to true when !B implies A [PR114326]

2024-09-09 Thread Richard Biener
On Thu, Aug 29, 2024 at 9:03 AM wrote: > > From: kelefth > > In expressions like (a != b || ((a ^ b) & c) == d) and > (a != b || (a ^ b) == c), (a ^ b) is folded to false. > In the equivalent expressions (((a ^ b) & c) == d || a != b) and > ((a ^ b) == c || a != b) this is not happening. > > This
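The folding mentioned above rests on short-circuit evaluation: the right operand of `a != b || ...` only matters when `a == b`, and in that case `a ^ b` is 0. A minimal C sketch of that identity follows; the function names are hypothetical, and this illustrates the arithmetic only, not the match.pd change itself.

```c
#include <assert.h>
#include <stdbool.h>

/* Original form: the XOR is only relevant when a == b. */
bool with_xor(unsigned a, unsigned b, unsigned c)
{
    return a != b || (a ^ b) == c;
}

/* Equivalent form after substituting a ^ b == 0 on the a == b path. */
bool simplified(unsigned a, unsigned b, unsigned c)
{
    return a != b || c == 0;
}
```

Because `||` is commutative for side-effect-free operands, the same substitution is valid when the operands are written in the opposite order.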

Re: [PATCH] vect: Do not try to duplicate_and_interleave one-element mode.

2024-09-09 Thread Richard Sandiford
Richard Biener writes: >> On 06.09.2024 at 16:05, Robin Dapp wrote: >> >> Hi, >> >> PR112694 shows that we try to create sub-vectors of single-element >> vectors because can_duplicate_and_interleave_p returns true. > > Can we avoid querying the function? CCing Richard who should know more ab

Re: [PATCH v3 08/12] OpenMP: Reject other properties with kind(any)

2024-09-09 Thread Jakub Jelinek
On Sun, Sep 08, 2024 at 09:15:23AM -0600, Sandra Loosemore wrote: > On 8/16/24 06:58, Jakub Jelinek wrote: > > > > If this can apply (perhaps with small fuzz) to vanilla trunk, guess it can > > be committed right now, doesn't need to wait for the rest of the > > metadirective patch set. > > OK.

[PATCH] tree-optimization/116647 - wrong classified double reduction

2024-09-09 Thread Richard Biener
The following avoids classifying a double reduction that's not actually a reduction in the outer loop (because its value isn't used outside of the outer loop). This avoids us ICEing on the unexpected stmt/SLP node arrangement. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Re: [PATCH v3 03/12] libgomp: runtime support for target_device selector

2024-09-09 Thread Tobias Burnus
Hi all, Jakub Jelinek wrote: On Sat, Jul 20, 2024 at 02:42:22PM -0600, Sandra Loosemore wrote: This patch implements the libgomp runtime support for the dynamic target_device selector via the GOMP_evaluate_target_device function. […] Now for kind, isa and arch traits in the target_device set

Re: [PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-09 Thread Jin Ma
> I see, I can reproduce this when build "-march=rv64gcv -mabi=lp64d -flto -O0 > test.c -o test.elf". > > #include > > int > main () > { > size_t vl = 8; > vint32m1_t vs1 = {}; > vint32m1_t vs2 = {}; > vint32m1_t vd = __riscv_vadd_vv_i32m1(vs1, vs2, vl); > > return (int)&vd; > } > >

Re: [patch,reload,v3] PR116326 Introduce RELOAD_ELIMINABLE_REGS + docs

2024-09-09 Thread Georg-Johann Lay
On 09.09.24 at 09:08, Richard Biener wrote: On Sun, Sep 8, 2024 at 12:22 PM Georg-Johann Lay wrote: The reason for PR116326 is that LRA and reload require different ELIMINABLE_REGS for a multi-register frame pointer. As ELIMINABLE_REGS is used to initialize static const objects, it is not po

Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Richard Sandiford
Kyrylo Tkachov writes: >> On 9 Sep 2024, at 11:06, Tamar Christina wrote: >> >> External email: Use caution opening links or attachments >> >> >>> -Original Message- >>> From: Richard Sandiford >>> Sent: Monday, September 9, 2024 9:29 AM >>> To: Tamar Christina >>> Cc: gcc-patches@gc

Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Kyrylo Tkachov
> On 9 Sep 2024, at 11:06, Tamar Christina wrote: > > External email: Use caution opening links or attachments > > >> -Original Message- >> From: Richard Sandiford >> Sent: Monday, September 9, 2024 9:29 AM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnsha

[PATCH v3] RISC-V: Fixed incorrect semantic description in DF to DI pattern in the Zfa extension on rv32.

2024-09-09 Thread Jin Ma
gcc/ChangeLog: * config/riscv/riscv.md: Change "truncate" to unspec for the Zfa extension on rv32. gcc/testsuite/ChangeLog: * gcc.target/riscv/zfa-fmovh-fmovp-bug.c: New test. --- gcc/config/riscv/riscv.md| 16 +--- .../gcc.target/riscv/zfa-f

RE: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Tamar Christina
> -Original Message- > From: Richard Sandiford > Sent: Monday, September 9, 2024 9:29 AM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; ktkac...@gcc.gnu.org > Subject: Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE. > > Tamar C

Re: [PATCH] SVE intrinsics: Fold svdiv with all-zero operands to zero vector

2024-09-09 Thread Richard Sandiford
Jennifer Schmitz writes: > This patch folds svdiv where one of the operands is all-zeros to a zero > vector, if the predicate is ptrue or the predication is _x or _z. > This case was not covered by the recent patch that implemented constant > folding, because that covered only cases where both ope

Re: [PATCH 4/4]AArch64: Define VECTOR_STORE_FLAG_VALUE.

2024-09-09 Thread Richard Sandiford
Tamar Christina writes: > Hi All, > > This defines VECTOR_STORE_FLAG_VALUE to CONST1_RTX for AArch64 > so we simplify vector comparisons in AArch64. > > With this enabled > > res: > moviv0.4s, 0 > cmeqv0.4s, v0.4s, v0.4s > ret > > is simplified to: > > res: >

Re: [PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-09 Thread Jin Ma
> > I think this is wrong and needs to be fixed, maybe we shouldn't > > use "ggc_alloc ()", or is there another better > > way to implement it? > > From the root we're marking the registered_functions vector via > the > > template > void > gt_ggc_mx (vec *v) > > overload which will eventually ma

Re: [PATCH] gimple-fold: Move optimizing memcpy to memset to fold_stmt from fab

2024-09-09 Thread Richard Biener
On Sat, Sep 7, 2024 at 1:31 AM Andrew Pinski wrote: > > I noticed this folding inside fab could be done else where and could > even improve inlining decisions and a few other things so let's > move it to fold_stmt. > It also fixes PR 116601 because places which call fold_stmt already > have to dea

[committed] testsuite: Fix up pr116588.c test [PR116588]

2024-09-09 Thread Jakub Jelinek
On Sat, Sep 07, 2024 at 01:58:46PM -0400, Andrew MacLeod wrote: The test as committed without the tree-vrp.cc change only FAILs with FAIL: gcc.dg/pr116588.c scan-tree-dump-not vrp2 "0 != 0" The DEBUG code in there was just to make it easier to debug, but doesn't actually fail when the test is misco

[PATCH] Amend gcc.dg/vect/fast-math-vect-call-2.c

2024-09-09 Thread Richard Biener
There was a reported regression on x86-64 with -march=cascadelake and -m32 where epilogue vectorization causes a different number of SLPed loops. Fixed by disabling epilogue vectorization for the testcase. tested on x86_64-unknown-linux-gnu, pushed. * gcc.dg/vect/fast-math-vect-call-2.c:

Re: Match: Fix ordered and nonequal: Fix 'gcc.dg/opt-ordered-and-nonequal-1.c' re 'LOGICAL_OP_NON_SHORT_CIRCUIT' [PR116635] (was: [PATCH] Match: Fix ordered and nonequal)

2024-09-09 Thread Richard Biener
On Mon, Sep 9, 2024 at 8:48 AM Thomas Schwinge wrote: > > Hi! > > On 2024-09-04T13:43:45+0800, "Hu, Lin1" wrote: > > --- /dev/null > > +++ b/gcc/testsuite/gcc.dg/opt-ordered-and-nonequal-1.c > > @@ -0,0 +1,49 @@ > > +/* { dg-do compile } */ > > +/* { dg-options "-O2 -fdump-tree-forwprop1-details"

Re: [patch, fortran] Matmul and dot_product for unsigned

2024-09-09 Thread Richard Biener
On Sun, Sep 8, 2024 at 10:32 PM Thomas Koenig wrote: > > Hello world, > > like the subject says. The patch is gzipped because it is large; > it contains multiple MATMUL library implementations. > > OK for trunk? > > Implement MATMUL and DOT_PRODUCT for unsigned. Is the library implementation in

Re: [PATCH] phiopt: Small refactoring/cleanup of non-ssa name case of factor_out_conditional_operation

2024-09-09 Thread Richard Biener
On Sun, Sep 8, 2024 at 6:20 PM Andrew Pinski wrote: > > This small cleanup removes a redundant check for gimple_assign_cast_p and > reformats > based on that. Also changes the if statement that checks if the integral type > and the > check to see if the constant fits into the new type such that

Re: [PATCH] RISC-V: Fix ICE for rvv in lto

2024-09-09 Thread Richard Biener
On Fri, Sep 6, 2024 at 7:31 PM Jin Ma wrote: > > When we use flto, the function list of rvv will be generated twice, > once in the cc1 phase and once in the lto phase. However, due to > the different generation methods, the two lists are different. > > For example, when there is no zvfh or zvfhmin

Re: [patch,reload,v2] PR116326 Introduce RELOAD_ELIMINABLE_REGS

2024-09-09 Thread Richard Biener
On Sun, Sep 8, 2024 at 12:22 PM Georg-Johann Lay wrote: > > The reason for PR116326 is that LRA and reload require different > ELIMINABLE_REGS for a multi-register frame pointer. As ELIMINABLE_REGS > is used to initialize static const objects, it is not possible to make > ELIMINABLE_REGS dependen