Re: [PATCH 2/2] RISC-V: Add cmpmemsi expansion

2024-05-14 Thread Christoph Müllner
On Thu, May 9, 2024 at 4:50 PM Jeff Law wrote: > > > > On 5/7/24 11:52 PM, Christoph Müllner wrote: > > GCC has a generic cmpmemsi expansion via the by-pieces framework, > > which shows some room for target-specific optimizations. > > E.g. for comparing two aligned memory blocks of 15 bytes > > we

[PATCH v2 2/2] RISC-V: strcmp expansion: Use adjust_address() for address calculation

2024-05-14 Thread Christoph Müllner
We have an arch-independent routine to generate an address with an offset. Let's use that instead of doing the calculation in the backend. gcc/ChangeLog: * config/riscv/riscv-string.cc (emit_strcmp_scalar_load_and_compare): Use adjust_address() to calculate MEM-PLUS pattern. Sign

[PATCH v2 1/2] RISC-V: Add cmpmemsi expansion

2024-05-14 Thread Christoph Müllner
GCC has a generic cmpmemsi expansion via the by-pieces framework, which shows some room for target-specific optimizations. E.g. for comparing two aligned memory blocks of 15 bytes we get the following sequence: my_mem_cmp_aligned_15: li a4,0 j .L2 .L8: bgeua4
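For readers who want to see what kind of source leads to such a sequence, here is a minimal sketch. Only the label my_mem_cmp_aligned_15 appears in the excerpt; the function body, the 8-byte alignment and the use of __builtin_assume_aligned are assumptions made for illustration, not the patch's actual test case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reconstruction: the excerpt shows only the assembly label,
   so the body and the 8-byte alignment below are assumptions.  */
int
my_mem_cmp_aligned_15 (const char *p1, const char *p2)
{
  assert (p1 != NULL && p2 != NULL);
  /* Promise the compiler that both blocks are 8-byte aligned.  */
  p1 = __builtin_assume_aligned (p1, 8);
  p2 = __builtin_assume_aligned (p2, 8);
  /* A constant length gives the compiler the chance to expand the
     comparison inline instead of calling memcmp.  */
  return __builtin_memcmp (p1, p2, 15);
}
```

Whether the comparison is expanded inline or turned into a library call depends on the target and the optimization level.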

[PATCH] RISC-V: Fix cbo.zero expansion for rv32

2024-05-14 Thread Christoph Müllner
Emitting a DI pattern won't find a match for rv32 and manifests in the failing test case gcc.target/riscv/cmo-zicboz-zic64-1.c. Let's fix this in the expansion and also address the different code that gets generated for rv32/rv64. gcc/ChangeLog: * config/riscv/riscv-string.cc (riscv_expan

RE: [PATCH 0/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili
Sorry, please ignore these patches. Regards, Lili. > -Original Message- > From: Cui, Lili > Sent: Wednesday, May 15, 2024 2:24 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com > Subject: [PATCH 0/2] Support APX zero-upper > > > A bug was found when adding operand

[PATCH 2/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili
gas/ChangeLog: * config/tc-i386.c (build_apx_evex_prefix): Handle ZU. * testsuite/gas/i386/x86-64.exp: Added new tests for ZU. * testsuite/gas/i386/x86-64.exp: Added new tests for ZU. * testsuite/gas/i386/x86-64-apx-zu-intel.d: New test. * testsuite/gas/i386

[PATCH 1/2] Add check for 8-bit old registers in EVEX format

2024-05-14 Thread Cui, Lili
gas/ChangeLog: * config/tc-i386.c (md_assemble): Add invalid check for old byte registers in EVEX/VEX format. * testsuite/gas/i386/x86-64-apx-inval.l: Add new test. * testsuite/gas/i386/x86-64-apx-inval.s: Ditto. --- gas/config/tc-i386.c | 12 +

[PATCH 0/2] Support APX zero-upper

2024-05-14 Thread Cui, Lili
A bug was found when adding operand %ah to an invalid test case, so patch 1/2 was added to fix it. And made the following changes to the old patch. 1. Removed two redundant judgment codes in zu. 2. Added various types of register sizes in invalid test cases (found a bug about AH/BH/CH/DH). 3.

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 13:14, Richard Biener wrote: > > > > On Tue, 14 May 2024, Qing Zhao wrote: > > > >> > >> > >>> On May 14, 2024, at 10:29, Richard Biener wrote: > >>> > > [...] > >>> It would of course > >>> need experimenting since we can

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread HAO CHEN GUI
Hi Andrew, Thanks so much for your explanation. I got it. I will address the issue. Thanks Gui Haochen On 2024/5/15 2:45, Andrew MacLeod wrote: > > On 5/9/24 04:47, HAO CHEN GUI wrote: >> Hi Mikael, >> >> Thanks for your comments. >> >> On 2024/5/9 16:03, Mikael Morin wrote: >>> I think the canonic

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Kees Cook wrote: > On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote: > > The current major issue with the warning is: the constant index value 4 > > is not in the source code, it’s a compiler generated intermediate value > > (even though it’s a correct value -:)). Su

Re: [PATCH] RISC-V: Add Zvfbfwma extension to the -march= option

2024-05-14 Thread Kito Cheng
LGTM, I agree we should only implement what Embedded Processor implies, we have no way to know that from the arch string On Wed, May 15, 2024 at 1:35 PM Xiao Zeng wrote: > > This patch would like to add new sub extension (aka Zvfbfwma) to the > -march= option. It introduces a new data type BF

[PATCH] RISC-V: Add Zvfbfwma extension to the -march= option

2024-05-14 Thread Xiao Zeng
This patch would like to add new sub extension (aka Zvfbfwma) to the -march= option. It introduces a new data type BF16. 1 In spec: "Zvfbfwma requires the Zvfbfmin extension and the Zfbfmin extension." 1.1 In EmbeddedProcessor: Zvfbfwma -> Zvfbfmin -> Zve32f 1.2 In Application Processor: Z

Re: [PATCH 1/2] RISC-V: Add tests for cpymemsi expansion

2024-05-14 Thread Christoph Müllner
On Fri, May 10, 2024 at 6:01 AM Patrick O'Neill wrote: > > Hi Christoph, > > cpymemsi-1.c fails on a subset of newlib targets. > > "UNRESOLVED: gcc.target/riscv/cpymemsi-1.c -O0 compilation failed to > produce executable" > > Full list of failing targets here (New Failures section): > https://g

[committed] Fix rv32 issues with recent zicboz work

2024-05-14 Thread Jeff Law
I should have double-checked the CI system before pushing Christoph's patches for memset-zero. While I thought I'd checked CI state, I must have been looking at the wrong patch from Christoph. Anyway, this fixes the rv32 ICEs and disables one of the tests for rv32. The test would need a revam

RE: [PATCH 0/2] Align tight loops to solve cross cacheline issue

2024-05-14 Thread Jiang, Haochen
Also cc Honza and Richard since we touched generic tune. Thx, Haochen > -Original Message- > From: Haochen Jiang > Sent: Wednesday, May 15, 2024 11:04 AM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com > Subject: [PATCH 0/2] Align tight loops to solve cross cacheline

[PATCH 2/2] Align tight&hot loop without considering max skipping bytes.

2024-05-14 Thread Haochen Jiang
From: liuhongt When a hot loop is small enough to fit into one cacheline, we should align the loop with ceil_log2 (loop_size) without considering the maximum skip bytes. This helps code prefetch. gcc/ChangeLog: * config/i386/i386.cc (ix86_avoid_jump_mispredicts): Change gen_pad to
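The arithmetic behind this can be sketched in a few lines of C. This is an illustration only, not code from the patch: the helper names and the 64-byte cacheline used in the usage note are assumptions, and the cacheline size is assumed to be a power of two.

```c
#include <assert.h>

/* Smallest k such that 2^k >= n.  */
unsigned
ceil_log2_u32 (unsigned n)
{
  assert (n != 0 && n <= (1u << 31));
  unsigned k = 0;
  while ((1u << k) < n)
    k++;
  return k;
}

/* Hypothetical helper: alignment in bytes for a loop of LOOP_SIZE bytes,
   or 0 if the loop does not fit into one cacheline at all.  */
unsigned
tight_loop_alignment (unsigned loop_size, unsigned cacheline_size)
{
  if (loop_size == 0 || loop_size > cacheline_size)
    return 0;
  return 1u << ceil_log2_u32 (loop_size);
}

/* Nonzero if the byte range [start, start + size) touches two cachelines.  */
int
crosses_cacheline (unsigned start, unsigned size, unsigned cacheline_size)
{
  return start / cacheline_size != (start + size - 1) / cacheline_size;
}
```

With a 64-byte cacheline, a 24-byte loop starting at offset 48 (16-byte aligned) straddles two lines, while the same loop aligned to 1 << ceil_log2 (24) = 32 bytes always stays within one.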

[PATCH 1/2] Adjust generic loop alignment from 16:11:8 to 16 for Intel processors

2024-05-14 Thread Haochen Jiang
Previously, we used 16:11:8 in generic tune for Intel processors, which led to cross-cacheline issues and resulted in random performance penalties, commit to commit, in benchmarks with small loops. Always aligning to 16 bytes mitigates the issue. gcc/ChangeLog:

[PATCH 0/2] Align tight loops to solve cross cacheline issue

2024-05-14 Thread Haochen Jiang
Hi all, Recently, we have encountered several random performance regressions in benchmarks commit to commit. They are caused by a cross-cacheline issue for tight loops. We are trying to solve the issue with two patches. One is adjusting the loop alignment for generic tune, the other is aligning tight an

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread HAO CHEN GUI
Hi Jakub, Thanks for your review comments. On 2024/5/14 23:57, Jakub Jelinek wrote: > BUILT_IN_ISFINITE is just one of many BUILT_IN_IS... builtins, > would be nice to handle the others as well. > > E.g. isnormal/isnan/isinf, fpclassify etc. > Yes, I already sent the patches which add range op for

Re: [PATCH] report message for operator %a on unaddressible operand

2024-05-14 Thread Jiufu Guo
Hi, Sorry for omitting the "V2" tag. Following the previous comments, this version: 1. uses different 'tests' for the invalid case, and puts the message in a better and safer position. 2. refines the wording of the message. 3. updates the test case a little to address the comments. BR, Jeff(Jiufu) Guo Jiufu G

[PATCH] report message for operator %a on unaddressible operand

2024-05-14 Thread Jiufu Guo
Hi, For PR96866, when printing asm code for modifier "%a", an addressable operand is required. However, the constraint "X" allows any kind of operand, even one whose address is hard to get directly, e.g. an extern symbol whose address is in the TOC. An error message would be reported to indicate the invalid

Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-14 Thread Jiufu Guo
Hi, Segher Boessenkool writes: > On Tue, May 14, 2024 at 05:53:56PM +0800, Jiufu Guo wrote: >> Thanks so much for your great review! >> Reference other messages, I'm wondering "invalid %%a value" may be >> acceptable, or "invalid %%a address expression in TOC" maybe better. > > "%%a requires a

RE: [PATCH] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-14 Thread Hu, Lin1
> -Original Message- > From: Richard Biener > Sent: Tuesday, May 14, 2024 8:23 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH] vect: generate suitable convert insn for int -> int, > float -> > float and int <-> float. > > On Tue

[PATCH v5 2/3] Vect: Support new IFN SAT_ADD for unsigned vector int

2024-05-14 Thread pan2 . li
From: Pan Li For vectorization, we leverage the existing vect pattern recog to find the pattern similar to the scalar one and let the vectorizer perform the rest for the standard name usadd3 in vector mode. The riscv vector backend has the insn "Vector Single-Width Saturating Add and Subtract" which can be

Re: [PATCH] Don't reduce estimated unrolled size for innermost loop.

2024-05-14 Thread Hongtao Liu
On Mon, May 13, 2024 at 3:40 PM Richard Biener wrote: > > On Mon, May 13, 2024 at 4:29 AM liuhongt wrote: > > > > As testcase in the PR, O3 cunrolli may prevent vectorization for the > > innermost loop and increase register pressure. > > The patch removes the 1/3 reduction of unr_insn for innermo

[PATCH v5 3/3] RISC-V: Implement IFN SAT_ADD for both the scalar and vector

2024-05-14 Thread pan2 . li
From: Pan Li The patch implements SAT_ADD in the riscv backend as a sample for both scalar and vector. Take the vector case below as an example: void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n) { unsigned i; for (i = 0; i < n; i++) out[i] = (x[i] + y[i]) | (- (u

[PATCH v5 1/3] Internal-fn: Support new IFN SAT_ADD for unsigned scalar int

2024-05-14 Thread pan2 . li
From: Pan Li This patch adds the middle-end representation for saturating add, i.e. setting the result of the add to the maximum value on overflow. It matches a pattern similar to the one below. SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)) Take uint8_t as example, we will have: * SAT_AD
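The branchless formula quoted above can be tried out directly in C. The sketch below instantiates it with TYPE = uint8_t and checks it exhaustively against a straightforward widening reference; the function names are made up for illustration and do not come from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x)), TYPE = uint8_t.  */
uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);               /* wraps modulo 256 */
  uint8_t mask = (uint8_t) -(uint8_t) (sum < x); /* 0xff on overflow, else 0 */
  return sum | mask;
}

/* Reference: add in a wider type and clamp.  */
uint8_t
sat_add_u8_ref (uint8_t x, uint8_t y)
{
  unsigned wide = (unsigned) x + y;
  return wide > UINT8_MAX ? UINT8_MAX : (uint8_t) wide;
}

/* Compare both versions on all 65536 input pairs.  */
void
check_sat_add_u8 (void)
{
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      assert (sat_add_u8 ((uint8_t) x, (uint8_t) y)
              == sat_add_u8_ref ((uint8_t) x, (uint8_t) y));
}
```

The explicit (uint8_t) casts matter: without them C's integer promotion performs the addition in int, the sum never wraps, and the overflow test sum < x never fires.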

Re: [PATCH] RISC-V: Implement -m{,no}fence-tso

2024-05-14 Thread Jeff Law
On 5/14/24 5:13 PM, Palmer Dabbelt wrote: Some processors from T-Head don't implement the `fence.tso` instruction natively and instead trap to firmware. This breaks some users who haven't yet updated the firmware and one could imagine it breaking users who are trying to build firmware if they

[PATCH] [x86] Optimize ashift >> 7 to vpcmpgtb for vector int8.

2024-05-14 Thread liuhongt
Since there is no corresponding instruction, the shift operation for vector int8 is implemented using the instructions for vector int16, but for some special shift counts, it can be transformed into vpcmpgtb. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. Ready push to trunk. gcc/Chang
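The special case in the subject line can be checked one lane at a time. The sketch below assumes the shift in question is an arithmetic right shift by 7 and that >> on a negative signed value is arithmetic (as GCC documents); it models a single int8 lane in scalar C rather than using real vector instructions, and the helper names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of an arithmetic right shift by 7: the sign bit is copied
   into every bit, giving -1 for negative inputs and 0 otherwise.  */
int8_t
lane_ashr7 (int8_t x)
{
  return (int8_t) (x >> 7);
}

/* One lane of a vector "0 > x" compare: all ones when true, else zero.  */
int8_t
lane_cmpgt_zero (int8_t x)
{
  return (int8_t) -(0 > x);
}

/* Both forms agree on every possible lane value.  */
void
check_all_lanes (void)
{
  for (int v = INT8_MIN; v <= INT8_MAX; v++)
    assert (lane_ashr7 ((int8_t) v) == lane_cmpgt_zero ((int8_t) v));
}
```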

Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-05-14 Thread Andi Kleen
> > > You need a template testcase; I expect it doesn't work in templates with > > > the > > > current patch. It's probably enough to copy it in tsubst_expr where we > > > currently propagate CALL_EXPR_OPERATOR_SYNTAX. > > > > I tried it with the appended test case, everything seems to work with

[PATCH] RISC-V: Implement -m{,no}fence-tso

2024-05-14 Thread Palmer Dabbelt
Some processors from T-Head don't implement the `fence.tso` instruction natively and instead trap to firmware. This breaks some users who haven't yet updated the firmware and one could imagine it breaking users who are trying to build firmware if they're using the C memory model. So just add an o

Re: Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])

2024-05-14 Thread Vineet Gupta
On 5/14/24 15:12, Palmer Dabbelt wrote: > On Mon, 13 May 2024 16:08:21 PDT (-0700), Vineet Gupta wrote: >> >> On 5/13/24 15:47, Jeff Law wrote: On 5/13/24 11:49, Vineet Gupta wrote: > 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 | > 500.perlbench_r-1 |740,383,4

Re: [PATCH] report message for operator %a on unaddressible exp

2024-05-14 Thread Jiufu Guo
Hi, Segher Boessenkool writes: > Oh, btw: > > On Tue, May 14, 2024 at 11:00:38AM +0800, Jiufu Guo wrote: >> >> --- a/gcc/config/rs6000/rs6000.cc >> >> +++ b/gcc/config/rs6000/rs6000.cc >> >> @@ -14659,6 +14659,12 @@ print_operand_address (FILE *file, rtx x) >> >>else if (SYMBOL_REF_P (x) ||

Re: [Patch, aarch64] v3: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-14 Thread Ajit Agarwal
Hello Alex: On 13/05/24 8:49 pm, Alex Coplan wrote: > Hi Ajit, > > Why did you send three mails for this revision of the patch? If you're > going to send a new revision of the patch you should increment the > version number and outline the changes / reasons for the new revision. > There were i

Re: [PATCH] c++: lvalueness of non-dependent assignment [PR114994]

2024-05-14 Thread Jason Merrill
On 5/11/24 20:46, Patrick Palka wrote: On Fri, 10 May 2024, Jason Merrill wrote: On 5/9/24 16:23, Patrick Palka wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for trunk/14? For trunk as a follow-up I can implement the mentioned representation change to use CALL_E

Re: [PATCH] c++: Strengthen checks on 'main'

2024-05-14 Thread Jason Merrill
On 5/11/24 08:32, Nathaniel Shead wrote: I wasn't entirely sure what to do with the 'abi/main.C' testcase here; is this OK, or should I e.g. lower the linkage error to a pedwarn for the purposes of this test? I think it should be a pedwarn anyway, since it's harmless. The others can still be

Re: [PATCH] c++/modules: Remember that header units have CMIs

2024-05-14 Thread Jason Merrill
On 5/12/24 22:58, Nathaniel Shead wrote: Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? OK. -- >8 -- This appears to be an oversight in the definition of module_has_cmi_p; this comes up transitively in other functions used for e.g. determining whether a name could potential

Re: [PATCH] c++: Optimize in maybe_clone_body aliases even when not at_eof [PR113208]

2024-05-14 Thread Jason Merrill
On 5/13/24 06:19, Jakub Jelinek wrote: On Fri, May 10, 2024 at 03:59:25PM -0400, Jason Merrill wrote: 2024-05-09 Jakub Jelinek Jason Merrill PR lto/113208 * cp-tree.h (maybe_optimize_cdtor): Remove. * decl2.cc (tentative_decl_linkage): Call maybe_make_on

Re: Follow up #1 (was Re: [PATCH v2 1/2] RISC-V: avoid LUI based const materialization ... [part of PR/106265])

2024-05-14 Thread Palmer Dabbelt
On Mon, 13 May 2024 16:08:21 PDT (-0700), Vineet Gupta wrote: On 5/13/24 15:47, Jeff Law wrote: On 5/13/24 11:49, Vineet Gupta wrote: 500.perlbench_r-0 | 1,214,534,029,025 | 1,212,887,959,387 | 500.perlbench_r-1 |740,383,419,739 | 739,280,308,163 | 500.perlbench_r-2 |692,074,

Re: [PATCH] testsuite: c++: Allow for std::printf in g++.dg/modules/stdio-1_a.H [PR98529]

2024-05-14 Thread Jason Merrill
On 5/13/24 08:10, Rainer Orth wrote: Hi Nathaniel, There are a couple of other tests that appear to potentially have a similar issue: global-2_a.C 21:// { dg-final { scan-lang-dump-not {Reachable GMF '::printf[^\n']*' added} module } } global-3_a.C 15:// { dg-final { scan-lang-dump-not {Reach

Re: [PATCH v2] c++: ICE with reference NSDMI [PR114854]

2024-05-14 Thread Jason Merrill
On 5/14/24 09:48, Marek Polacek wrote: On Thu, May 09, 2024 at 03:47:54PM -0400, Jason Merrill wrote: On 5/9/24 12:04, Marek Polacek wrote: Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- Here we crash on a cp_gimplify_expr/TARGET_EXPR assert: /* A TARGET_EXPR th

Re: [PATCH v2] c++: DR 569, DR 1693: fun with semicolons [PR113760]

2024-05-14 Thread Jason Merrill
On 5/14/24 12:55, Marek Polacek wrote: On Thu, May 09, 2024 at 12:44:52PM -0400, Jason Merrill wrote: On 5/9/24 12:16, Marek Polacek wrote: +static void +maybe_warn_extra_semi (location_t loc, extra_semi_kind kind) +{ + /* -Wno-extra-semi suppresses all. */ + if (warn_extra_semi == 0) +r

Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-05-14 Thread Jason Merrill
On 5/14/24 13:24, Andi Kleen wrote: Hi Jason, On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote: @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns) /* Maybe we don't expect to see any arguments for this attribute. */ const attribute_spe

Re: [PATCH] c++: add test for DR 2855

2024-05-14 Thread Jason Merrill
On 5/14/24 13:54, Marek Polacek wrote: Tested x86_64-pc-linux-gnu, OK to add such a test? OK. -- >8 -- Let int8_t x = 127; This DR says that while x++; invokes UB, ++x; does not. The resolution was to make the first one valid. The following test verifies that we don't report

PING Re: [PATCH RFA (cgraph)] c++: pragma target and static init [PR109753]

2024-05-14 Thread Jason Merrill
Ping On 5/2/24 09:54, Jason Merrill wrote: Tested x86_64-pc-linux-gnu, OK for trunk? 14.2? This two-year-old thread seems relevant: https://gcc.gnu.org/pipermail/gcc-patches/2022-April/593410.html -- 8< -- #pragma target and optimize should also apply to implicitly-generated functions li

[PATCH] [PATCH] Correct DLL Installation Path for x86_64-w64-mingw32 Multilib [PR115094]

2024-05-14 Thread trcrsired
From: trcrsired When building native GCC for the x86_64-w64-mingw32 host, the compiler copies its library DLLs to the `bin` directory. However, in the case of a multilib configuration, both 32-bit and 64-bit libraries end up in the same `bin` directory, leading to conflicts where 64-bit DLLs a

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Kees Cook
On Tue, May 14, 2024 at 02:17:16PM +, Qing Zhao wrote: > The current major issue with the warning is: the constant index value 4 > is not in the source code, it’s a compiler generated intermediate value > (even though it’s a correct value -:)). Such warning messages confuse > the end-users wit

[COMMITTED] pru: Implement TARGET_CLASS_LIKELY_SPILLED_P to fix PR115013

2024-05-14 Thread Dimitar Dimitrov
Commit r15-436-g44e7855e did not fix PR115013 for PRU because SMALL_REGISTER_CLASS_P is not returning an accurate value for the PRU backend. Word mode for PRU backend is defined as 8-bit, yet all ALU operations are preferred in 32-bit mode. Thus checking whether a register class contains a single

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread Andrew MacLeod
On 5/9/24 04:47, HAO CHEN GUI wrote: Hi Mikael, Thanks for your comments. On 2024/5/9 16:03, Mikael Morin wrote: I think the canonical API behaviour sets R to varying and returns true instead of just returning false if nothing is known about the range. I'm not sure whether it makes any diff

[COMMITTED] RISC-V: avoid LUI based const materialization ... [part of PR/106265]

2024-05-14 Thread Vineet Gupta
... if the constant can be represented as the sum of two S12 values. The two S12 values could instead be fused with the subsequent ADD insn. This helps - avoid an additional LUI insn - side benefits of not clobbering a reg e.g. w/o patch w/ patch long
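To make the "sum of two S12 values" condition concrete, here is a small sketch. It takes S12 to mean a signed 12-bit immediate, i.e. the range [-2048, 2047]; the helper names and the particular way of choosing the two halves are assumptions for illustration and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define S12_MIN (-2048)
#define S12_MAX 2047

/* True if V fits a signed 12-bit immediate.  */
bool
fits_s12 (long v)
{
  return v >= S12_MIN && v <= S12_MAX;
}

/* Hypothetical helper: split VAL into two S12 values whose sum is VAL.
   Returns false if VAL already fits S12 (no split needed) or lies
   outside [2 * S12_MIN, 2 * S12_MAX] (no split possible).  */
bool
split_sum_of_two_s12 (long val, long *first, long *second)
{
  if (fits_s12 (val) || val < 2 * S12_MIN || val > 2 * S12_MAX)
    return false;
  /* Take as much as possible in the first half; the rest must fit.  */
  *first = val > 0 ? S12_MAX : S12_MIN;
  *second = val - *first;
  assert (fits_s12 (*second));
  return true;
}
```

For example, 4000 splits into 2047 + 1953, so both parts can be used as immediates, whereas 4095 is one past the largest representable sum.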

Re: [Patch, aarch64] v4: Preparatory patch to place target independent and,dependent changed code in one file

2024-05-14 Thread Alex Coplan
Hi Ajit, Please can you pay careful attention to the review comments? In particular, you have ignored my comment about changing the access of member functions in ldp_bb_info several times now (on at least three patch reviews). Likewise on multiple occasions you've only partially implemented a pi

Re: [PATCH v8] C, ObjC: Add -Wunterminated-string-initialization

2024-05-14 Thread Alejandro Colomar
Ping. I see that GCC-14 has been released recently. This is a gentle ping to see if this is a better time for this patch. Have a lovely day! Alex

[PATCH] c++: add test for DR 2855

2024-05-14 Thread Marek Polacek
Tested x86_64-pc-linux-gnu, OK to add such a test? -- >8 -- Let int8_t x = 127; This DR says that while x++; invokes UB, ++x; does not. The resolution was to make the first one valid. The following test verifies that we don't report any errors in a constexpr context. DR 2855

Re: [PATCH 1/4] rs6000: Make all 128 bit scalar FP modes have 128 bit precision [PR112993]

2024-05-14 Thread Kewen.Lin
Hi Joseph and Richi, Thanks for the suggestions and comments! on 2024/5/10 14:31, Richard Biener wrote: > On Thu, May 9, 2024 at 9:12 PM Joseph Myers wrote: >> >> On Wed, 8 May 2024, Kewen.Lin wrote: >> >>> to widen IFmode to TFmode. To make build_common_tree_nodes >>> be able to find the corre

Re: [PATCH] AARCH64: Add Qualcomnm oryon-1 core

2024-05-14 Thread Kyrill Tkachov
Hi Andrew, On Fri, May 3, 2024 at 8:50 PM Andrew Pinski wrote: > This patch adds Qualcomm's new oryon-1 core; this is enough > to recognize the core and later on will add the tuning structure. > > gcc/ChangeLog: > > * config/aarch64/aarch64-cores.def (oryon-1): New entry. > * con

Re: [PATCH v5 2/5] C++: Support clang compatible [[musttail]] (PR83324)

2024-05-14 Thread Andi Kleen
Hi Jason, On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote: > > @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree > > attr_ns) > > /* Maybe we don't expect to see any arguments for this attribute. */ > > const attribute_spec *as > > = looku

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao
> On May 14, 2024, at 13:14, Richard Biener wrote: > > On Tue, 14 May 2024, Qing Zhao wrote: > >> >> >>> On May 14, 2024, at 10:29, Richard Biener wrote: >>> > [...] >>> It would of course >>> need experimenting since we can end up moving stmts and merging blocks >>> though the linear tra

Re: [PATCH v5 1/5] Improve must tail in RTL backend

2024-05-14 Thread Andi Kleen
> > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > index 3a6d4cceaba7..44af361e2925 100644 > > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > @@ -1,4 +1,5 @@ > >

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Richard Biener
On Tue, 14 May 2024, Qing Zhao wrote: > > > > On May 14, 2024, at 10:29, Richard Biener wrote: > > [...] > > It would of course > > need experimenting since we can end up moving stmts and merging blocks > > though the linear traces created by jump threading should be quite > > stable (as oppo

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread HAO CHEN GUI
Hi Mikael, Thanks for your comments. On 2024/5/9 16:03, Mikael Morin wrote: > I think the canonical API behaviour sets R to varying and returns true > instead of just returning false if nothing is known about the range. > > I'm not sure whether it makes any difference; Aldy can probably tell. But

Re: [PATCH v5 5/5] Add documentation for musttail attribute

2024-05-14 Thread Richard Biener
On Tue, May 14, 2024 at 6:30 PM Andi Kleen wrote: > > > Looks generally OK though does this mean people can debug > > programs using [[gnu::musttail]] only with optimized builds? It > > seems to me we should try harder to make [[gnu::musttail]] work > > at -O0 and generally behave the same at all

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 10:36 AM, Vineet Gupta wrote: On 5/14/24 08:44, Jeff Law wrote: On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran/

[PATCH v2] c++: DR 569, DR 1693: fun with semicolons [PR113760]

2024-05-14 Thread Marek Polacek
On Thu, May 09, 2024 at 12:44:52PM -0400, Jason Merrill wrote: > On 5/9/24 12:16, Marek Polacek wrote: > > +static void > > +maybe_warn_extra_semi (location_t loc, extra_semi_kind kind) > > +{ > > + /* -Wno-extra-semi suppresses all. */ > > + if (warn_extra_semi == 0) > > +return; > > + > >

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Kees Cook
On Mon, May 13, 2024 at 07:48:30PM +, Qing Zhao wrote: > The false positive warnings are moved from -Warray-bounds=1 to > -Warray-bounds=2 now. On a Linux kernel x86_64 allmodconfig build, this takes the -Warray-bounds warnings from 14 to 9. After examining these 9, I see: - 4: legitimate bu

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Vineet Gupta
On 5/14/24 08:44, Jeff Law wrote: > On 5/14/24 8:51 AM, Patrick O'Neill wrote: >>> I was able to find the summary info: >>> Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran/task2.f90   -O0  executi

Re: [PATCH 11/13] rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in

2024-05-14 Thread Kewen.Lin
Hi, on 2024/4/20 05:18, Carl Love wrote: > rs6000, remove __builtin_vsx_xvcmpeqsp_p built-in > > The built-in __builtin_vsx_xvcmpeqsp_p is a duplicate of the overloaded > __builtin_altivec_vcmpeqfp_p built-in. The built-in is undocumented and > there are no test cases for it. The patch removes

Re: [PATCH v5 5/5] Add documentation for musttail attribute

2024-05-14 Thread Andi Kleen
> Looks generally OK though does this mean people can debug > programs using [[gnu::musttail]] only with optimized builds? It > seems to me we should try harder to make [[gnu::musttail]] work > at -O0 and generally behave the same at all optimization levels? Yes that's a fair point. The problem i

Re: [r15-429 Regression] FAIL: experimental/simd/pr109261_constexpr_simd.cc -msse2 -O2 -Wno-psabi (test for excess errors) on Linux/x86_64

2024-05-14 Thread Matthias Kretz
On Dienstag, 14. Mai 2024 17:42:09 MESZ Jiang, Haochen wrote: > Hi Matthias, > > From my side, I get several error like this: > > /export/users/haochenj/src/gcc-bisect/master/master/r15-429/bld/x86_64-linux > /32/libstdc++-v3/include/experimental/bits/simd_builtin.h:131: error: could > not conver

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao
> On May 14, 2024, at 11:08, Jeff Law wrote: > > > > On 5/14/24 8:57 AM, Qing Zhao wrote: >>> On May 13, 2024, at 20:14, Kees Cook wrote: >>> >>> On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: > But it makes no sense

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread Jakub Jelinek
On Tue, May 07, 2024 at 10:37:55AM +0800, HAO CHEN GUI wrote: > The former patch adds isfinite optab for __builtin_isfinite. > https://gcc.gnu.org/pipermail/gcc-patches/2024-April/649339.html > > Thus the builtin might not be folded at front end. The range op for > isfinite is needed for value

New Swedish PO file for 'gcc' (version 14.1.0)

2024-05-14 Thread Translation Project Robot
Hello, gentle maintainer. This is a message from the Translation Project robot. A revised PO file for textual domain 'gcc' has been submitted by the Swedish team of translators. The file is available at: https://translationproject.org/latest/gcc/sv.po (This file, 'gcc-14.1.0.sv.po', has ju

Re: [PATCHv2] Value range: Add range op for __builtin_isfinite

2024-05-14 Thread Andrew MacLeod
On 5/13/24 22:16, HAO CHEN GUI wrote: Hi Aldy, Thanks for your review comments. On 2024/5/13 19:18, Aldy Hernandez wrote: +//Implement range operator for CFN_BUILT_IN_ISFINITE +class cfn_isfinite : public range_operator +{ +public: + using range_operator::fold_range; + using range_operator::

[PATCH] tree-cfg: Move the returns_twice check to be last statement only [PR114301]

2024-05-14 Thread Andrew Pinski
While checking that all of the bugs dealing with the case where gimple_can_duplicate_bb_p would return false were fixed, I noticed that the code which was checking if a call statement was returns_twice was checking all call statements rather than just the last statement. Since ca

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: I was able to find the summary info: Tests that now fail, but worked before (15 tests): libgomp: libgomp.fortran/simd7.f90   -O0  execution test libgomp: libgomp.fortran/task2.f90   -O0  execution test libgomp: libgomp.fortran/vla2.f90   -O0  ex

Re: Ping [PATCH/RFC] target, hooks: Allow a target to trap on unreachable [PR109267].

2024-05-14 Thread Iain Sandoe
> On 14 May 2024, at 14:29, Richard Biener wrote: > > On Wed, May 8, 2024 at 9:37 PM Iain Sandoe wrote: >> >> Hi Folks, >> >> I’d like to land a viable solution to this issue if possible, (it is a show- >> stopper for the aarch64-darwin development branch). > > I was looking as to how we h

RE: [r15-429 Regression] FAIL: experimental/simd/pr109261_constexpr_simd.cc -msse2 -O2 -Wno-psabi (test for excess errors) on Linux/x86_64

2024-05-14 Thread Jiang, Haochen
Hi Matthias, From my side, I get several errors like this: /export/users/haochenj/src/gcc-bisect/master/master/r15-429/bld/x86_64-linux/32/libstdc++-v3/include/experimental/bits/simd_builtin.h:131: error: could not convert 'std::experimental::parallelism_v2::__vec_shuffle<__vector(4) wchar_t, _

Re: [PATCH] Fix PR c++/105760: ICE in build_deduction_guide for invalid template

2024-05-14 Thread Simon Martin
On 6 May 2024, at 18:28, Jason Merrill wrote: On 5/6/24 09:20, Simon Martin wrote: Hi, We currently ICE upon the following invalid snippet because we fail to properly handle tsubst_arg_types returning error_mark_node in build_deduction_guide. == cut == template struct A { A(Ts...); }; A a;

Re: [PATCH 1/3] expr: Export clear_by_pieces()

2024-05-14 Thread Jeff Law
On 5/7/24 11:38 PM, Christoph Müllner wrote: Make clear_by_pieces() available to other parts of the compiler, similar to store_by_pieces(). gcc/ChangeLog: * expr.cc (clear_by_pieces): Remove static from clear_by_pieces. * expr.h (clear_by_pieces): Add prototype for clear_by_p

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao
> On May 14, 2024, at 10:29, Richard Biener wrote: > > On Tue, 14 May 2024, Qing Zhao wrote: > >> >> >>> On May 14, 2024, at 09:08, Richard Biener wrote: >>> >>> On Mon, 13 May 2024, Qing Zhao wrote: >>> -Warray-bounds is an important option to enable Linux kernel to keep the ar

[commited, gcc13] ipa: Compare jump functions in ICF (PR 113907)

2024-05-14 Thread Martin Jambor
Hi, This is a manual backport of r14-9840-g1162861439fd3c from master. Manual because the bits and value range representation in jump functions have changed during the gcc 14 development cycle. In PR 113907 comment #58, Honza found a case where ICF thinks bodies of functions are equivalent but be

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Jeff Law
On 5/14/24 8:57 AM, Qing Zhao wrote: On May 13, 2024, at 20:14, Kees Cook wrote: On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: But it makes no sense to warn about: void sparx5_set (int * ptr, struct nums * sg, int index)

[PATCH 12/12] aarch64: Extend aarch64_feature_flags to 128 bits

2024-05-14 Thread Andrew Carlotti
Replace the existing typedef with a new class containing two private uint64_t members. Most of the preparatory work was carried out in previous commits. The most notable remaining changes are the addition of the get_isa_mode and with_isa_mode functions for conversion to or from aarch64_isa_mode t

[PATCH 09/12] aarch64: Assign flags to local constexpr variable

2024-05-14 Thread Andrew Carlotti
This guarantees that the constant values are actually evaluated at compile time. In previous testing, I have observed GCC failing to evaluate and inline these constant values, which exposed a separate bug in which some of the required symbols from feature_deps were missing. Richard Sandiford has

[RFC 11/12] Add explicit bool casts to .md condition users

2024-05-14 Thread Andrew Carlotti
This patch is one way to fix some issues I discovered when disallowing implicit casts to bool from aarch64_feature_flags (in a later patch). That in turn was necessary to prohibit accidental conversion of an aarch64_feature_flags value to an integer by first implicitly casting to a bool (and thus s

[PATCH 10/12] aarch64: Add aarch64_feature_flags_from_index macro

2024-05-14 Thread Andrew Carlotti
When aarch64_feature_flags grows to 128 bits, constructing a mask with a specific indexed value set will become more complicated. Extract this operation into a separate macro, and preemptively annotate the feature masks as possibly unused. gcc/ChangeLog: * config/aarch64/aarch64-opts.h

[PATCH 08/12] aarch64: Decouple feature flag option storage type

2024-05-14 Thread Andrew Carlotti
The awk scripts that process the .opt files are relatively fragile and only handle a limited set of data types correctly. The unrecognised aarch64_feature_flags type is handled as a uint64_t, which happens to be correct for now. However, that assumption will change when we extend the mask to 128

Re: [RFC][PATCH] PR tree-optimization/109071 - -Warray-bounds false positive warnings due to code duplication from jump threading

2024-05-14 Thread Qing Zhao
> On May 13, 2024, at 20:14, Kees Cook wrote: > > On Tue, May 14, 2024 at 01:38:49AM +0200, Andrew Pinski wrote: >> On Mon, May 13, 2024, 11:41 PM Kees Cook wrote: >>> But it makes no sense to warn about: >>> >>> void sparx5_set (int * ptr, struct nums * sg, int index) >>> { >>> if (index >

[PATCH 07/12] aarch64: Define aarch64_get_{asm_|}isa_flags

2024-05-14 Thread Andrew Carlotti
Building an aarch64_feature_flags value from data within a gcc_options or cl_target_option struct will get more complicated in a later commit. Use a macro to avoid doing this manually in more than one location. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc (aarch64_hand

[PATCH 06/12] aarch64: Introduce aarch64_isa_mode type

2024-05-14 Thread Andrew Carlotti
Currently there are many places where an aarch64_feature_flags variable is used, but only the bottom three isa mode bits are set and read. Using a separate data type for these values makes it clearer that they're not expected or required to have any of their upper feature bits set. It will also

[PATCH 05/12] aarch64: Eliminate a temporary variable.

2024-05-14 Thread Andrew Carlotti
The name would become misleading in a later commit anyway, and I think this is marginally more readable. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_override_options): Remove temporary variable. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc i

[PATCH 03/12] aarch64: Don't use 0 for aarch64_feature_flags

2024-05-14 Thread Andrew Carlotti
Replace all uses of 0 for aarch64_feature_flags variable initialisation with the (almost) new macro AARCH64_NO_FEATURES. This is needed because a later commit will disallow casts to aarch64_feature_flags from integer types. gcc/ChangeLog: * common/config/aarch64/aarch64-common.cc

[PATCH 04/12] aarch64: Don't compare aarch64_feature_flags to 0.

2024-05-14 Thread Andrew Carlotti
A later commit will disallow such comparisons. We can instead convert directly to a boolean value, and make sure all such conversions are explicit. TODO: FIX SYSREG GATING. gcc/ChangeLog: * config/aarch64/aarch64-sve-builtins.cc (check_required_extensions): Replace comparison wi

[PATCH 01/12] aarch64: Remove unused global aarch64_tune_flags

2024-05-14 Thread Andrew Carlotti
gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_tune_flags): Remove unused global variable. (aarch64_override_options_internal): Remove dead assignment. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 662ff5a9b0c715d0cab0ae4ba63af1b3c

[PATCH 02/12] aarch64: Move AARCH64_NUM_ISA_MODES definition

2024-05-14 Thread Andrew Carlotti
AARCH64_NUM_ISA_MODES will be used within aarch64-opts.h in a later commit. gcc/ChangeLog: * config/aarch64/aarch64.h (DEF_AARCH64_ISA_MODE): Move to... * config/aarch64/aarch64-opts.h (DEF_AARCH64_ISA_MODE): ...here. diff --git a/gcc/config/aarch64/aarch64-opts.h b/gcc/config/

[PATCH 00/12] aarch64: Extend aarch64_feature_flags to 128 bits

2024-05-14 Thread Andrew Carlotti
The end goal of the series is to change the definition of aarch64_feature_flags from a uint64_t typedef to a class with 128 bits of storage. This class uses operator overloading to mimic the existing integer interface as much as possible, but with added restrictions to facilitate type checking and e

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Jeff Law
On 5/14/24 8:51 AM, Patrick O'Neill wrote: On 5/13/24 20:36, Jeff Law wrote: On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the cons

[PATCH][v2] tree-optimization/99954 - redo loop distribution memcpy recognition fix

2024-05-14 Thread Richard Biener
The following revisits the fix for PR99954 which was observed as causing missed memcpy recognition and instead using memmove for non-aliasing copies. While the original fix mitigated bogus recognition of memcpy the root cause was not properly identified. The root cause is dr_analyze_indices "faili

[to-be-committed][RISC-V] Remove redundant AND in shift-add sequence

2024-05-14 Thread Jeff Law
So this patch allows us to eliminate a redundant AND in some shift-add style sequences. I think the testcase was reduced from xz by the RAU team, but I'm not highly confident of that. Specifically the AND is masking off the upper 32 bits of the un-shifted value and there's an outer SIGN_EXT

Re: [PATCH v2 2/2] RISC-V: avoid LUI based const mat in prologue/epilogue expansion [PR/105733]

2024-05-14 Thread Patrick O'Neill
On 5/13/24 20:36, Jeff Law wrote: On 5/13/24 6:54 PM, Patrick O'Neill wrote: On 5/13/24 13:28, Jeff Law wrote: On 5/13/24 12:49 PM, Vineet Gupta wrote: If the constant used for stack offset can be expressed as sum of two S12 values, the constant need not be materialized (in a reg) and
