Re: [pushed] c++: reduce unnecessary tree_common

2024-11-11 Thread Richard Biener
On Mon, Nov 11, 2024 at 5:33 PM Jason Merrill wrote: > > Tested x86_64-pc-linux-gnu, applying to trunk. Don't you need to adjust cp_common_init_ts () as well for this? Oddly enough PTRMEM_CST is already TS_TYPED there and others are not adjusted at all. Richard. > -- 8< -- > > Lewis' r15-5067

RE: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather

2024-11-11 Thread Tamar Christina
> -----Original Message----- > From: Richard Biener > Sent: Monday, November 11, 2024 10:13 AM > To: gcc-patches@gcc.gnu.org > Cc: RISC-V CI ; Tamar Christina > > Subject: [PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs > VMAT_ELEMENTWISE when considering gather > > The following trea

[PATCH v5] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Martin Uecker
Added tests with some non-NPC pointers converted to bool. (BTW: For some reason we allowed 0 == nullptr but not x ? 0 : nullptr in ISO C.) Bootstrapped and regression tested on x86_64. commit 5a29c43cca6fa5f50ad8266c5969a9420ef2488e Author: Martin Uecker Date: Sat Nov 9 10:48:52 2024 +0100

[PATCH] Fix incorrect subreg mode check [PR117476]

2024-11-11 Thread Alexey Merzlyakov
gcc/ChangeLog: * simplify-rtx.cc (simplify_context::simplify_unary_operation_1): Fix subreg mode check during zero_extend(not) -> xor optimization. gcc/testsuite/ChangeLog: * gcc.dg/pr117476.c: New test. Signed-off-by: Alexey Merzlyakov --- gcc/simplify-rtx.cc

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-11 Thread Michael Meissner
On Fri, Nov 08, 2024 at 05:12:13PM -0600, Segher Boessenkool wrote: > On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: > > On 11/8/24 1:44 PM, Michael Meissner wrote: > > > diff --git a/gcc/config/rs6000/rs6000-arch.def > > > b/gcc/config/rs6000/rs6000-arch.def > > > new file mode 10

Re: [PATCH V2 1/11] Add rs6000 architecture masks.

2024-11-11 Thread Michael Meissner
On Fri, Nov 08, 2024 at 02:28:11PM -0600, Peter Bergner wrote: > On 11/8/24 1:44 PM, Michael Meissner wrote: > > diff --git a/gcc/config/rs6000/rs6000-arch.def > > b/gcc/config/rs6000/rs6000-arch.def > > new file mode 100644 > > index 000..e5b6e958133 > > --- /dev/null > > +++ b/gcc/config

Re: [PATCH v2] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-11-11 Thread Eikansh Gupta
Hi, > It seems to me this ought to work when the min/max reversed as well, > or >am I missing something? Yes, it should work when min/max are reversed. Regards, Eikansh From: Jeff Law Sent: Tuesday, November 12, 2024 12:55 AM To: Eikansh Gupta (QUI

Re: [PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-11-11 Thread Soumya AR
Thanks, committed: e232dc3bb5c3e8f8a3749239135b7b859a204fc7 Best, Soumya > On 7 Nov 2024, at 3:32 AM, Jeff Law wrote: > > External email: Use caution opening links or attachments > > > On 11/6/24 1:12 AM, Soumya AR wrote: >> >> >>> On 29 Oct 2024, at 6:59 PM, Richard Biener wrote: >>> >>>

Re: [committed] contrib: Add 2 further ignored commits

2024-11-11 Thread Alexandre Oliva
On Nov 10, 2024, Jakub Jelinek wrote: > On Sun, Nov 10, 2024 at 01:30:06PM -0300, Alexandre Oliva wrote: >> I'm surprised the commit-time checker didn't catch them. > I'm surprised too, but don't want to try to push further broken commits just > to double check that. ;) You can leave that test

Re: [PATCH 2/2] Add X86_TUNE_AVX512_TWO_EPILOGUES, enable for Zen4 and Zen5

2024-11-11 Thread Hongtao Liu
On Mon, Nov 11, 2024 at 8:20 PM Richard Biener wrote: > > The following adds X86_TUNE_AVX512_TWO_EPILOGUES tuning and directs the > vectorizer to produce both a vector AVX2 and SSE epilogue for AVX512 > vectorized loops when set. The tuning is enabled by default for Zen4 > and Zen5 where I benchm

Re: [PATCH v4] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Joseph Myers
On Mon, 11 Nov 2024, Martin Uecker wrote: > The fourth version also adds a documentation change. For now, > I simply decided to remove the comment about C++11 because it > seems not that useful anymore with nullptr_t now fully > established in C++. I added the requested tests at the end  > (I h

Re: [PATCH] c: Handle C23 floating constant {d, D}{32, 64, 128} suffixes like {df,dd,dl}

2024-11-11 Thread Joseph Myers
On Mon, 11 Nov 2024, Jakub Jelinek wrote: > Hi! > > C23 roughly says that {d,D}{32,64,128} floating point constant suffixes > are alternate spellings of {df,dd,dl} suffixes in annex H. > > So, the following patch allows that alternate spelling. This is OK. > Or is it intentional it isn't enabl

Re: Fwd: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-11 Thread Joseph Myers
On Mon, 11 Nov 2024, Alfie Richards wrote: > I see this code is very unclear in hindsight. The logic of this code > relies on that FMV functions are only allowed at file scope. > This should have a DECL_FILE_SCOPE_P check to avoid some of the ridiculous > cases you mentioned. If you have an actua

Re: [PATCH] c: Implement C2Y N3298 - Introduce complex literals [PR117029]

2024-11-11 Thread Joseph Myers
On Mon, 11 Nov 2024, Jakub Jelinek wrote: > Hi! > > The following patch implements the C2Y N3298 paper Introduce complex literals > by providing different (or no) diagnostics on imaginary constants (except > for integer ones). > For _DecimalN constants we don't support _Complex _DecimalN and erro

Re: [committed] contrib: Add 2 further ignored commits

2024-11-11 Thread Joseph Myers
On Sun, 10 Nov 2024, Jakub Jelinek wrote: > On Sun, Nov 10, 2024 at 01:30:06PM -0300, Alexandre Oliva wrote: > > On Nov 9, 2024, Jakub Jelinek wrote: > > > > > r15-4998 and r15-5004 had wrong commit message, add those to > > > ignored commits. > > > > Ugh, sorry and thanks. > > Was that .c vs

[pushed] c++: rename -fmodules-ts to -fmodules

2024-11-11 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- The C++ modules support is not targeting the Modules TS, so it doesn't make much sense to refer to the TS in the option name. But keep the old spelling as an undocumented alias for now. gcc/ChangeLog: * doc/invoke.texi: Rename -fm

[pushed] c++: include libcody in TAGS

2024-11-11 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- The C++ front-end uses symbols from these directories, so they should also be in TAGS. gcc/cp/ChangeLog: * Make-lang.in: Also collect tags from libcody and c++tools. --- gcc/cp/Make-lang.in | 5 +++-- 1 file changed, 3 insertions(

Re: [RFC/RFA] [PATCH v7 08/12] Add a new pass for naive CRC loops detection.

2024-11-11 Thread Jeff Law
On 11/9/24 12:44 PM, Mariam Arutunian wrote: This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a potential CRC, the pass prints an informational message.

[pushed] opts: fix narrowing warning

2024-11-11 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying as obvious. -- 8< -- The init-list initialization of cl_deferred_option p had a couple of narrowing warnings: first of opt_index from int to size_t and then of value from HOST_WIDE_INT to int. Fixed by making the types more consistent. gcc/ChangeLog:

Re: [RFC/RFA][PATCH v6 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-11-11 Thread Jeff Law
On 11/9/24 11:39 AM, Mariam Arutunian wrote: If the target is ZBC or ZBKC, it uses clmul instruction for the CRC calculation.  Otherwise, if the target is ZBKB, generates table-based CRC, but for reversing inputs and the output uses bswap and brev8 instructions.  Add new tests to check CRC gen

Re: [PATCH v3] [GCCJIT] support dynamic alloca stub

2024-11-11 Thread Antoni Boucher
Hi and thanks for the patch. I would rather avoid having to hard-code the types of built-in functions, especially since we can already access them via the function gcc_jit_context_get_target_builtin_function that is available in this patch: https://gcc.gnu.org/pipermail/jit/2023q4/001725.html

Re: [RFC/RFA] [PATCH v7 01/12] Implement internal functions for efficient CRC computation.

2024-11-11 Thread Jeff Law
On 11/11/24 1:30 PM, Jeff Law wrote: + +void +emit_crc (machine_mode crc_mode, rtx* crc, rtx* op0) +{ +  if (word_mode != crc_mode) +    { +  rtx tgt = simplify_gen_subreg (word_mode, *op0, crc_mode, 0); Can CRC_MODE ever be wider than WORD_MODE? If so, then that last argument needs adj

Re: [RFC/RFA] [PATCH v7 01/12] Implement internal functions for efficient CRC computation.

2024-11-11 Thread Jeff Law
On 11/9/24 12:43 PM, Mariam Arutunian wrote: Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based
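
For readers unfamiliar with the two CRC forms mentioned here, the following is a minimal sketch of naive bit-forward and bit-reversed CRC loops in C. It is not taken from the patch series: the function names are hypothetical, and the polynomials (0x1021 and the reflected 0xA001) are simply the well-known CRC-16/XMODEM and CRC-16/ARC parameters, chosen as an assumption for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example of a bit-forward (MSB-first) CRC loop:
   CRC-16/XMODEM, polynomial 0x1021, initial value 0.  */
static uint16_t crc16_forward(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; ++i) {
        crc ^= (uint16_t)(data[i] << 8);
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 0x8000)
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;
}

/* Hypothetical example of a bit-reversed (LSB-first) CRC loop:
   CRC-16/ARC, reflected polynomial 0xA001, initial value 0.  */
static uint16_t crc16_reversed(const uint8_t *data, size_t len)
{
    uint16_t crc = 0;
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            if (crc & 1)
                crc = (uint16_t)((crc >> 1) ^ 0xA001);
            else
                crc = (uint16_t)(crc >> 1);
        }
    }
    return crc;
}
```

Both loops process one bit per iteration, which is the "polynomial long division" shape that a table-based or clmul-based expansion is meant to replace.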

Re: [PATCH v4] MATCH: Simplify `a rrotate (32-b) -> a lrotate b` [PR109906]

2024-11-11 Thread Jeff Law
On 11/11/24 4:36 AM, Eikansh Gupta wrote: The pattern `a rrotate (32-b)` should be optimized to `a lrotate b`. The same is also true for `a lrotate (32-b)`. It can be optimized to `a rrotate b`. This patch adds following patterns: a rrotate (32-b) -> a lrotate b a lrotate (32-b) -> a rrotate

Re: [PATCH] c++: Relax checking assert about elision to support -fno-elide-constructors [PR114619]

2024-11-11 Thread Simon Martin
Hi, On 30 Oct 2024, at 11:46, Simon Martin wrote: > On 19 Oct 2024, at 11:09, Simon Martin wrote: > >> We currently ICE in checking mode with cxx_dialect < 17 on the >> following >> valid code >> >> === cut here === >> struct X { >> X(const X&) {} >> }; >> extern X x; >> void foo () { >> new

Re: [PATCH v8] c++: Fix overeager Woverloaded-virtual with conversion operators [PR109918]

2024-11-11 Thread Simon Martin
Hi, On 30 Oct 2024, at 11:44, Simon Martin wrote: > Friendly ping. Friendly ping. Thanks! Simon > > On 16 Oct 2024, at 17:43, Simon Martin wrote: > >> Hi Jason, >> >> On 12 Oct 2024, at 4:51, Jason Merrill wrote: >> >>> On 10/11/24 7:02 AM, Simon Martin wrote: Hi Jason, On 11 Oct

[PATCH v4] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Martin Uecker
The fourth version also adds a documentation change. For now, I simply decided to remove the comment about C++11 because it seems not that useful anymore with nullptr_t now fully established in C++. I added the requested tests at the end  (I hope I understood your comment correctly). Bootst

Re: [PATCH] RISC-V: Load VLS perm indices directly from memory.

2024-11-11 Thread Jeff Law
On 11/11/24 7:12 AM, Robin Dapp wrote: Hi, instead of loading the permutation indices and using vmslt in order to determine which elements belong to which source vector we can compute the proper mask at compile time. That way we can emit vlm instead of vle + vmslt. Regtested on rv64gcv. Re

Re: [PATCH v2] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-11-11 Thread Jeff Law
On 11/11/24 4:55 AM, Eikansh Gupta wrote: This patch simplifies `min(a,b) op max(a,b)` to `a op b`. This optimization will work for all the binary commutative operations. So, the `op` here can be one of {plus, mult, bit_and, bit_xor, bit_ior, eq, ne, min, max}. PR tree-optimization/1098

[to-be-committed][RISC-V] Drop undesirable two instruction macc alternatives

2024-11-11 Thread Jeff Law
So I was looking at sub_dct a little while ago and was surprised to see us emit two instructions out of a single pattern. We generally try to avoid that -- it's not always possible, but as a general rule of thumb it should be avoided. Specifically I saw: vmv1r.v v4,v2 # 138 [c=4

Re: [PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-11 Thread Sam James
Jakub Jelinek writes: > On Mon, Nov 11, 2024 at 06:47:43PM +, Sam James wrote: >> > Bootstrapped/regtested successfully on x86_64-linux and i686-linux. >> >> Maybe tag PR110137 given it's very related (and of interest to people >> CC'd on the bug). > > It is maybe related, but it is a distin

Re: [PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-11 Thread Jakub Jelinek
On Mon, Nov 11, 2024 at 06:47:43PM +, Sam James wrote: > > Bootstrapped/regtested successfully on x86_64-linux and i686-linux. > > Maybe tag PR110137 given it's very related (and of interest to people > CC'd on the bug). It is maybe related, but it is a distinct enhancement, so I've committed

Re: [PATCH] RISC-V: testsuite: Remove deprecated compatibility headers

2024-11-11 Thread Jeff Law
On 11/11/24 11:16 AM, Edwin Lu wrote: Since r15-4981-g5c34f02ba7e these tests have been failing on vector targets with excess errors due to the new deprecation warning message. Remove the header. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-10.C: Remove cstdalign header.

Re: [PATCH 2/2] Add X86_TUNE_AVX512_TWO_EPILOGUES, enable for Zen4 and Zen5

2024-11-11 Thread Richard Biener
> On 11.11.2024 at 18:09, Jan Hubicka wrote: > >> >> The following adds X86_TUNE_AVX512_TWO_EPILOGUES tuning and directs the >> vectorizer to produce both a vector AVX2 and SSE epilogue for AVX512 >> vectorized loops when set. The tuning is enabled by default for Zen4 >> and Zen5 where

[committed] libstdc++: Fix typos in iterator increment for std::text_encoding [PR117520]

2024-11-11 Thread Jonathan Wakely
The intended behaviour for std::text_encoding::aliases_view's iterator is that incrementing or decrementing it too far sets it to a value-initialized state, or fails an assertion when assertions are enabled. There were typos that used == instead of = which meant that instead of becoming singular or abor

[committed] libstdc++: Improve exception messages in conversion classes

2024-11-11 Thread Jonathan Wakely
The std::logic_error exceptions thrown from misuses of std::wbuffer_convert and std::wstring_convert should use names qualified with "std::". libstdc++-v3/ChangeLog: * include/bits/locale_conv.h (wstring_convert, wbuffer_convert): Adjust strings passed to exception constructors. -

[committed] libstdc++: Add parentheses around operand of |

2024-11-11 Thread Jonathan Wakely
libstdc++-v3/ChangeLog: * include/bits/unicode.h (_Utf_iterator::_M_read_utf16): Add parentheses. --- Tested x86_64-linux. Pushed to trunk. Will backport to gcc-14 too. libstdc++-v3/include/bits/unicode.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libstdc

[committed] testsuite: Require atomic operations for c2y-if-decls-*

2024-11-11 Thread Dimitar Dimitrov
Since some of the c2y-if-decls tests use _Atomic, add a requirement for target to support atomic operations on int and long types. This fixes spurious test link failures on pru-unknown-elf, which lacks atomic ops. The tests still pass on x86_64-linux-gnu. Pushed to trunk as obvious. gcc/testsui

Re: [PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-11 Thread Sam James
Jakub Jelinek writes: > On Fri, Nov 08, 2024 at 06:40:16PM +0100, Jakub Jelinek wrote: >> clang++ adds __builtin_operator_{new,delete} builtins which as documented >> work similarly to ::operator {new,delete}, except that it is an error >> if the called ::operator {new,delete} is not a replaceabl

Re: [PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-11 Thread Jason Merrill
On 11/8/24 12:40 PM, Jakub Jelinek wrote: Hi! clang++ adds __builtin_operator_{new,delete} builtins which as documented work similarly to ::operator {new,delete}, except that it is an error if the called ::operator {new,delete} is not a replaceable global operator and allow optimizations which C

Re: [PATCH v2] c++: Fix another crash with invalid new operators [PR117463]

2024-11-11 Thread Jason Merrill
On 11/11/24 9:23 AM, Simon Martin wrote: Hi Jason, On 6 Nov 2024, at 20:47, Jason Merrill wrote: On 11/6/24 2:23 PM, Simon Martin wrote: Even though this PR is very close to PR117101, it's not addressed by the fix I made through r15-4958-g5821f5c8c89a05 because cxx_placement_new_fn has the ve

Re: [PATCH] testsuite: arm: fast-math-complex-add-half-float.c test should not xfail

2024-11-11 Thread Torbjorn SVENSSON
On 2024-11-11 13:43, Richard Biener wrote: On Sun, Nov 10, 2024 at 2:55 PM Torbjörn SVENSSON wrote: Ok for trunk? OK Thanks Richard. Pushed as r15-5104-ga2467372e72. Kind regards, Torbjörn -- With the change in 15-3128-gde1923f9f4d, this test case no longer xfail. gcc/testsuite/

[PATCH] RISC-V: testsuite: Remove deprecated compatibility headers

2024-11-11 Thread Edwin Lu
Since r15-4981-g5c34f02ba7e these tests have been failing on vector targets with excess errors due to the new deprecation warning message. Remove the header. gcc/testsuite/ChangeLog: * g++.target/riscv/rvv/base/bug-10.C: Remove cstdalign header. * g++.target/riscv/rvv/base/bug-11

Re: [PATCH v3] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Joseph Myers
On Mon, 11 Nov 2024, Martin Uecker wrote: > This patch enables the Wzero-as-null-pointer-constant for C. > The third version adds more tests. The various tests for boolean operations should also test those for pointers that are *not* null pointers (to verify that the implicit 0 being compared

Re: [PATCH v2 1/4] aarch64: return scalar fp8 values in fp registers

2024-11-11 Thread Richard Sandiford
Claudio Bantaloukas writes: > According to the aapcs64: If the argument is an 8-bit (...) precision > Floating-point or short vector type and the NSRN is less than 8, then the > argument is allocated to the least significant bits of register v[NSRN]. > > gcc/ > * config/aarch64/aarch64.cc >

[PATCH v3] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Martin Uecker
This patch enables the Wzero-as-null-pointer-constant for C. The third version adds more tests. Bootstrapped and regression tested on x86_64. commit 3afa3065be59374389daebfb32490fb93ad63d88 Author: Martin Uecker Date: Sat Nov 9 10:48:52 2024 +0100 c: add Wzero-as-null-pointer-constan

Re: [PATCH] i386: Add -mveclibabi=aocl [PR56504]

2024-11-11 Thread Jan Hubicka
> We currently support generating vectorized math calls to the AMD core > math library (ACML) (-mveclibabi=acml). That library is end-of-life and > its successor is the math library from AMD Optimizing CPU Libraries > (AOCL). > > This patch adds support for AOCL (-mveclibabi=aocl). That signific

Re: [PATCH, OpenACC 2.7] Connect readonly modifier to points-to analysis

2024-11-11 Thread Chung-Lin Tang
On 2024/5/16 8:36 PM, Richard Biener wrote: >> After omp-expand (before SSA): >> >> __attribute__((oacc parallel, omp target entrypoint, noclone)) >> void main._omp_fn.1 (const struct .omp_data_t.3 & restrict .omp_data_i) >> { >> ... >>: >> D.2962 = .omp_data_i->D.2947; >> a.8 = D.2962; >

Re: [PATCHv2 0/3] ada: Add GNU/Hurd x86_64 support

2024-11-11 Thread Samuel Thibault
Hello, Marc Poulhiès wrote on Mon, 04 Nov 2024 16:28:43 +0100: > Samuel Thibault writes: > > > I reworked the patch to factorize the bsd signal definitions. > > I have split off the system definitions because the priority range of > > GNU/Mach has diverged from the original BSD kernels. >

Re: [PATCH 2/2] Add X86_TUNE_AVX512_TWO_EPILOGUES, enable for Zen4 and Zen5

2024-11-11 Thread Jan Hubicka
> The following adds X86_TUNE_AVX512_TWO_EPILOGUES tuning and directs the > vectorizer to produce both a vector AVX2 and SSE epilogue for AVX512 > vectorized loops when set. The tuning is enabled by default for Zen4 > and Zen5 where I benchmarked it to be overall positive on SPEC CPU 2017 both > i

Re: [PATCH v2 0/4] aarch64: add minimal support of AEABI build attributes for GCS

2024-11-11 Thread Matthieu Longo
On 2024-10-23 17:43, Richard Sandiford wrote: Matthieu Longo writes: The primary focus of this patch series is to add support for build attributes in the context of GCS (Guarded Control Stack, an Armv9.4-a extension) to the AArch64 backend. It addresses comments from revision 1 [2] and 2 [3],

[pushed] c++: reduce unnecessary tree_common

2024-11-11 Thread Jason Merrill
Tested x86_64-pc-linux-gnu, applying to trunk. -- 8< -- Lewis' r15-5067 fixing the marking of TRAIT_EXPR led me to compare some other front-end type definitions to their marking in cp_common_init_ts; it seems we can change tree_common to something smaller in several cases, to match how they are m

Re: [PATCH v2 3/4] aarch64: specify fpm mode in function instances and groups

2024-11-11 Thread Richard Sandiford
Claudio Bantaloukas writes: > diff --git a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def > b/gcc/config/aarch64/aarch64-sve-builtins-sve2.def > index e4021559f36..8d25bb33dad 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins-sve2.def > +++ b/gcc/config/aarch64/aarch64-sve-builtins-sve2.de

Re: [PATCH v2 2/4] aarch64: Add basic svmfloat8_t support to arm_sve.h

2024-11-11 Thread Richard Sandiford
Claudio Bantaloukas writes: > [...] > @@ -231,12 +231,12 @@ CONSTEXPR const group_suffix_info group_suffixes[] = { > #define TYPES_all_arith(S, D) \ >TYPES_all_float (S, D), TYPES_all_integer (S, D) > > -/* _bf16 > +/* _mf8 _bf16 > _f16 _f32 _f64 > _s8 _s16 _s32 _s64 > _u

[PATCH] c: Handle C23 floating constant {d,D}{32,64,128} suffixes like {df,dd,dl}

2024-11-11 Thread Jakub Jelinek
Hi! C23 roughly says that {d,D}{32,64,128} floating point constant suffixes are alternate spellings of {df,dd,dl} suffixes in annex H. So, the following patch allows that alternate spelling. Or is it intentional it isn't enabled and we need to do everything in there first before trying to define

Re: [PATCH v2] c: add Wzero-as-null-pointer-constant [PR117059]

2024-11-11 Thread Joseph Myers
Other cases to test: the != operator (should warn in same cases as ==); boolean uses of pointers such as if (p), !p, p?x:y, and implicit or explicit conversion to bool (none of those boolean uses of pointers should warn since there's no explicit integer null pointer constant involved in the imp
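
The cases listed in this review can be pictured with a small C sketch. This is an illustration only, not one of the patch's test files; the function names are hypothetical, and the comments about which lines warn restate the expectations in the message above rather than verified compiler output.

```c
#include <stdbool.h>
#include <stddef.h>

/* Literal 0 compared against a pointer: per the thread, this is the
   kind of use the option is expected to diagnose.  */
static bool equals_literal_zero(const int *p)
{
    return p == 0;
}

/* Same comparison with !=, which the review says should warn in the
   same cases as ==.  */
static bool differs_from_literal_zero(const int *p)
{
    return p != 0;
}

/* Boolean uses of a pointer: no explicit integer null pointer constant
   appears in the source, so per the review none of these should warn.  */
static bool is_null_boolean(const int *p)
{
    return !p;
}

static int select_by_pointer(const int *p, int x, int y)
{
    return p ? x : y;
}

static bool converts_to_bool(const int *p)
{
    return (bool)p;
}
```

All five functions behave identically at run time whatever the warning setting; the option only affects diagnostics.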

[PATCH] SVE intrinsics: Fold svmul and svdiv by -1 to svneg for unsigned types

2024-11-11 Thread Jennifer Schmitz
As follow-up to https://gcc.gnu.org/pipermail/gcc-patches/2024-October/665472.html, this patch implements folding of svmul and svdiv by -1 to svneg for unsigned SVE vector types. The key idea is to reuse the existing code that does this fold for signed types and feed it as callback to a helper func
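
The multiplication half of this fold rests on ordinary modular arithmetic, which a scalar C sketch can show. The code below is an assumption-laden illustration, not the SVE implementation: it uses a single `uint32_t` as a stand-in for one unsigned vector lane, the function names are hypothetical, and it covers only multiplication.

```c
#include <stdint.h>

/* Hypothetical scalar stand-in for one unsigned lane multiplied by -1.
   Converting -1 to uint32_t yields 0xFFFFFFFF.  */
static uint32_t mul_by_minus_one(uint32_t x)
{
    return x * (uint32_t)-1;
}

/* Unsigned negation, i.e. 0 - x modulo 2^32.  */
static uint32_t negate_unsigned(uint32_t x)
{
    return 0u - x;
}
```

Because unsigned arithmetic wraps modulo 2^32, both functions return the same value for every input, which is why a multiply by -1 can be replaced by a negation.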

[Patch] OpenMP: 'interop' construct - add C parser support, improve Fortran pasing

2024-11-11 Thread Tobias Burnus
Background: omp interop device(1) init(prefer_type("cuda"), targetsync: obj) depend(inout: x) nowait … omp interop destroy(obj) initializes the omp_interop_t / integer(omp_interop_kind) variable for device '1' and (thanks to 'targetsync') creates a stream object. 'obj' can then be used

Re: [PATCH 1/3] aarch64: Add support for fp8 convert and scale

2024-11-11 Thread Richard Sandiford
writes: > The AArch64 FEAT_FP8 extension introduces instructions for conversion > and scaling. > > This patch introduces the following intrinsics: > 1. vcvt{1|2}_{bf16|high_bf16|low_bf16}_mf8_fpm. > 2. vcvt{q}_mf8_f16_fpm. > 3. vcvt_{high}_mf8_f32_fpm. > 4. vscale{q}_{f16|f32|f64}. > > We introduc

[PATCH] i386: Add -mveclibabi=aocl [PR56504]

2024-11-11 Thread Filip Kastl
Hi, Bootstrapped and regtested on x86_64 linux. I also tested that all the new calls can be linked with the AOCL LibM library. Ok to push? Thanks, Filip Kastl -- 8< -- We currently support generating vectorized math calls to the AMD core math library (ACML) (-mveclibabi=acml). That library

Re: [PATCH v2 3/4] aarch64: Add SVE support for simd clones [PR 96342]

2024-11-11 Thread Richard Sandiford
Victor Do Nascimento writes: > This patch finalizes adding support for the generation of SVE simd clones when > no simdlen is provided, following the ABI rules where the widest data type > determines the minimum amount of elements in a length agnostic vector. > > gcc/ChangeLog: > > * config/

[PATCH v2] c++: Fix another crash with invalid new operators [PR117463]

2024-11-11 Thread Simon Martin
Hi Jason, On 6 Nov 2024, at 20:47, Jason Merrill wrote: > On 11/6/24 2:23 PM, Simon Martin wrote: >> Even though this PR is very close to PR117101, it's not addressed by >> the >> fix I made through r15-4958-g5821f5c8c89a05 because >> cxx_placement_new_fn >> has the very same issue as std_place

[PATCH] RISC-V: Load VLS perm indices directly from memory.

2024-11-11 Thread Robin Dapp
Hi, instead of loading the permutation indices and using vmslt in order to determine which elements belong to which source vector we can compute the proper mask at compile time. That way we can emit vlm instead of vle + vmslt. Regtested on rv64gcv. Regards Robin gcc/ChangeLog: * conf
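
The idea of computing the mask ahead of time can be sketched in portable C. This is not the RISC-V backend code: the function name is hypothetical, the 64-lane limit is an arbitrary assumption, and the mask polarity (bit set for the first source) is a guess, since the truncated message does not show which one the patch uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Build a lane mask for a two-source permute with nelts lanes
   (nelts <= 64 is assumed).  Bit i is set when selector sel[i] refers
   to the first source vector, that is when sel[i] < nelts.  With a
   constant selector this can be evaluated once instead of comparing
   the indices at run time.  */
static uint64_t first_source_mask(const unsigned *sel, size_t nelts)
{
    uint64_t mask = 0;
    for (size_t i = 0; i < nelts; ++i)
        if (sel[i] < nelts)
            mask |= (uint64_t)1 << i;
    return mask;
}
```

For the selector {0, 5, 2, 7} over four lanes, lanes 0 and 2 come from the first source, so the mask is 0x5.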

[Patch] libgomp.c-c++-common/pr109062.c: Fix expected spin count for hybrid x86

2024-11-11 Thread Tobias Burnus
I intend to commit the attached patch later today as obvious. It is an older issue I never properly investigated before but finally want to fix it. The issue is that on systems (like my laptop) that have Intel's E and P cores (hybrid x86); testing by Intel has shown that spincount=1 is actually

Fwd: [PATCH v3] C: Support Function multiversionsing in the C front end

2024-11-11 Thread Alfie Richards
Adding missing CC's Forwarded Message Subject: Re: [PATCH v3] C: Support Function multiversionsing in the C front end Date: Mon, 11 Nov 2024 12:29:57 + From: Alfie Richards To: Joseph Myers Hi Joseph, Thank you for the detailed feedback. I am quite junior a

Re: [PATCH 07/10] aarch64: Add testcase for C/C++ ops on SVE ACLE types.

2024-11-11 Thread Richard Sandiford
Tejas Belagod writes: > On 11/7/24 4:52 PM, Richard Sandiford wrote: >> Tejas Belagod writes: >>> This patch adds a test case to cover C/C++ operators on SVE ACLE types. >>> This >>> does not cover all types, but covers most representative types. >>> >>> gcc/testsuite: >>> >>> * gcc.target/

Re: [PATCH] AArch64: Cleanup fusion defines

2024-11-11 Thread Richard Sandiford
Wilco Dijkstra writes: > Cleanup the fusion defines by introducing AARCH64_FUSE_BASE as a common base > level of fusion supported by almost all cores. Add AARCH64_FUSE_MOVK as a > shortcut for all MOVK fusion. In most cases there is no change. It enables > AARCH64_FUSE_CMP_BRANCH for a few olde

Re: [PATCH] AArch64: Remove duplicated addr_cost tables

2024-11-11 Thread Richard Sandiford
Wilco Dijkstra writes: > Remove duplicated addr_cost tables - use generic_armv9_a_addrcost_table for > Armv9-a cores and generic_armv8_a_addrcost_table for recent Armv8-a cores. > No changes in generated code. > > OK for commit? > > gcc/ChangeLog: > > * config/aarch64/tuning_models/cortexx92

[PATCH] c: Implement C2Y N3298 - Introduce complex literals [PR117029]

2024-11-11 Thread Jakub Jelinek
Hi! The following patch implements the C2Y N3298 paper Introduce complex literals by providing different (or no) diagnostics on imaginary constants (except for integer ones). For _DecimalN constants we don't support _Complex _DecimalN and error on any i/j suffixes mixed with DD/DL/DF, so nothing c

[PATCH][v2] tree-optimization/117484 - issue with SLP discovery of permuted .MASK_LOAD

2024-11-11 Thread Richard Biener
When we do SLP discovery of a .MASK_LOAD for a dataref group with gaps the discovery for the mask will have gaps as well and this was unexpected in a few places. The following re-organizes things slightly to accommodate this. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Re: [PATCH] testsuite: arm: fast-math-complex-add-half-float.c test should not xfail

2024-11-11 Thread Richard Biener
On Sun, Nov 10, 2024 at 2:55 PM Torbjörn SVENSSON wrote: > > Ok for trunk? OK > -- > > With the change in 15-3128-gde1923f9f4d, this test case no longer xfail. > > gcc/testsuite/ChangeLog: > > * gcc.dg/vect/complex/fast-math-complex-add-half-float.c: Remove > xfail from test. > >

[PATCH 2/2] Add X86_TUNE_AVX512_TWO_EPILOGUES, enable for Zen4 and Zen5

2024-11-11 Thread Richard Biener
The following adds X86_TUNE_AVX512_TWO_EPILOGUES tuning and directs the vectorizer to produce both a vector AVX2 and SSE epilogue for AVX512 vectorized loops when set. The tuning is enabled by default for Zen4 and Zen5 where I benchmarked it to be overall positive on SPEC CPU 2017 both in performa

[PATCH 1/2] Add suggested_epilogue_mode to vector costs

2024-11-11 Thread Richard Biener
The following enables targets to suggest the vector mode to be used preferably for the epilogue of a vectorized loop. The patch also enables more than one vectorized epilogue in case the target suggests a vector mode for the epilogue of a vector epilogue. Bootstrapped and tested on x86_64-unknown

[PATCH] tree-optimization/117510 - fix guard hoisting validity check

2024-11-11 Thread Richard Biener
For the loop in the testcase we currently fail to hoist the guard check of the inner loop (m > 0) out of the outer loop because find_loop_guard checks all blocks of the outer loop for side-effects, including those that are skipped by the guard. This usually is harmless as the guard does not skip a
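
The shape of the transformation can be sketched at the source level. The loop nest below is a hypothetical stand-in, not the PR's testcase, and the function names are assumptions; it only shows what hoisting an inner-loop guard such as (m > 0) out of the outer loop means.

```c
/* Before: the inner loop is implicitly guarded by (m > 0), and that
   guard is re-evaluated on every outer iteration.  */
static long sum_nested(const int *a, int n, int m)
{
    long total = 0;
    for (int i = 0; i < n; ++i)
        for (int j = 0; j < m; ++j)
            total += a[j];
    return total;
}

/* After: the guard is tested once, outside the outer loop.  This is
   valid here because the outer loop does nothing observable when the
   guard is false.  */
static long sum_nested_hoisted(const int *a, int n, int m)
{
    long total = 0;
    if (m > 0)
        for (int i = 0; i < n; ++i)
            for (int j = 0; j < m; ++j)
                total += a[j];
    return total;
}
```

The validity condition in the second comment is the part the message is concerned with: only side effects in blocks that are not skipped by the guard matter for deciding whether the hoist is safe.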

[PATCH v2] MATCH: Simplify `min(a, b) op max(a, b)` to `a op b` [PR109401]

2024-11-11 Thread Eikansh Gupta
This patch simplifies `min(a,b) op max(a,b)` to `a op b`. This optimization will work for all the binary commutative operations. So, the `op` here can be one of {plus, mult, bit_and, bit_xor, bit_ior, eq, ne, min, max}. PR tree-optimization/109878 PR 109401 gcc/ChangeLog: *
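
The identity holds because {min(a,b), max(a,b)} is always the same pair of values as {a, b}, possibly swapped, and a commutative operation does not care about operand order. A minimal C sketch, assuming unsigned 32-bit operands to keep the arithmetic well defined (the helper names are hypothetical and this is not the match.pd pattern itself):

```c
#include <stdint.h>

static uint32_t umin32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t umax32(uint32_t a, uint32_t b) { return a < b ? b : a; }

/* Returns 1 when min(a,b) op max(a,b) equals a op b for each of the
   commutative operations listed in the message.  */
static int minmax_identity_holds(uint32_t a, uint32_t b)
{
    uint32_t lo = umin32(a, b);
    uint32_t hi = umax32(a, b);

    return lo + hi == a + b              /* plus */
        && lo * hi == a * b              /* mult */
        && (lo & hi) == (a & b)          /* bit_and */
        && (lo ^ hi) == (a ^ b)          /* bit_xor */
        && (lo | hi) == (a | b)          /* bit_ior */
        && (lo == hi) == (a == b)        /* eq */
        && (lo != hi) == (a != b)        /* ne */
        && umin32(lo, hi) == umin32(a, b)   /* min */
        && umax32(lo, hi) == umax32(a, b);  /* max */
}
```

The same reasoning applies with the operands reversed, `max(a,b) op min(a,b)`, since only the order changes.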

Re: [Patch] libgomp/plugin/plugin-gcn.c: Show device number in ISA error

2024-11-11 Thread Tobias Burnus
Hi Andrew, Andrew Stubbs wrote: I think I'd prefer […] So, brackets instead of "of", and explain how to fix both possible issues. Disabling the device will also allow host-fallback to work, which might be the right thing for some end-users. Done so in commit r15-5080-g8473010807a264. Thanks

[PATCH v4] MATCH: Simplify `a rrotate (32-b) -> a lrotate b` [PR109906]

2024-11-11 Thread Eikansh Gupta
The pattern `a rrotate (32-b)` should be optimized to `a lrotate b`. The same is also true for `a lrotate (32-b)`. It can be optimized to `a rrotate b`. This patch adds following patterns: a rrotate (32-b) -> a lrotate b a lrotate (32-b) -> a rrotate b Bootstrapped and tested on x86_64-linux-gnu
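
The equivalence behind these two patterns can be checked in plain C. The sketch below is illustrative only and is not taken from the patch; the helper names are hypothetical, and it assumes a 32-bit operand with a rotate count in the range 0 < b < 32.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: rotate a 32-bit value left by b bits.  */
static uint32_t rotl32(uint32_t a, unsigned b)
{
    assert(b < 32);
    /* The "& 31" keeps the right shift in range when b == 0.  */
    return (a << b) | (a >> ((32 - b) & 31));
}

/* Hypothetical helper: rotate a 32-bit value right by b bits.  */
static uint32_t rotr32(uint32_t a, unsigned b)
{
    assert(b < 32);
    return (a >> b) | (a << ((32 - b) & 31));
}

/* Returns 1 when `a rrotate (32-b)` equals `a lrotate b` and
   `a lrotate (32-b)` equals `a rrotate b`, for 0 < b < 32.  */
static int rotate_patterns_agree(uint32_t a, unsigned b)
{
    assert(b > 0 && b < 32);
    return rotr32(a, 32 - b) == rotl32(a, b)
        && rotl32(a, 32 - b) == rotr32(a, b);
}
```

Rotating right by 32-b moves every bit to the same position as rotating left by b, so the subtraction can be dropped by flipping the rotate direction.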

Ping^2: [PATCH] doc/cpp: Document __has_include_next

2024-11-11 Thread Arsen Arsenović
Gentle ping on this patch again. TIA, have a lovely day. -- Arsen Arsenović signature.asc Description: PGP signature

Re: [PATCH] doc: mention STAGE1_CFLAGS

2024-11-11 Thread Mark Wielaard
Hi Sam, On Mon, 2024-11-11 at 07:54 +, Sam James wrote: > STAGE1_CFLAGS can be used to accelerate the just-built stage1 compiler > which especially improves its performance on some of the large generated > files during bootstrap. It defaults to nothing (i.e. -O0). > > The downside is that if

Re: [Patch] libgomp/plugin/plugin-gcn.c: Show device number in ISA error

2024-11-11 Thread Andrew Stubbs
On 11/11/2024 09:42, Tobias Burnus wrote: Currently, for GCN, only one offload ISA is supported; this might lead to errors when multiple different AMD GPUs are installed on the same system, at least when using the "wrong" device/device number. In case of the testsuite, this occurs for instance w

Re: [PATCH] s390: Add expander for uaddc/usubc optabs

2024-11-11 Thread Andreas Krebbel
Hi Stefan, thanks for the patch and sorry for the slow review. On 9/18/24 19:25, Stefan Schulze Frielinghaus wrote: Bootstrapped and regtested on s390. Both expanders are constrained to z196 because of the conditional moves. I guess this is reasonable nowadays. The reason for the limitation

[PATCH][v2] tree-optimization/117502 - VMAT_STRIDED_SLP vs VMAT_ELEMENTWISE when considering gather

2024-11-11 Thread Richard Biener
The following treats both the same when considering to use gather or scatter for single-element interleaving accesses. This will cause FAIL: gcc.target/aarch64/sve/sve_iters_low_2.c scan-tree-dump-not vect "LOOP VECTORIZED" where we now vectorize the loop with VNx4QI, I'll leave it to ARM folks

[Patch] libgomp/plugin/plugin-gcn.c: Show device number in ISA error

2024-11-11 Thread Tobias Burnus
Currently, for GCN, only one offload ISA is supported; this might lead to errors when multiple different AMD GPUs are installed on the same system, at least when using the "wrong" device/device number. In case of the testsuite, this occurs for instance with libgomp.c-c++-common/icv-9.c which iter

[PATCH][v3] Add missing SLP discovery for CFN[_MASK][_LEN]_SCATTER_STORE

2024-11-11 Thread Richard Biener
This was responsible for a bunch of SVE FAILs with --param vect-force-slp=1 Bootstrapped and tested on x86_64-unknown-linux-gnu. * tree-vect-slp.cc (arg1_arg3_map): New. (arg1_arg3_arg4_map): Likewise. (vect_get_operand_map): Handle IFN_SCATTER_STORE, IFN_MASK_SCAT

[Patch, fortran] PR116388 - [13/14/15 regression] Finalizer called on uninitialized components of intent(out) argument

2024-11-11 Thread Paul Richard Thomas
'Obvious' patch committed as r15-5078-g42a2df0b7985b2a4732ba1c29726ac7aabd5eeae. Will backport later. Thanks to Tomas Trnka for investigating and identifying the fix. Regards, Paul

Re: [Patch, fortran] PR109345 - [12/13/14/15 Regression] class(*) variable that is a string array is not handled correctly

2024-11-11 Thread Paul Richard Thomas
> > Hi Harald, > Thanks for the review! > ... except that the PR number should be corrected (109345 instead of > 109435) in the testcase and the commit message (Change.logs). > > Dyslexics of the world untie! Paul

Re: [PATCH v2] aarch64: Optimise calls to ldexp with SVE FSCALE instruction [PR111733]

2024-11-11 Thread Soumya AR
Hi Richard, > On 7 Nov 2024, at 3:19 PM, Richard Sandiford > wrote: > > Soumya AR writes: >> Changes since v1: >> >> This revision makes use of the extended definition of aarch64_ptrue_reg to >> generate predicate registers wi

Re: [PATCH v1] RISC-V: Fix one nit indent issue of ustrunc pattern [NFC]

2024-11-11 Thread 钟居哲
LGTM. juzhe.zh...@rivai.ai From: pan2.li Date: 2024-11-11 16:45 To: gcc-patches CC: juzhe.zhong; kito.cheng; jeffreyalaw; rdapp.gcc; Pan Li Subject: [PATCH v1] RISC-V: Fix one nit indent issue of ustrunc pattern [NFC] From: Pan Li Just notice the indent is not that right for ustrunc pattern

Re: [PATCH v3] i386: Zero extend 32-bit address to 64-bit with option -mx32 -maddress-mode=long. [PR 117418]

2024-11-11 Thread Uros Bizjak
On Mon, Nov 11, 2024 at 2:39 AM Hu, Lin1 wrote: > > > OK, added check for target. > > Bootstrapped and Regtested on x86-64-linux-pc-gnu, OK for trunk? > > BRs, > Lin > > -maddress-mode=long let Pmode = DI_mode, so zero extend 32-bit address to > 64-bit and uses a 64-bit register as a pointer for a

Re: [patch, Fortran] Reject UNSIGNED for COMPLEX

2024-11-11 Thread Thomas Koenig
On 10.11.24 at 21:54, Harald Anlauf wrote: Hi Thomas, the patch is basically fine. I am wondering if we should create a new helper function that is the opposite of type_check ("type_cannot_be"), so that we avoid redundant code at the source level. It may not be worth it yet, so your choice.

[PATCH v1 1/2] Revert "Match: Simplify branch form 3 of unsigned SAT_ADD into branchless"

2024-11-11 Thread pan2 . li
From: Pan Li This reverts commit df4af89bc3eabbeaccb16539aa1082cb9863e187. --- gcc/match.pd | 11 --- .../gcc.dg/tree-ssa/sat_u_add-simplify-1-u16.c| 15 --- .../gcc.dg/tree-ssa/sat_u_add-simplify-1-u32.c| 15 --- .../g

Re: [PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-11-11 Thread Soumya AR
Hi Richard, > On 7 Nov 2024, at 6:10 PM, Richard Biener wrote: > > On Tue, 5 Nov 2024, Soumya AR wrote: > >> >> >>> On 29 Oct 2024, at 7:16 PM, Richard Biener wrote:

[PATCH v1] RISC-V: Fix one nit indent issue of ustrunc pattern [NFC]

2024-11-11 Thread pan2 . li
From: Pan Li Just noticed that the indentation of the ustrunc pattern in the md files is not quite right, so this fixes it. The change is fairly obvious; I will commit it after 48 hours if there are no further comments. gcc/ChangeLog: * config/riscv/autovec.md: Fix indent format issue. Signed-off-by: Pan

[PATCH v1 2/2] Match: Refactor the unsigned SAT_ADD match pattern [NFC]

2024-11-11 Thread pan2 . li
From: Pan Li This patch refactors the unsigned SAT_ADD pattern by:
* Extracting the type check outside.
* Extracting the common sub-pattern.
* Re-arranging the related match pattern forms together.
* Removing unnecessary helper pattern matches.
The test suites below pass for this patch. * The r

Re: [PATCH] c++: Add __builtin_operator_{new,delete} support

2024-11-11 Thread Jakub Jelinek
On Fri, Nov 08, 2024 at 06:40:16PM +0100, Jakub Jelinek wrote: > clang++ adds __builtin_operator_{new,delete} builtins which as documented > work similarly to ::operator {new,delete}, except that it is an error > if the called ::operator {new,delete} is not a replaceable global operator > and allow

[PATCH] c++, dyninit, v3: Optimize C++ dynamic initialization by constants into DECL_INITIAL adjustment [PR102876]

2024-11-11 Thread Jakub Jelinek
Hi! I'd like to ping the https://gcc.gnu.org/pipermail/gcc-patches/2022-January/588539.html patch. Previous mails on this topic https://gcc.gnu.org/pipermail/gcc-patches/2021-November/thread.html#583289 https://gcc.gnu.org/pipermail/gcc-patches/2021-December/585994.html As it has been a while, the

Re: [PATCH 07/10] aarch64: Add testcase for C/C++ ops on SVE ACLE types.

2024-11-11 Thread Tejas Belagod
On 11/7/24 4:52 PM, Richard Sandiford wrote: Tejas Belagod writes: This patch adds a test case to cover C/C++ operators on SVE ACLE types. This does not cover all types, but covers most representative types. gcc/testsuite: * gcc.target/aarch64/sve/acle/general/cops.c: New test. ---