Re: libgo: fix for C23 nullptr keyword

2024-10-18 Thread Ian Lance Taylor
On Thu, Oct 17, 2024 at 1:19 PM Joseph Myers wrote: > > Making GCC default to -std=gnu23 for C code produces Go test failures > because of C code used by Go that uses a variable called nullptr, which is > a keyword in C23. > > I've submitted this fix upstream at > https://github.com/golang/go/pull

[pushed: r15-4492] diagnostics: remove forward decl of json::value from diagnostic.h

2024-10-18 Thread David Malcolm
I believe this hasn't been necessary since r15-1413-gd3878c85f331c7. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Pushed to trunk as r15-4492-g83abdb041426b7. gcc/ChangeLog: * diagnostic.h (json::value): Remove forward decl. Signed-off-by: David Malcolm --- gcc/diagno

[pushed: r15-4491] diagnostics: add debug dump functions

2024-10-18 Thread David Malcolm
This commit expands on r15-3973-g4c7a58ac2617e2, which added debug "dump" member functions to pretty_printer and output_buffer. This followup adds "dump" member functions to diagnostic_context and diagnostic_format, extends the existing dump functions and adds indentation to make it much easier to

[PATCH] phiopt: do factor_out_conditional_operation for all phis [PR112418]

2024-10-18 Thread Andrew Pinski
Sometimes factor_out_conditional_operation can factor out an operation that causes a phi node to become the same element. Other times, we want to factor out a binary operator because it can improve code generation; an example is PR 110015 (openjpeg). Note this includes a heuristic to decide if fac

[committed] c: Fix -std=gnu23 -Wtraditional for () in function definitions

2024-10-18 Thread Joseph Myers
We don't yet have clear agreement on removing -Wtraditional (although it seems there is little to no use for most of the warnings therein), so fix the bug in its interaction with -std=gnu23 to continue progress on making -std=gnu23 the default while -Wtraditional remains under discussion. The warn

[PATCH] doc/cpp: Document __has_include_next

2024-10-18 Thread Arsen Arsenović
OK for trunk? Seems to build and render fine with makeinfo --info and --html. Typesetting it, I see overfull and underfull hboxes, but I suspect these were here for a while.. -- >8 -- While hacking on an unrelated change, I noticed that __has_include_next hasn't been documented at

Re: Fortran: Add range-based diagnostic

2024-10-18 Thread Jerry D
On 10/18/24 3:35 PM, Tobias Burnus wrote: This patch was motivated by David's talk at Cauldron – and by getting rather bad locations for some diagnostics, where I wanted to use the column number to ensure that all items are found. The main problem was a missing gobbling of spaces, but still rang

Re: [PATCH v16b 3/4] gcc/: Merge definitions of array_type_nelts_top

2024-10-18 Thread Joseph Myers
On Wed, 16 Oct 2024, Alejandro Colomar wrote: > There were two identical definitions, and none of them are available > where they are needed for implementing __nelementsof__. Merge them, and > provide the single definition in gcc/tree.{h,cc}, where it's available > for __nelementsof__, which will

Fortran: Add range-based diagnostic

2024-10-18 Thread Tobias Burnus
This patch was motivated by David's talk at Cauldron – and by getting rather bad locations for some diagnostics, where I wanted to use the column number to ensure that all items are found. The main problem was a missing gobbling of spaces, but still ranges are way nicer. As gfortran uses the comm

Re: [Patch] Fortran: Fix translatability of diagnostic strings

2024-10-18 Thread Jerry D
On 10/18/24 3:20 PM, Tobias Burnus wrote: *Patch ping* OK for trunk. Jerry Tobias Burnus wrote: I noticed that several diagnostic strings were not tagged as translatable. I fixed them by adding _ or G_ as prefix (→ gcc/ABOUT-GCC-NLS) and moved a single-use string to the message to make it

Re: [PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

2024-10-18 Thread David Malcolm
On Fri, 2024-10-18 at 13:58 -0400, Lewis Hyatt wrote: > On Fri, Oct 18, 2024 at 11:25 AM David Malcolm > wrote: > > >    if (!pfile->cb.diagnostic) > > > abort (); > > > -  ret = pfile->cb.diagnostic (pfile, level, reason, richloc, > > > _(msgid), ap); > > > - > > > -  return ret; > > > +  if

Re: [Patch] Fortran: Fix translatability of diagnostic strings

2024-10-18 Thread Tobias Burnus
*Patch ping* Tobias Burnus wrote: I noticed that several diagnostic strings were not tagged as translatable. I fixed them by adding _ or G_ as prefix (→ gcc/ABOUT-GCC-NLS) and moved a single-use string to the message to make it more readable. One error message did not quite fit the pattern, hen

Re: [PATCH 4/7] RISC-V: Honour -mrvv-max-lmul in riscv_vector::expand_block_move

2024-10-18 Thread Robin Dapp
Hi Craig, thanks for working on this, it has been on my TODO list for a while. In general this looks reasonable to me. > + poly_uint64 mode_units; > /* Find the mode to use for the copy inside the loop - or the >sole copy, if there is no loop. */ > if (!need_lo

libbacktrace patch committed

2024-10-18 Thread Ian Lance Taylor
Because libbacktrace merges adjacent address ranges where possible, and because the GNU linker can deduplicate functions leaving debuginfo that refers to address ranges in other compilation units, it is possible for libbacktrace to have overlapping address ranges, in particular to overlap ranges wi

Re: [PATCH] match.pd: Add std::pow folding optimizations.

2024-10-18 Thread Andrew Pinski
On Fri, Oct 18, 2024 at 5:09 AM Jennifer Schmitz wrote: > > This patch adds the following two simplifications in match.pd: > - pow (1.0/x, y) to pow (x, -y), avoiding the division > - pow (0.0, x) to 0.0, avoiding the call to pow. > The patterns are guarded by flag_unsafe_math_optimizations, > !fl

Re: [PATCH v6] Target-independent store forwarding avoidance.

2024-10-18 Thread Jeff Law
On 10/18/24 3:57 AM, Konstantinos Eleftheriou wrote: From: kelefth This pass detects cases of expensive store forwarding and tries to avoid them by reordering the stores and using suitable bit insertion sequences. For example, it can transform this: strb w2, [x1, 1] ldr x0,

Re: [PATCH] target: Fix asm codegen for vfpclasss* and vcvtph2* instructions

2024-10-18 Thread Antoni Boucher
Thanks for the review. Here's the updated patch. Le 2024-10-17 à 21 h 50, Hongtao Liu a écrit : On Fri, Oct 18, 2024 at 9:08 AM Antoni Boucher wrote: Hi. This is a patch for the bug 116725. I'm not sure if it is a good fix, but it seems to do the job. If you have suggestions for better commen

Re: [PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

2024-10-18 Thread Lewis Hyatt
On Fri, Oct 18, 2024 at 11:25 AM David Malcolm wrote: > >if (!pfile->cb.diagnostic) > > abort (); > > - ret = pfile->cb.diagnostic (pfile, level, reason, richloc, > > _(msgid), ap); > > - > > - return ret; > > + if (pfile->diagnostic_override_loc && level != CPP_DL_NOTE) > > +{ > >

Re: pair-fusion: Assume alias conflict if common address reg changes [PR116783]

2024-10-18 Thread Richard Sandiford
Alex Coplan writes: > On 11/10/2024 14:30, Richard Biener wrote: >> On Fri, 11 Oct 2024, Richard Sandiford wrote: >> >> > Alex Coplan writes: >> > > Hi, >> > > >> > > As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into >> > > returning false when a common base register (in t

Re: pair-fusion: Assume alias conflict if common address reg changes [PR116783]

2024-10-18 Thread Alex Coplan
On 18/10/2024 17:45, Richard Sandiford wrote: > Alex Coplan writes: > > On 11/10/2024 14:30, Richard Biener wrote: > >> On Fri, 11 Oct 2024, Richard Sandiford wrote: > >> > >> > Alex Coplan writes: > >> > > Hi, > >> > > > >> > > As the PR shows, pair-fusion was tricking memory_modified_in_insn_p

[committed] hppa: Fix up pa.opt.urls

2024-10-18 Thread John David Anglin
Regenerated pa.opt.urls. Dave --- hppa: Fix up pa.opt.urls 2024-10-18 John David Anglin gcc/ChangeLog: * config/pa/pa.opt.urls: Fix for -mlra. diff --git a/gcc/config/pa/pa.opt.urls b/gcc/config/pa/pa.opt.urls index 5b8bcebdd0d..5516332ead1 100644 --- a/gcc/config/pa/pa.opt.urls ++

Re: [PATCH 3/9] Simplify X /[ex] Y cmp Z -> X cmp (Y * Z)

2024-10-18 Thread Richard Sandiford
[+ranger folks, who I forgot to CC originally, sorry!] This patch applies X /[ex] Y cmp Z -> X cmp (Y * Z) when Y * Z is representable. The closest check for "is representable" on range operations seemed to be overflow_free_p. However, that is designed for testing existing operations and so take

Re: [WIP RFC] libstdc++: add module std

2024-10-18 Thread Iain Sandoe
> On 18 Oct 2024, at 14:38, Jason Merrill wrote: > > This patch is not ready for integration, but I'd like to get feedback on the > approach (and various specific questions below). > > -- 8< -- > > This patch introduces an installed source form of module std and std.compat. > To find them, w

[RFC/RFA] [PATCH v5 10/12] Verify detected CRC loop with symbolic execution and LFSR matching.

2024-10-18 Thread Mariam Arutunian
Symbolically execute potential CRC loops and check whether the loop actually calculates CRC (uses LFSR matching). Calculated CRC and created LFSR are compared on each iteration of the potential CRC loop. gcc/ * Makefile.in (OBJS): Add crc-verification.o. * crc-verification.cc: New file.

Re: [WIP RFC] libstdc++: add module std

2024-10-18 Thread Maciej Cencora
Hi, Thanks for working on this! > stdc++.h also doesn't include the eternally deprecated . There are some other deprecated facilities that I notice are included: and float_denorm_style, at least. It would be nice for L{E,}WG to clarify whether module std is intended to include interfaces that

arm: Improvements to arm_noce_conversion_profitable_p call [PR 116444]

2024-10-18 Thread Andre Vieira (lists)
Sorry for the delay, some other work popped up in between and this had some latent issues. They should all be addressed now in this new patch. When not dealing with the special armv8.1-m.main conditional instructions case make sure it uses the default_noce_conversion_profitable_p call to dete

[committed] hppa: Add LRA support

2024-10-18 Thread John David Anglin
Tested on hppa-unknown-linux-gnu and hppa64-hp-hpux11.11. Committed to trunk. Dave --- hppa: Add LRA support LRA is not enabled by default since there are some new test failures remaining to resolve. 2024-10-18 John David Anglin gcc/ChangeLog: PR target/113933 * config/pa/pa.

Re: pair-fusion: Assume alias conflict if common address reg changes [PR116783]

2024-10-18 Thread Alex Coplan
On 11/10/2024 14:30, Richard Biener wrote: > On Fri, 11 Oct 2024, Richard Sandiford wrote: > > > Alex Coplan writes: > > > Hi, > > > > > > As the PR shows, pair-fusion was tricking memory_modified_in_insn_p into > > > returning false when a common base register (in this case, x1) was > > > modifi

Re: [PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

2024-10-18 Thread David Malcolm
On Fri, 2024-10-18 at 09:25 -0400, Lewis Hyatt wrote: > Hello- > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114423 > > The diagnostics we issue while lexing tokens from a _Pragma string > have > always come out at invalid locations. I had tried a couple years ago > to > fix this in a general

Re: [WIP RFC] libstdc++: add module std

2024-10-18 Thread Patrick Palka
On Fri, 18 Oct 2024, Jason Merrill wrote: > This patch is not ready for integration, but I'd like to get feedback on the > approach (and various specific questions below). > > -- 8< -- > > This patch introduces an installed source form of module std and std.compat. > To find them, we install a l

Re: [PATCH 3/7] RISC-V: Fix vector memcpy smaller LMUL generation

2024-10-18 Thread Jeff Law
On 10/18/24 7:12 AM, Craig Blackmore wrote: If riscv_vector::expand_block_move is generating a straight-line memcpy using a predicated store, it tries to use a smaller LMUL to reduce register pressure if it still allows an entire transfer. This happens in the inner loop of riscv_vector::expan

[PATCH 0/2] aarch64: Use standard names for saturating arithmetic

2024-10-18 Thread Akram Ahmad
Hi all, This patch series introduces standard names for scalar, Adv. SIMD, and SVE saturating arithmetic instructions in the aarch64 backend. Additional tests are added for unsigned saturating arithmetic, as well as to test that the auto-vectorizer correctly inserts NEON instructions or scalar in

[RFC/RFA] [PATCH v5 01/12] Implement internal functions for efficient CRC computation.

2024-10-18 Thread Mariam Arutunian
Add two new internal functions (IFN_CRC, IFN_CRC_REV), to provide faster CRC generation. One performs bit-forward and the other bit-reversed CRC computation. If CRC optabs are supported, they are used for the CRC computation. Otherwise, table-based CRC is generated. The supported data and CRC sizes

[PATCH] libstdc++: Avoid using std::__to_address with iterators

2024-10-18 Thread Jonathan Wakely
Do others agree with my reasoning below? The changes to implement the rule "use std::__niter_base before C++20 and use std::to_address after C++20" were easier than I expected. There weren't many places that were doing it "wrong" and needed to change. Tested x86_64-linux. -- >8 -- In r12-3935-g

[PATCH 1/2] aarch64: Use standard names for saturating arithmetic

2024-10-18 Thread Akram Ahmad
This renames the existing {s,u}q{add,sub} instructions to use the standard names {s,u}s{add,sub}3 which are used by IFN_SAT_ADD and IFN_SAT_SUB. The NEON intrinsics for saturating arithmetic and their corresponding builtins are changed to use these standard names too. Using the standard names for

[PATCH 2/2] aarch64: Use standard names for SVE saturating arithmetic

2024-10-18 Thread Akram Ahmad
Rename the existing SVE unpredicated saturating arithmetic instructions to use standard names which are used by IFN_SAT_ADD and IFN_SAT_SUB. gcc/ChangeLog: * config/aarch64/aarch64-sve.md: Rename insns gcc/testsuite/ChangeLog: * gcc/testsuite/gcc.target/aarch64/sve/saturating_ar

Re: [PATCH 1/7] libstdc++: Refactor std::uninitialized_{copy, fill, fill_n} algos [PR68350]

2024-10-18 Thread Jonathan Wakely
On Fri, 18 Oct 2024 at 15:24, Patrick Palka wrote: > > On Fri, 18 Oct 2024, Jonathan Wakely wrote: > > > On 16/10/24 21:39 -0400, Patrick Palka wrote: > > > On Tue, 15 Oct 2024, Jonathan Wakely wrote: > > > > +#if __cplusplus < 201103L > > > > + > > > > + // True if we can unwrap _Iter to get a p

Re: [PATCH 2/7] RISC-V: Fix uninitialized reg in memcpy

2024-10-18 Thread Jeff Law
On 10/18/24 7:12 AM, Craig Blackmore wrote: gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Replace `end` with `length_rtx` in gen_rtx_NE. Thanks. I've pushed this to the trunk. jeff

Re: [PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]

2024-10-18 Thread Jeff Law
On 10/18/24 7:12 AM, Craig Blackmore wrote: gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix indentation. Thanks. Pushed to the trunk. Jeff

[RFC/RFA] [PATCH v5 09/12] Add symbolic execution support.

2024-10-18 Thread Mariam Arutunian
Gives an opportunity to execute the code on bit level, assigning symbolic values to the variables which don't have initial values. Supports only CRC specific operations. Example: uint8_t crc; uint8_t pol = 1; crc = crc ^ pol; during symbolic execution crc's value will be: crc(8), crc(7), ... crc

[RFC/RFA] [PATCH v5 11/12] Replace the original CRC loops with a faster CRC calculation.

2024-10-18 Thread Mariam Arutunian
After the loop exit an internal function call (CRC, CRC_REV) is added, and its result is assigned to the output CRC variable (the variable where the calculated CRC is stored after the loop execution). The removal of the loop is left to CFG cleanup and DCE. gcc/ * gimple-crc-optimization.cc

[RFC/RFA] [PATCH v5 07/12] aarch64: Add CRC built-ins test for the target AES.

2024-10-18 Thread Mariam Arutunian
gcc/testsuite/gcc.target/aarch64/ * crc-builtin-pmul64.c: New test. Signed-off-by: Mariam Arutunian --- .../gcc.target/aarch64/crc-builtin-pmul64.c | 61 +++ 1 file changed, 61 insertions(+) create mode 100644 gcc/testsuite/gcc.target/aarch64/crc-builtin-pmul64.c diff --

[RFC/RFA][PATCH v5 03/12] RISC-V: Add CRC expander to generate faster CRC.

2024-10-18 Thread Mariam Arutunian
If the target is ZBC or ZBKC, the expander uses the clmul instruction for the CRC calculation. Otherwise, if the target is ZBKB, it generates a table-based CRC but uses the bswap and brev8 instructions to reverse the inputs and the output. Add new tests to check CRC generation for ZBC, ZBKC and ZBKB targets. gcc/

[RFC/RFA][PATCH v5 06/12] aarch64: Implement new expander for efficient CRC computation.

2024-10-18 Thread Mariam Arutunian
This patch introduces two new expanders for the aarch64 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the crc32, crc32c and pmull instructions when

[RFC/RFA][PATCH v5 02/12] Add built-ins and tests for bit-forward and bit-reversed CRCs.

2024-10-18 Thread Mariam Arutunian
This patch introduces new built-in functions to GCC for computing bit-forward and bit-reversed CRCs. These builtins aim to provide efficient CRC calculation capabilities. When the target architecture supports CRC operations (as indicated by the presence of a CRC optab), the builtins will utilize th

[RFC/RFA] [PATCH v5 04/12] RISC-V: Add CRC built-ins tests for the target ZBC.

2024-10-18 Thread Mariam Arutunian
gcc/testsuite/gcc.target/riscv/ * crc-builtin-zbc32.c: New file. * crc-builtin-zbc64.c: Likewise. Signed-off-by: Mariam Arutunian Mentored-by: Jeff Law --- .../gcc.target/riscv/crc-builtin-zbc32.c | 21 ++ .../gcc.target/riscv/crc-builtin-zbc64.c | 66 +++

[RFC/RFA] [PATCH v5 08/12] Add a new pass for naive CRC loops detection.

2024-10-18 Thread Mariam Arutunian
This patch adds a new compiler pass aimed at identifying naive CRC implementations, characterized by the presence of a loop calculating a CRC (polynomial long division). Upon detection of a potential CRC, the pass prints an informational message. Performs CRC optimization if optimization level is

[RFC/RFA][PATCH v5 05/12] i386: Implement new expander for efficient CRC computation.

2024-10-18 Thread Mariam Arutunian
This patch introduces two new expanders for the i386 backend, dedicated to generating optimized code for CRC computations. The new expanders are designed to leverage specific hardware capabilities to achieve faster CRC calculations, particularly using the pclmulqdq or crc32 instructions when suppor

[RFC/RFA][PATCH v5 00/12] CRC optimization.

2024-10-18 Thread Mariam Arutunian
Hello, This patch series is a respin of the following: https://gcc.gnu.org/pipermail/gcc-patches/2024-September/662961.html. Although I sent [PATCH v4 00/12] to the mailing list, it didn’t appear in the archives, so I've provided the link to the first patch ([PATCH v4 01/12]). The original patch s

[PATCH v2 2/8] ifn: Add else-operand handling.

2024-10-18 Thread Robin Dapp
This patch adds else-operand handling to the internal functions. gcc/ChangeLog: * internal-fn.cc (add_mask_and_len_args): Rename... (add_mask_else_and_len_args): ...to this and add else handling. (expand_partial_load_optab_fn): Use adjusted function. (expand_partia

Re: [PATCH 1/7] libstdc++: Refactor std::uninitialized_{copy, fill, fill_n} algos [PR68350]

2024-10-18 Thread Patrick Palka
On Fri, 18 Oct 2024, Patrick Palka wrote: > On Fri, 18 Oct 2024, Jonathan Wakely wrote: > > > On 16/10/24 21:39 -0400, Patrick Palka wrote: > > > On Tue, 15 Oct 2024, Jonathan Wakely wrote: > > > > +#if __cplusplus < 201103L > > > > + > > > > + // True if we can unwrap _Iter to get a pointer by

[PATCH v2 0/8] Add maskload else operand.

2024-10-18 Thread Robin Dapp
Hi, finally, after many distractions, v2 of this series. Main changes from v1: - Restrict to types/modes with padding thanks to Richi's suggestion. - Return an array of supported else values and let the vectorizer choose. - Undefined else value for GCN. Bootstrapped and regtested on Power10,

[PATCH v2 6/8] gcn: Add else operand to masked loads.

2024-10-18 Thread Robin Dapp
This patch adds an undefined else operand to the masked loads. gcc/ChangeLog: * config/gcn/predicates.md (maskload_else_operand): New predicate. * config/gcn/gcn-valu.md: Use new predicate. --- gcc/config/gcn/gcn-valu.md | 12 gcc/config/gcn/predicates.md |

[PATCH v2 8/8] RISC-V: Add else operand to masked loads [PR115336].

2024-10-18 Thread Robin Dapp
This patch adds else operands to masked loads. Currently the default else operand predicate accepts "undefined" (i.e. SCRATCH) as well as all-ones values. Note that this series introduces a large number of new RVV FAILs for riscv. All of them are due to us not being able to elide redundant vec_c

[PATCH v2 5/8] aarch64: Add masked-load else operands.

2024-10-18 Thread Robin Dapp
This adds zero else operands to masked loads and their intrinsics. I needed to adjust more than initially thought because we rely on combine for several instructions and a change in a "base" pattern needs to propagate to all those. For the lack of a better idea I used a function call property to s

Re: [PATCH 1/7] libstdc++: Refactor std::uninitialized_{copy, fill, fill_n} algos [PR68350]

2024-10-18 Thread Patrick Palka
On Fri, 18 Oct 2024, Jonathan Wakely wrote: > On 16/10/24 21:39 -0400, Patrick Palka wrote: > > On Tue, 15 Oct 2024, Jonathan Wakely wrote: > > > +#if __cplusplus < 201103L > > > + > > > + // True if we can unwrap _Iter to get a pointer by using > > > std::__niter_base. > > > + template > > > +

[PATCH v2 3/8] tree-ifcvt: Enforce zero else value after maskload.

2024-10-18 Thread Robin Dapp
When predicating a load we implicitly assume that the else value is zero. This matters in case the loaded value is padded (e.g. a Bool) and we must ensure that the padding bytes are zero on targets that don't implicitly zero inactive elements. In order to formalize this, this patch queries th

[PATCH v2 7/8] i386: Add else operand to masked loads.

2024-10-18 Thread Robin Dapp
This patch adds a zero else operand to masked loads, in particular the masked gather load builtins that are used for gather vectorization. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Add else-operand handling. (ix86_expand_builtin): Ditt

[PATCH v2 4/8] vect: Add maskload else value support.

2024-10-18 Thread Robin Dapp
This patch adds an else operand to vectorized masked load calls. The current implementation adds else-value arguments to the respective target-querying functions that are used to supply the vectorizer with the proper else value. Right now, the only spot where a zero else value is actually enforced

[PATCH v2 1/8] docs: Document maskload else operand and behavior.

2024-10-18 Thread Robin Dapp
This patch amends the documentation for masked loads (maskload, vec_mask_load_lanes, and mask_gather_load as well as their len counterparts) with an else operand. gcc/ChangeLog: * doc/md.texi: Document masked load else operand. --- gcc/doc/md.texi | 63 ---

[committed] i386: Fix the order of operands in andn3 [PR117192]

2024-10-18 Thread Uros Bizjak
Fix the order of operands in andn3 expander to comply with the specification, where bitwise-complement applies to operand 2. PR target/117192 gcc/ChangeLog: * config/i386/mmx.md (andn3): Swap operand indexes 1 and 2 to comply with andn specification. gcc/testsuite/ChangeLog: *

[WIP RFC] libstdc++: add module std

2024-10-18 Thread Jason Merrill
This patch is not ready for integration, but I'd like to get feedback on the approach (and various specific questions below). -- 8< -- This patch introduces an installed source form of module std and std.compat. To find them, we install a libstdc++.modules.json file alongside libstdc++.so, which

[PATCH] diagnostics: libcpp: Improve locations for _Pragma lexing diagnostics [PR114423]

2024-10-18 Thread Lewis Hyatt
Hello- https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114423 The diagnostics we issue while lexing tokens from a _Pragma string have always come out at invalid locations. I had tried a couple years ago to fix this in a general way, but I think that ended up being too invasive a change to fix a prob

Re: [PATCH 2/9] Use get_nonzero_bits to simplify trunc_div to exact_div

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > There are a limited number of existing rules that benefit from > knowing that a division is exact. Later patches will add more. OK. Thanks, Richard. > gcc/ > * match.pd: Simplify X / (1 << C) to X /[ex] (1 << C) if the > low C bits of

Re: [PATCH 9/9] Record nonzero bits in the irange_bitmask of POLY_INT_CSTs

2024-10-18 Thread Andrew MacLeod
That seems like a very reasonable place. Andrew On 10/18/24 08:11, Richard Biener wrote: On Fri, 18 Oct 2024, Richard Sandiford wrote: At the moment, ranger punts entirely on POLY_INT_CSTs. Numerical ranges are a bit difficult, unless we do start modelling bounds on the indeterminates. But

[PATCH 1/7] RISC-V: Fix indentation in riscv_vector::expand_block_move [NFC]

2024-10-18 Thread Craig Blackmore
gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Fix indentation. --- gcc/config/riscv/riscv-string.cc | 32 1 file changed, 16 insertions(+), 16 deletions(-) diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv

[PATCH 5/7] RISC-V: Move vector memcpy decision making to separate function [NFC]

2024-10-18 Thread Craig Blackmore
This moves the code for deciding whether to generate a vectorized memcpy, what vector mode to use and whether a loop is needed out of riscv_vector::expand_block_move and into a new function riscv_vector::use_stringop_p so that it can be reused for other string operations. gcc/ChangeLog: *

Re: [PATCH] SVE intrinsics: Add fold_active_lanes_to method to refactor svmul and svdiv.

2024-10-18 Thread Jennifer Schmitz
> On 18 Oct 2024, at 10:46, Tamar Christina wrote: > > >> -Original Message- >> From: Richard Sandiford >> Sent: Thursday, October 17, 2024 6:05 PM >> To: Jennifer Schmitz >> Cc: gcc-patches@gcc.gnu.org; Kyrylo Tkachov ; T

[PATCH 2/7] RISC-V: Fix uninitialized reg in memcpy

2024-10-18 Thread Craig Blackmore
riscv_vector::expand_block_move contains a gen_rtx_NE that uses uninitialized reg rtx `end`. It looks like `length_rtx` was supposed to be used here. gcc/ChangeLog: * config/riscv/riscv-string.cc (expand_block_move): Replace `end` with `length_rtx` in gen_rtx_NE. --- gcc/config/

[PATCH 6/7] RISC-V: Make vectorized memset handle more cases

2024-10-18 Thread Craig Blackmore
`expand_vec_setmem` only generated vectorized memset if it fitted into a single vector store. Extend it to generate a loop for longer and unknown lengths. The test cases now use -O1 so that they are not sensitive to scheduling. gcc/ChangeLog: * config/riscv/riscv-string.cc (use_

[PATCH 7/7] RISC-V: Disable by pieces for vector setmem length > UNITS_PER_WORD

2024-10-18 Thread Craig Blackmore
For fast unaligned access targets, by pieces uses up to UNITS_PER_WORD size pieces resulting in more store instructions than needed. For example gcc.target/riscv/rvv/base/setmem-1.c:f1 built with `-O3 -march=rv64gcv -mtune=thead-c906`: ``` f1: vsetivli zero,8,e8,mf2,ta,ma vm

[PATCH 4/7] RISC-V: Honour -mrvv-max-lmul in riscv_vector::expand_block_move

2024-10-18 Thread Craig Blackmore
Unlike the other vector string ops, expand_block_move was using max LMUL m8 regardless of TARGET_MAX_LMUL. The check for whether to generate inline vector code for movmem has been moved from movmem to riscv_vector::expand_block_move to avoid maintaining multiple versions of similar logic. They al

[PATCH 3/7] RISC-V: Fix vector memcpy smaller LMUL generation

2024-10-18 Thread Craig Blackmore
If riscv_vector::expand_block_move is generating a straight-line memcpy using a predicated store, it tries to use a smaller LMUL to reduce register pressure if it still allows an entire transfer. This happens in the inner loop of riscv_vector::expand_block_move, however, the vmode chosen by this l

[PATCH 0/7] RISC-V: Vector memcpy/memset fixes and improvements

2024-10-18 Thread Craig Blackmore
The main aim of this patch series is to make inline vector memcpy respect -mrvv-max-lmul and to extend inline vector memset to be used in more cases. It includes some preparatory fixes and refactoring along the way. Craig Blackmore (7): RISC-V: Fix indentation in riscv_vector::expand_block_move

Re: [PATCH v2] arm: [MVE intrinsics] Fix support for predicate constants [PR target/114801]

2024-10-18 Thread Andre Vieira (lists)
Hi, This looks like an acceptable workaround. We special case behavior that I'm not sure we can express in ways GCC can understand or will make use of, whilst at the same time we keep expressing behavior it does understand and can optimize. Nice idea! LGTM, needs maintainer approval though

Re: [PATCH 8/9] Try to simplify (X >> C1) * (C2 << C1) -> X * C2

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2 > when the low C1 bits of X are known to be zero. As with the earlier > X >> C1 << (C2 + C1) patch, any single conversion is allowed between > the shift and the multiplication. OK.
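The arithmetic behind the quoted rule can be checked with a small sketch. It is only an illustration under stated assumptions: it uses 32-bit unsigned values with wraparound and shift counts below 32, and the function names are hypothetical. It is not the match.pd pattern from the patch, which operates on GIMPLE.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: true when the low c1 bits of x are zero,
   which is the precondition stated for the rule.  Requires c1 < 32. */
int low_bits_zero(uint32_t x, unsigned c1)
{
    return (x & ((UINT32_C(1) << c1) - 1)) == 0;
}

/* Left-hand side of the rule: (X >> C1) * (C2 << C1).  */
uint32_t shift_mul_lhs(uint32_t x, unsigned c1, uint32_t c2)
{
    return (x >> c1) * (c2 << c1);
}

/* Right-hand side of the rule: X * C2.  */
uint32_t shift_mul_rhs(uint32_t x, uint32_t c2)
{
    return x * c2;
}
```

When low_bits_zero holds, the right shift discards no information, so both sides agree modulo 2^32; when it does not hold, the discarded bits can make the two sides differ, which is why the precondition is needed.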

Re: [PATCH 9/9] Record nonzero bits in the irange_bitmask of POLY_INT_CSTs

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > At the moment, ranger punts entirely on POLY_INT_CSTs. Numerical > ranges are a bit difficult, unless we do start modelling bounds on > the indeterminates. But we can at least track the nonzero bits. OK unless Andrew knows a better proper place to

Re: [PATCH 7/9] Handle POLY_INT_CSTs in get_nonzero_bits

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > This patch extends get_nonzero_bits to handle POLY_INT_CSTs. > The easiest (but also most useful) case is that the number > of trailing zeros in the runtime value is at least the number > of trailing zeros in each individual component. > > In princi

[PATCH] [5/n] remove trapv-*.c special-casing of gcc.dg/vect/ files

2024-10-18 Thread Richard Biener
The following makes -ftrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named trapv-* * gcc.dg/vect/trapv-vect-reduc-4.c: Add dg-additional-options -ftrapv. --- gcc/testsuite/gcc.dg/vect/trapv-vect-reduc-4.c | 2 +- gcc/testsuite/gcc.dg/vect/vect.exp

[PATCH] match.pd: Add std::pow folding optimizations.

2024-10-18 Thread Jennifer Schmitz
This patch adds the following two simplifications in match.pd: - pow (1.0/x, y) to pow (x, -y), avoiding the division - pow (0.0, x) to 0.0, avoiding the call to pow. The patterns are guarded by flag_unsafe_math_optimizations, !flag_trapping_math, !flag_errno_math, !HONOR_SIGNED_ZEROS, and !HONOR_I

Re: [PATCH 6/9] Try to simplify (X >> C1) << (C1 + C2) -> X << C2

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2 > when the low C1 bits of X are known to be zero. > > Any single conversion can take place between the shifts. E.g. for > a truncating conversion, any extra bits of X that are pre

Re: [PATCH 5/9] Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B > when A and B are INTEGER_CSTs. This patch extends it to handle the > case where the outer multiplication is by a factor of A, not just > A itself. It also handles addition and m

[PATCH] libstdc++: Improve 26_numerics/headers/cmath/types_std_c++0x_neg.cc

2024-10-18 Thread Jonathan Wakely
This test checks that the special functions in <cmath> are not declared prior to C++17. But we can remove the target selector and allow it to be tested for C++17 and later, and add target selectors to the individual dg-error directives instead. Also rename the test to match what it actually tests. libst

[PATCH] libstdc++: Simplify C++98 std::vector::_M_data_ptr overload set

2024-10-18 Thread Jonathan Wakely
We don't need separate overloads for returning a const or non-const pointer. We can make the member function const and return a non-const pointer, and let `vector::data() const` convert it to const as needed. libstdc++-v3/ChangeLog: * include/bits/stl_vector.h (vector::_M_data_ptr): Remov

Re: [PATCH 4/9] Simplify (X /[ex] C1) * (C1 * C2) -> X * C2

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: OK. Thanks, Richard. > gcc/ > * match.pd: Simplify (X /[ex] C1) * (C1 * C2) -> X * C2. > > gcc/testsuite/ > * gcc.dg/tree-ssa/mulexactdiv-1.c: New test. > * gcc.dg/tree-ssa/mulexactdiv-2.c: Likewise. > * gcc.dg/tree-ssa/mulex

Re: [PATCH 1/9] Make more places handle exact_div like trunc_div

2024-10-18 Thread Richard Biener
On Fri, 18 Oct 2024, Richard Sandiford wrote: > I tried to look for places where we were handling TRUNC_DIV_EXPR > more favourably than EXACT_DIV_EXPR. > > Most of the places that I looked at but didn't change were handling > div/mod pairs. But there's bound to be others I missed... OK, but I'd

[PATCH] [4/n] remove wrapv-*.c special-casing of gcc.dg/vect/ files

2024-10-18 Thread Richard Biener
The following makes -fwrapv explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named wrapv-* * gcc.dg/vect/wrapv-vect-7.c: Add dg-additional-options -fwrapv. * gcc.dg/vect/wrapv-vect-reduc-2char.c: Likewise. * gcc.dg/vect/wrapv-vect-reduc-2shor

[PATCH] [3/n] remove fast-math-*.c special-casing of gcc.dg/vect/ files

2024-10-18 Thread Richard Biener
The following makes -ffast-math explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named fast-math-* * gcc.dg/vect/fast-math-bb-slp-call-1.c: Add dg-additional-options -ffast-math. * gcc.dg/vect/fast-math-bb-slp-call-2.c: Likewise. * gc

[PATCH] [2/n] remove no-vfa-*.c special-casing of gcc.dg/vect/ files

2024-10-18 Thread Richard Biener
The following makes --param vect-max-version-for-alias-checks=0 explicit. * gcc.dg/vect/vect.exp: Remove special-casing of tests named no-vfa-* * gcc.dg/vect/no-vfa-pr29145.c: Add dg-additional-options --param vect-max-version-for-alias-checks=0. * gcc.dg/ve

RE: [PATCH 2/2] Add a new permute optimization step in SLP

2024-10-18 Thread Richard Biener
On Thu, 17 Oct 2024, Tamar Christina wrote: > Hi Christoph, > > > -Original Message- > > From: Christoph Müllner > > Sent: Tuesday, October 15, 2024 3:57 PM > > To: gcc-patches@gcc.gnu.org; Philipp Tomsich ; > > Tamar > > Christina ; Richard Biener > > Cc: Jeff Law ; Robin Dapp ; > > C

[PATCH 3/9] Simplify X /[ex] Y cmp Z -> X cmp (Y * Z)

2024-10-18 Thread Richard Sandiford
This patch applies X /[ex] Y cmp Z -> X cmp (Y * Z) when Y * Z is representable. The closest check for "is representable" on range operations seemed to be overflow_free_p. However, that is designed for testing existing operations and so takes the definedness of signed overflow into account. Here
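
The transformation described above can be checked on small inputs. The following is only a sketch, not code from the patch: it assumes Y > 0 and the `<` comparison (the excerpt does not say which signs or comparison codes the patch covers), and the helper names `product_fits_int` and `count_lt_mismatches` are made up for illustration. "Representable" is modelled as "Y * Z fits in int".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True when y * z can be represented as an int, i.e. the rewritten
   comparison x < (y * z) does not overflow.  */
static bool product_fits_int(int y, int z)
{
    long long p = (long long)y * (long long)z;
    return p >= INT_MIN && p <= INT_MAX;
}

/* Count inputs on which (x / y) < z and x < (y * z) disagree.
   Only y > 0 with x an exact multiple of y is tried.  */
static int count_lt_mismatches(int limit)
{
    assert(limit > 0 && limit <= 1000);  /* keeps q * y well inside int */
    int mismatches = 0;
    for (int y = 1; y <= limit; y++)
        for (int q = -limit; q <= limit; q++)
            for (int z = -limit; z <= limit; z++) {
                int x = q * y;  /* x /[ex] y == q by construction */
                if (!product_fits_int(y, z))
                    continue;
                if (((x / y) < z) != (x < y * z))
                    mismatches++;
            }
    return mismatches;
}
```

Calling `count_lt_mismatches(20)` should report no disagreements: with x = q * y and y > 0, q < z holds exactly when q * y < z * y.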

[PATCH 5/9] Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule

2024-10-18 Thread Richard Sandiford
match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B when A and B are INTEGER_CSTs. This patch extends it to handle the case where the outer multiplication is by a factor of A, not just A itself. It also handles addition and multiplication of poly_ints. (Exact division by a poly_i

[PATCH 1/9] Make more places handle exact_div like trunc_div

2024-10-18 Thread Richard Sandiford
I tried to look for places where we were handling TRUNC_DIV_EXPR more favourably than EXACT_DIV_EXPR. Most of the places that I looked at but didn't change were handling div/mod pairs. But there's bound to be others I missed... gcc/ * match.pd: Extend some rules to handle exact_div like

[PATCH 9/9] Record nonzero bits in the irange_bitmask of POLY_INT_CSTs

2024-10-18 Thread Richard Sandiford
At the moment, ranger punts entirely on POLY_INT_CSTs. Numerical ranges are a bit difficult, unless we do start modelling bounds on the indeterminates. But we can at least track the nonzero bits. gcc/ * value-query.cc (range_query::get_tree_range): Use get_nonzero_bits to populat

[PATCH 8/9] Try to simplify (X >> C1) * (C2 << C1) -> X * C2

2024-10-18 Thread Richard Sandiford
This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2 when the low C1 bits of X are known to be zero. As with the earlier X >> C1 << (C2 + C1) patch, any single conversion is allowed between the shift and the multiplication. gcc/ * match.pd: Simplify (X >> C1) * (C2 << C1) -
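
As a rough illustration of why the rule needs the low C1 bits of X to be zero, here is a sketch using 32-bit unsigned (wrapping) arithmetic. The function names are hypothetical and the code does not model the conversions mentioned in the message.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: (X >> C1) * (C2 << C1), in wrapping 32-bit arithmetic.  */
static uint32_t mul_original(uint32_t x, unsigned c1, uint32_t c2)
{
    assert(c1 < 32);
    return (uint32_t)((x >> c1) * (uint32_t)(c2 << c1));
}

/* Simplified form: X * C2.  */
static uint32_t mul_simplified(uint32_t x, uint32_t c2)
{
    return (uint32_t)(x * c2);
}

/* Count disagreements over inputs whose low c1 bits are zero.  */
static int count_mul_mismatches(void)
{
    int mismatches = 0;
    for (unsigned c1 = 0; c1 < 8; c1++)
        for (uint32_t y = 0; y < 256; y++)
            for (uint32_t c2 = 0; c2 < 64; c2++) {
                uint32_t x = y << c1;  /* low c1 bits of x are zero */
                if (mul_original(x, c1, c2) != mul_simplified(x, c2))
                    mismatches++;
            }
    return mismatches;
}
```

When the precondition fails the two forms can differ: for X = 0xF1, C1 = 4, C2 = 5 the original form gives 15 * 80 = 1200 while X * C2 = 1205.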

[PATCH 6/9] Try to simplify (X >> C1) << (C1 + C2) -> X << C2

2024-10-18 Thread Richard Sandiford
This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2 when the low C1 bits of X are known to be zero. Any single conversion can take place between the shifts. E.g. for a truncating conversion, any extra bits of X that are preserved by truncating after the shift are immediately lost
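
A minimal sketch of the shift identity, assuming 32-bit unsigned operands and C1 + C2 < 32. The helper names are illustrative only, and the intermediate conversions discussed in the message are not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* True when the low c1 bits of x are zero (the rule's precondition).  */
static int low_bits_clear(uint32_t x, unsigned c1)
{
    assert(c1 < 32);
    return (x & ((UINT32_C(1) << c1) - 1)) == 0;
}

/* Original form: (X >> C1) << (C1 + C2).  */
static uint32_t shift_original(uint32_t x, unsigned c1, unsigned c2)
{
    assert(c1 + c2 < 32);
    return (uint32_t)((x >> c1) << (c1 + c2));
}

/* Simplified form: X << C2.  */
static uint32_t shift_simplified(uint32_t x, unsigned c2)
{
    assert(c2 < 32);
    return (uint32_t)(x << c2);
}
```

For X = 0xF0 with C1 = 4 and C2 = 3 both forms give 0x780. For X = 0xF1 the precondition fails and the results differ (0x780 versus 0x788), because the right shift discards the set low bit.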

Re: [PATCH v16b 2/4] gcc/: Rename array_type_nelts => array_type_nelts_minus_one

2024-10-18 Thread Alejandro Colomar
On Fri, Oct 18, 2024 at 10:25:59AM GMT, Alejandro Colomar wrote: > Hi Joseph, > > On Wed, Oct 16, 2024 at 08:02:05PM GMT, Alejandro Colomar wrote: > > Hi Joseph, > > > > On Wed, Oct 16, 2024 at 05:21:39PM GMT, Joseph Myers wrote: > > > On Wed, 16 Oct 2024, Alejandro Colomar wrote: > > > > > > >

[PATCH 7/9] Handle POLY_INT_CSTs in get_nonzero_bits

2024-10-18 Thread Richard Sandiford
This patch extends get_nonzero_bits to handle POLY_INT_CSTs. The easiest (but also most useful) case is that the number of trailing zeros in the runtime value is at least the number of trailing zeros in each individual component. In principle, we could do this for coeffs 1 and above only, and then
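
The "easiest case" can be illustrated with a toy model. This is an assumption-laden sketch and not GCC's poly_int API: a POLY_INT_CST is modelled as c0 + c1 * n for a single non-negative runtime indeterminate n, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of trailing zero bits in v; 64 when v is zero.  */
static unsigned trailing_zeros(uint64_t v)
{
    unsigned n = 0;
    if (v == 0)
        return 64;
    while ((v & 1) == 0) {
        v >>= 1;
        n++;
    }
    return n;
}

/* Minimum trailing-zero count over the two coefficients.  OR-ing the
   coefficients keeps exactly the low zero bits they have in common.  */
static unsigned toy_poly_trailing_zeros(uint64_t c0, uint64_t c1)
{
    return trailing_zeros(c0 | c1);
}

/* Count values of n in [0, max_n] for which c0 + c1 * n has a nonzero
   bit below the computed bound.  */
static int count_low_bit_violations(uint64_t c0, uint64_t c1, uint64_t max_n)
{
    assert(max_n <= 1000000);  /* keep the loop short */
    unsigned tz = toy_poly_trailing_zeros(c0, c1);
    uint64_t mask = tz >= 64 ? UINT64_MAX : ((UINT64_C(1) << tz) - 1);
    int violations = 0;
    for (uint64_t n = 0; n <= max_n; n++)
        if (((c0 + c1 * n) & mask) != 0)
            violations++;
    return violations;
}
```

For coefficients 8 and 16 the bound is 3 trailing zeros, and every value 8 + 16 * n is a multiple of 8. The bound is only a lower limit: 2 + 2 * n has a bound of 1, yet n = 1 gives 4, which has two trailing zeros.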

[PATCH 4/9] Simplify (X /[ex] C1) * (C1 * C2) -> X * C2

2024-10-18 Thread Richard Sandiford
gcc/ * match.pd: Simplify (X /[ex] C1) * (C1 * C2) -> X * C2. gcc/testsuite/ * gcc.dg/tree-ssa/mulexactdiv-1.c: New test. * gcc.dg/tree-ssa/mulexactdiv-2.c: Likewise. * gcc.dg/tree-ssa/mulexactdiv-3.c: Likewise. * gcc.dg/tree-ssa/mulexactdiv-4.c: Likewise.

[PATCH 2/9] Use get_nonzero_bits to simplify trunc_div to exact_div

2024-10-18 Thread Richard Sandiford
There are a limited number of existing rules that benefit from knowing that a division is exact. Later patches will add more. gcc/ * match.pd: Simplify X / (1 << C) to X /[ex] (1 << C) if the low C bits of X are clear gcc/testsuite/ * gcc.dg/tree-ssa/cmpexactdiv-6.c: New
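
To see why clear low bits make the division exact, the following sketch compares the two conditions on small signed inputs. It assumes a two's complement int and uses made-up helper names; it says nothing about how match.pd or get_nonzero_bits implement the check.

```c
#include <assert.h>

/* True when the low c bits of x are zero.  The conversion to unsigned
   is modular, so it also works for negative x.  */
static int low_bits_clear_int(int x, unsigned c)
{
    assert(c < 31);
    return ((unsigned)x & ((1u << c) - 1u)) == 0;
}

/* True when x / (1 << c) leaves no remainder, i.e. the truncating
   division could equally be treated as an exact division.  */
static int division_is_exact(int x, unsigned c)
{
    assert(c < 31);
    int d = 1 << c;
    return (x / d) * d == x;
}

/* Count inputs where the two predicates disagree.  */
static int count_exactness_mismatches(void)
{
    int mismatches = 0;
    for (unsigned c = 0; c <= 8; c++)
        for (int x = -1024; x <= 1024; x++)
            if (low_bits_clear_int(x, c) != division_is_exact(x, c))
                mismatches++;
    return mismatches;
}
```

For example, -16 / 16 is exact because the low 4 bits of -16 are zero, while -17 / 16 truncates to -1 and loses the remainder.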
