Re: [PATCH] PR tree-optimization/101390: Vectorize modulo operator

2024-08-22 Thread Jennifer Schmitz
> On 23 Aug 2024, at 06:21, Andrew Pinski wrote: > > On Thu, Aug 22, 2024 at 11:28 AM Andrew Pinski wrote: >> >> On Thu, Aug 22, 2024 at 4:12 AM Richard Biener wrote: >>> >>> On Thu, 22 Aug 2024, Jennifer Schmitz wrote: >>> >>

[PATCH] Fix test failure on powerpc targets

2024-08-22 Thread Bernd Edlinger
Apparently, due to slightly different optimization levels, both subroutines do not always have multiple subranges, but having at least one such subroutine and no lexical blocks is sufficient to prove that the fix worked. Q.E.D. So reduce the test expectations to only at least one inlined subroutine with multiple

Re: [PATCH] PR tree-optimization/101390: Vectorize modulo operator

2024-08-22 Thread Andrew Pinski
On Thu, Aug 22, 2024 at 11:28 AM Andrew Pinski wrote: > > On Thu, Aug 22, 2024 at 4:12 AM Richard Biener wrote: > > > > On Thu, 22 Aug 2024, Jennifer Schmitz wrote: > > > > > On 19 Aug 2024, at 21:02, Richard Sandiford > > > wrote: > > > > > > > > External email: Use caution opening links or at

[PATCH] testsuite: Fix vect-mod-var.c for division by 0 [PR116461]

2024-08-22 Thread Andrew Pinski
The testcase gcc.dg/vect/vect-mod-var.c has a division by 0, which is undefined. On some targets (aarch64), the scalar and the vectorized versions produce the same result for division by 0. On other targets (x86), we get a SIGFPE. On yet other targets (powerpc), the results are different. The fix
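The undefined behaviour described in this message can be illustrated with a minimal C sketch. It is not the actual vect-mod-var.c testcase or the fix from the patch; the function name `mod_var` and the policy of storing 0 for a zero divisor are assumptions made only for illustration.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the GCC testsuite: computes
   out[i] = a[i] % b[i] but never evaluates a modulo by zero, because
   that is undefined behaviour in C.  A zero divisor yields 0 here,
   which is an arbitrary choice for this sketch.  Callers are also
   assumed to avoid INT_MIN % -1, which is undefined as well.  */
void
mod_var (int *out, const int *a, const int *b, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (b[i] != 0) ? a[i] % b[i] : 0;
}
```

Because every evaluated modulo has a nonzero divisor, a scalar and a vectorized build of such a loop should agree on all targets, which is the property a test of this kind wants to check.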

Re: [PATCHv4, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-08-22 Thread Hongtao Liu
On Fri, Aug 23, 2024 at 11:03 AM HAO CHEN GUI wrote: > > Hi Hongtao, > > On 2024/8/23 9:47, Hongtao Liu wrote: > > On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI wrote: > >> > >> Hi Hongtao, > >> > >> On 2024/8/21 11:21, Hongtao Liu wrote: > >>> r15-3058-gbb42c551905024 supports const0 operand for movv16qi,

Re: [PATCHv4, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-08-22 Thread HAO CHEN GUI
Hi Hongtao, On 2024/8/23 9:47, Hongtao Liu wrote: > On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI wrote: >> >> Hi Hongtao, >> >> On 2024/8/21 11:21, Hongtao Liu wrote: >>> r15-3058-gbb42c551905024 supports const0 operand for movv16qi, please >>> rebase your patch and see if there are still regressions. >

Re: [PATCH] arm: Force flag_pic for FDPIC

2024-08-22 Thread Fangrui Song
On Mon, May 13, 2024 at 2:21 PM Fangrui Song wrote: > > On Mon, Mar 4, 2024 at 12:13 AM Fangrui Song wrote: > > > > From: Fangrui Song > > > > -fno-pic -mfdpic generated code is like regular -fno-pic, not suitable > > for FDPIC (absolute addressing for symbol references and no function > > descr

Re: [PATCH] rs6000: Fix PTImode handling in power8 swap optimization pass [PR116415]

2024-08-22 Thread Peter Bergner
On 8/22/24 4:39 AM, Kewen.Lin wrote: > on 2024/8/21 21:14, Peter Bergner wrote: >> - if (ALTIVEC_OR_VSX_VECTOR_MODE (mode) || mode == TImode) >> + if (ALTIVEC_OR_VSX_VECTOR_MODE (mode) || mode == TImode >> + || mode == PTImode) > > Maybe we can introduce a macro to t

Re: [PATCHv4, expand] Add const0 move checking for CLEAR_BY_PIECES optabs

2024-08-22 Thread Hongtao Liu
On Thu, Aug 22, 2024 at 4:06 PM HAO CHEN GUI wrote: > > Hi Hongtao, > > On 2024/8/21 11:21, Hongtao Liu wrote: > > r15-3058-gbb42c551905024 supports const0 operand for movv16qi, please > > rebase your patch and see if there are still regressions. > > There are still regressions. The patch enables V16QI

Re: [PATCH 9/9] RISC-V: Add vslide1up/down pattern to expand_const_vector

2024-08-22 Thread 钟居哲
>> I'm not sure if it's profitable to replace a lmul8 load with 127 >> vslide1down.vx >> ops but we're being honest with the middle end when returning the # of insns >> we'll be emitting when costing... I think it's an issue with the dynamic LMUL cost model, which only cares about program SSA-based registe

Re: [PATCH 6/9] RISC-V: Emit costs for bool and stepped const vectors

2024-08-22 Thread 钟居哲
Nice cleanup! It's very reasonable to have a dedicated riscv-v.h. juzhe.zh...@rivai.ai From: Patrick O'Neill Date: 2024-08-23 03:46 To: gcc-patches CC: rdapp.gcc; juzhe.zhong; kito.cheng; jeffreyalaw; gnu-toolchain; Patrick O'Neill Subject: [PATCH 6/9] RISC-V: Emit costs for bool and stepped

Re: [PATCH] c++/modules: Merge default arguments [PR99274]

2024-08-22 Thread Nathaniel Shead
On Thu, Aug 22, 2024 at 02:20:14PM -0400, Patrick Palka wrote: > On Mon, 12 Aug 2024, Nathaniel Shead wrote: > > > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? > > > > I tried to implement a remapping of the slots for TARGET_EXPRs for the > > FIXME but I wasn't able to work ou

Re: [PUSHED] testsuite: Fix gcc.dg/torture/pr116420.c for targets default unsigned char [PR116464]

2024-08-22 Thread Jeff Law
On 8/22/24 3:53 PM, Andrew Pinski wrote: This is an obvious fix to the gcc.dg/torture/pr116420.c testcase which simply changes from plain `char` to `signed char` so it works on targets where plain char defaults to unsigned. Pushed as obvious after a quick test for aarch64-linux-gnu to make

Re: [PATCH v2] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-22 Thread Iain Sandoe
> On 22 Aug 2024, at 21:27, Jason Merrill wrote: > > On 8/22/24 3:43 PM, Iain Sandoe wrote: >>> On 22 Aug 2024, at 17:47, Jason Merrill wrote: >>> On 8/22/24 12:35 PM, Iain Sandoe wrote: > >> +build_coroutine_frame_delete_expr (tree coro_fp, tree orig, tree >> frame_size, >> +

[PUSHED] testsuite: Fix gcc.dg/torture/pr116420.c for targets default unsigned char [PR116464]

2024-08-22 Thread Andrew Pinski
This is an obvious fix to the gcc.dg/torture/pr116420.c testcase which simply changes from plain `char` to `signed char` so it works on targets where plain char defaults to unsigned. Pushed as obvious after a quick test for aarch64-linux-gnu to make sure the testcase passes now. PR te
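The portability issue behind this fix can be sketched in a few lines of C. This is not the pr116420.c testcase; the function names below are hypothetical and serve only to show why `signed char` behaves the same on every target while plain `char` does not.

```c
#include <limits.h>

/* Returns nonzero when plain `char` is signed on this target.
   CHAR_MIN is negative exactly in that case.  */
int
plain_char_is_signed (void)
{
  return CHAR_MIN < 0;
}

/* Hypothetical example: a `signed char` holding -1 widens to -1 on
   every target.  With plain `char` the widened value would instead be
   UCHAR_MAX on targets where plain char is unsigned.  */
int
widen_minus_one (void)
{
  signed char c = -1;
  return c;
}
```

A test whose expected result depends on a negative `char` value therefore needs `signed char` (or `-fsigned-char`) to pass on targets where plain char defaults to unsigned.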

[PATCH] toplevel: Error out if using --disable-libstdcxx with bootstrap [PR105474]

2024-08-22 Thread Andrew Pinski
Bootstrapping and using --disable-libstdcxx will cause a build failure deep in compiling stage2 so instead error out early in the toplevel configure so it is more user friendly. Bootstrapped and tested on x86_64-linux-gnu. Also made sure --disable-libstdcxx without --disable-bootstrap failed.

RE: [RFC] Support single lane SLP early break

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Wednesday, August 21, 2024 12:12 PM > To: Tamar Christina > Cc: GCC Patches > Subject: Re: [RFC] Support single lane SLP early break > > On Tue, 20 Aug 2024, Tamar Christina wrote: > > > Hi, > > > > I've been working on a prototype of

Re: [PATCH 6/9] RISC-V: Emit costs for bool and stepped const vectors

2024-08-22 Thread Patrick O'Neill
On 8/22/24 12:46, Patrick O'Neill wrote: These cases are handled in the expander (riscv-v.cc:expand_const_vector). We need the vector builder to detect these cases so extract that out into a new riscv-v.h header file. gcc/ChangeLog: * config/riscv/riscv-v.cc (class rvv_builder): Move

[PATCH] Don't remove /usr/lib and /lib from when passing to the linker [PR97304/104707]

2024-08-22 Thread Andrew Pinski
With newer ld, the default library search path includes neither /usr/lib nor /lib, but the driver decides not to pass -L down to the linker for these, and then in some/most cases libc is not found. This code dates from at least 1992 and it is done in a way which is not safe and does not make sense. S

Re: [PATCH] Extend check-function-bodies to cover directives

2024-08-22 Thread H.J. Lu
On Thu, Aug 22, 2024 at 9:42 AM Richard Sandiford wrote: > > "H.J. Lu" writes: > > As PR target/116174 shows, we may need to verify the directive order. > > Extend check-function-bodies to cover directives. > > > > * gcc.target/i386/pr116174.c: Use check-function-bodies. > > * lib/sca

[PATCH v2] Extend check-function-bodies to allow label and directives

2024-08-22 Thread H.J. Lu
As PR target/116174 shows, we may need to verify labels and the directive order. Extend check-function-bodies to support matched output lines to allow labels and directives. gcc/ * doc/sourcebuild.texi (check-function-bodies): Add an optional argument for matched output lines. gc

Re: [PATCH 3/9] RISC-V: Handle 0.0 floating point pattern costing to match const_vector expander

2024-08-22 Thread Robin Dapp
> + /* Constants in range -16 ~ 15 integer or 0.0 floating-point > +can be emitted using vmv.v.i. */ > + if (satisfies_constraint_vi (x) || satisfies_constraint_Wc0 (x)) > return 1; Just a nit but while you're at it, don't you want to split

Re: [PATCH 1/9] RISC-V: Use encoded nelts when calling repeating_sequence_p

2024-08-22 Thread Robin Dapp
Before looking at the rest (tomorrow) - this is OK. -- Regards Robin

Re: [PATCH v2] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-22 Thread Jason Merrill
On 8/22/24 3:43 PM, Iain Sandoe wrote: On 22 Aug 2024, at 17:47, Jason Merrill wrote: On 8/22/24 12:35 PM, Iain Sandoe wrote: +build_coroutine_frame_delete_expr (tree coro_fp, tree orig, tree frame_size, + tree promise_type, location_t loc) +{ Here it seems

[PATCH 8/9] RISC-V: Move helper functions above expand_const_vector

2024-08-22 Thread Patrick O'Neill
These subroutines will be used in expand_const_vector in a future patch. Relocate so expand_const_vector can use them. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vector_init_insert_elems): Relocate. (expand_vector_init_trailing_same_elem): Ditto. Signed-off-by: Patrick O'Ne

[PATCH 7/9] RISC-V: Allow non-duplicate bool patterns in expand_const_vector

2024-08-22 Thread Patrick O'Neill
Currently we assert when encountering a non-duplicate boolean vector. This patch allows non-duplicate vectors to fall through to the gcc_unreachable and assert there. This will be useful when adding a catch-all pattern to emit costs and handle arbitrary vectors. gcc/ChangeLog: * config/ri

[RFC] RISC-V: Add cost model asserts

2024-08-22 Thread Patrick O'Neill
Applies after the recent 9 patch series: "RISC-V: Improve const vector costing and expansion" https://inbox.sourceware.org/gcc-patches/20240822194705.2789364-1-patr...@rivosinc.com/T/#t This isn't functional due to RTX hash collisions. It was incredibly useful and helped me catch a few tricky bugs

[PATCH 6/9] RISC-V: Emit costs for bool and stepped const vectors

2024-08-22 Thread Patrick O'Neill
These cases are handled in the expander (riscv-v.cc:expand_const_vector). We need the vector builder to detect these cases so extract that out into a new riscv-v.h header file. gcc/ChangeLog: * config/riscv/riscv-v.cc (class rvv_builder): Move to riscv-v.h. * config/riscv/riscv.cc

[PATCH 5/9] RISC-V: Handle case when constant vector construction target rtx is not a register

2024-08-22 Thread Patrick O'Neill
This manifests in RTL that is optimized away which causes runtime failures in the testsuite. Update all patterns to use a temp result register if required. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Use tmp register if needed. Signed-off-by: Patrick O'Neill

[PATCH 4/9] RISC-V: Reorder insn cost match order to match corresponding expander match order

2024-08-22 Thread Patrick O'Neill
The corresponding expander (riscv-v.cc:expand_const_vector) matches const_vec_duplicate_p before const_vec_series_p. Reorder to match this behavior when calculating costs. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Relocate. Signed-off-by: Patrick O'Neill --- gcc/confi

[PATCH 1/9] RISC-V: Use encoded nelts when calling repeating_sequence_p

2024-08-22 Thread Patrick O'Neill
repeating_sequence_p operates directly on the encoded pattern and does not derive elements using the .elt() accessor. Passing in the length of the unencoded vector can cause an out-of-bounds read of the encoded pattern. gcc/ChangeLog: * config/riscv/riscv-v.cc (rvv_builder::can_duplicate
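The out-of-bounds hazard described here can be modelled with a toy C function. It is only a sketch: `toy_repeating_sequence_p`, its signature, and the flat `encoded` array are assumptions for illustration and do not reproduce the rvv_builder code in riscv-v.cc.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not GCC code: `encoded` holds only `encoded_nelts`
   values (the encoded pattern), which can be far fewer than the
   number of elements in the full vector.  Returns true when the
   encoded elements in [start, end) repeat with period `period`.
   A range that extends past the encoded length is rejected instead
   of being read, which is the situation the patch avoids by passing
   the encoded element count.  */
bool
toy_repeating_sequence_p (const int *encoded, size_t encoded_nelts,
                          size_t start, size_t end, size_t period)
{
  if (period == 0 || end > encoded_nelts)
    return false;
  for (size_t i = start; i < end; i++)
    if (encoded[i] != encoded[i % period])
      return false;
  return true;
}
```

Calling such a check with the length of the full, unencoded vector as `end` is what would walk off the end of the encoded data if the bound were not enforced.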

[PATCH 9/9] RISC-V: Add vslide1up/down pattern to expand_const_vector

2024-08-22 Thread Patrick O'Neill
Also explicitly disallow CONST_VECTOR_DUPLICATE_P for now. CONST_VECTOR_DUPLICATE_P was previously disallowed implicitly. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_vec_series): Update comment. (expand_vector_init_insert_elems): Ditto. (expand_const_vector): Add catc

[PATCH 3/9] RISC-V: Handle 0.0 floating point pattern costing to match const_vector expander

2024-08-22 Thread Patrick O'Neill
The comment previously here stated that the Wc0/Wc1 cases are handled by the vi constraint but that is not true for the 0.0 Wc0 case. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_const_insns): Handle 0.0 floating-point case. Signed-off-by: Patrick O'Neill --- gcc/config/riscv/

[PATCH 2/9] RISC-V: Fix vid const vector expander for non-npatterns size steps

2024-08-22 Thread Patrick O'Neill
Prior to this patch the expander would emit vectors like: { 0, 0, 5, 5, 10, 10, ...} as: { 0, 0, 2, 2, 4, 4, ...} This patch sets the step size to the requested value. gcc/ChangeLog: * config/riscv/riscv-v.cc (expand_const_vector): Fix STEP size in expander. Signed-off-by: Pat
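The two vectors quoted above can be reproduced with a scalar model. This is a sketch under assumptions, not the RVV expander: the function `fill_stepped` and its parameters are hypothetical, and the model simply assumes that element i of such a vector is `(i / npatterns) * step`.

```c
#include <stddef.h>

/* Scalar model (an assumption, not the code in riscv-v.cc) of a
   stepped constant vector made of `npatterns` interleaved patterns:
   element i is (i / npatterns) * step.  */
void
fill_stepped (int *out, size_t n, size_t npatterns, int step)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (int) (i / npatterns) * step;
}
```

With `npatterns = 2` and `step = 5` this gives { 0, 0, 5, 5, 10, 10 }, the requested vector. With `step = 2` it gives { 0, 0, 2, 2, 4, 4 }, which matches the wrong output quoted in the message and is consistent with the step having been taken to equal the number of patterns, as the subject line suggests.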

[PATCH 0/9] RISC-V: Improve const vector costing and expansion

2024-08-22 Thread Patrick O'Neill
Constant vectors are currently spilled/loaded from memory often. This series increases the number of costed patterns via a catch-all pattern and fixes a variety of bugs I found along the way. Patrick O'Neill (9): RISC-V: Use encoded nelts when calling repeating_sequence_p RISC-V: Fix vid const

Re: [PATCH v2] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-22 Thread Iain Sandoe
Hi Jason > On 22 Aug 2024, at 17:47, Jason Merrill wrote: > > On 8/22/24 12:35 PM, Iain Sandoe wrote: >> Hi Jason, >> Firstly, Arsen has WIP to revise the allocation / deallocation to deal with >> coroutine frames that are more aligned than 2 * sizeof (pointer). We will >> also >> be consideri

[committed][PR rtl-optimization/116420] Fix interesting block bitmap DF dataflow

2024-08-22 Thread Jeff Law
The DF framework provides us a way to run dataflow problems on sub-graphs. Naturally a bitmap of interesting blocks is passed into those routines. At a confluence point, the DF framework will not mark a block for re-processing if it's not in that set of interesting blocks. When ext-dce sets

Re: [PATCH] PR tree-optimization/101390: Vectorize modulo operator

2024-08-22 Thread Andrew Pinski
On Thu, Aug 22, 2024 at 4:12 AM Richard Biener wrote: > > On Thu, 22 Aug 2024, Jennifer Schmitz wrote: > > > On 19 Aug 2024, at 21:02, Richard Sandiford > > wrote: > > > > > > External email: Use caution opening links or attachments > > > > > > > > > Jennifer Schmitz writes: > > >> Thanks for t

[patch][rfc] libgomp: Add OpenMP interop support to nvptx + gcn plugin

2024-08-22 Thread Tobias Burnus
This patch adds OpenMP's interop support to the libgomp plugins (nvptx: cuda, cuda_driver, hip; gcn: hip, hsa).* [The idea is that the user can ask OpenMP to return a foreign-runtime handle (CUdevice, hipCtx_t, …) for a specified OpenMP device number – and to create a stream (CUstream, hipS

[PATCH] libcpp: bump padding size in _cpp_convert_input [PR116458]

2024-08-22 Thread Alexander Monakov
The recently introduced search_line_fast_ssse3 raised padding requirement from 16 to 64, which was adjusted in read_file_guts, but the corresponding ' + 16' in _cpp_convert_input was overlooked. libcpp/ChangeLog: PR preprocessor/116458 * charset.cc (_cpp_convert_input): Bump paddi

Re: [PATCH] c++/modules: Merge default arguments [PR99274]

2024-08-22 Thread Patrick Palka
On Mon, 12 Aug 2024, Nathaniel Shead wrote: > Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? > > I tried to implement a remapping of the slots for TARGET_EXPRs for the > FIXME but I wasn't able to work out how to do so effectively. Given > that I doubt this will be a common iss

RE: [PATCH] testsuite: Add -fwrapv to signbit-5.c

2024-08-22 Thread Tamar Christina
> -Original Message- > From: Torbjorn SVENSSON > Sent: Wednesday, August 21, 2024 2:23 PM > To: Tamar Christina ; Richard Biener > > Cc: Jeff Law ; gcc-patches@gcc.gnu.org; Richard > Earnshaw ; quic_apin...@quicinc.com; > yvan.r...@foss.st.com > Subject: Re: [PATCH] testsuite: Add -fwrapv

Re: [PATCH] c++: Check template parameter number in member class template specialization [PR115716]

2024-08-22 Thread Jason Merrill
On 8/22/24 12:51 PM, Simon Martin wrote: We currently ICE upon the following invalid code, because we don't check the number of template parameters in member class template specializations. This patch fixes the PR by adding such a check. === cut here === template struct x { template struct

Re: [PATCH v3 1/2] RISC-V: add option -m(no-)autovec-segment

2024-08-22 Thread Patrick O'Neill
Ping. I think the review for patch 2/2 (fixing the ICE) deferred review of this patch. Since this patch includes the testcase and everything needed for reproducing the ICE it should make it easier to debug [1]. This patch still applies so I've re-triggered precommit. Patrick [1]: https://in

[PATCH] c++: Check template parameter number in member class template specialization [PR115716]

2024-08-22 Thread Simon Martin
We currently ICE upon the following invalid code, because we don't check the number of template parameters in member class template specializations. This patch fixes the PR by adding such a check. === cut here === template struct x { template struct y { typedef T result2; }; }; template<

Re: [PATCH v2] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-22 Thread Jason Merrill
On 8/22/24 12:35 PM, Iain Sandoe wrote: Hi Jason, Firstly, Arsen has WIP to revise the allocation / deallocation to deal with coroutine frames that are more aligned than 2 * sizeof (pointer). We will also be considering Lewis' P2014 (use of the aligned allocator). So this patch is very much a

Re: [PATCH] Extend check-function-bodies to cover directives

2024-08-22 Thread Richard Sandiford
"H.J. Lu" writes: > As PR target/116174 shown, we may need to verify the directive order. > Extend check-function-bodies to cover directives. > > * gcc.target/i386/pr116174.c: Use check-function-bodies. > * lib/scanasm.exp (configure_check-function-bodies): Add an > argument for

Re: [PATCH 3/9 v3] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-22 Thread Jason Merrill
On 8/22/24 12:19 PM, Iain Sandoe wrote: Maybe use iloc_sentinel? It's a bit awkward because it won't set UNKNOWN_LOCATION, but avoids needing to restore it at the end of the function. I've used this - with the function start location - but I don't think that really solves the problem; it needs

Re: final: go down ASHIFT in walk_alter_subreg

2024-08-22 Thread Richard Sandiford
Michael Matz writes: > when experimenting with m68k plus LRA one of the > changes in the backend is to accept ASHIFTs (not only > MULT) as scale code for address indices. When then not > turning on LRA but using reload those addresses are > presented to it which chokes on them. While reload is >

[PATCH v2] c++, coroutines: Separate allocator work from the ramp body build.

2024-08-22 Thread Iain Sandoe
Hi Jason, Firstly, Arsen has WIP to revise the allocation / deallocation to deal with coroutine frames that are more aligned than 2 * sizeof (pointer). We will also be considering Lewis' P2014 (use of the aligned allocator). So this patch is very much a staging point. >> operator new for

Re: [PATCH] RISC-V: Expand vec abs without masking.

2024-08-22 Thread Kito Cheng
LGTM Robin Dapp wrote on Fri, Aug 23, 2024 at 00:04: > Hi, > > standard abs synthesis during expand is max (a, -a). This > expansion has the advantage of avoiding masking and is thus potentially > faster than the a < 0 ? -a : a synthesis. > > Regtested on rv64gcv_zvfh_zvbb. > > Regards > Robin > > gcc/Ch

[PATCH 3/9 v3] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-22 Thread Iain Sandoe
>>>Maybe use iloc_sentinel? It's a bit awkward because it won't set >>>UNKNOWN_LOCATION, but avoids needing to restore it at the end of the >>>function. >>I've used this - with the function start location - but I don't think that >>really solves the problem; it needs to be UNKNOWN_LOCATION to st

Re: [PATCH v2] c++, coroutines: Tidy up awaiter variable checks.

2024-08-22 Thread Jason Merrill
On 8/22/24 4:38 AM, Iain Sandoe wrote: Hi Jason, + if (!glvalue_p (o) && !xvalue_p (o)) +o = build_target_expr_with_type (o, TREE_TYPE (o), tf_warning_or_error); Maybe get_target_expr instead? done. + o = cp_build_init_expr (loc, e_proxy, convert_from_reference (o)); Why con

Re: [PATCH 9/9] c++, coroutines: Look through initial_await target exprs [PR110635].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: In the case that the initial awaiter returns an object, the initial await can be a target expression and we need to look at its initializer to cast the await_resume() to void and to wrap in a compound expression that sets the initial_await_resume_called flag.

[PATCH] RISC-V: Expand vec abs without masking.

2024-08-22 Thread Robin Dapp
Hi, standard abs synthesis during expand is max (a, -a). This expansion has the advantage of avoiding masking and is thus potentially faster than the a < 0 ? -a : a synthesis. Regtested on rv64gcv_zvfh_zvbb. Regards Robin gcc/ChangeLog: * config/riscv/autovec.md (abs2): Expand via ma
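The two abs syntheses compared in this message can be written as scalar C loops. This is only a sketch of the arithmetic, not the autovec.md expander; the function names are hypothetical, and the exclusion of INT32_MIN is an assumption of this example because negating it overflows in C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* abs as max (a, -a): one negate and one max per element, with no
   test of the sign.  Inputs must not be INT32_MIN in this sketch.  */
void
abs_via_max (int32_t *out, const int32_t *in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (in[i] != INT32_MIN);
      int32_t neg = -in[i];
      out[i] = in[i] > neg ? in[i] : neg;
    }
}

/* abs as a < 0 ? -a : a: the negate is applied only where the sign
   test holds, which in vector code corresponds to a masked operation.  */
void
abs_via_select (int32_t *out, const int32_t *in, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      assert (in[i] != INT32_MIN);
      out[i] = in[i] < 0 ? -in[i] : in[i];
    }
}
```

Both loops produce the same results for every permitted input; the difference the message points to is only in how the operation maps onto vector instructions.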

Re: [PATCH 8/9] c++, coroutines: Rework handling of throwing_cleanups [PR102051].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: In the fix for PR95822 (r11-7402) we set throwing_cleanup false in the top level of the coroutine transform code. However, as the current PR shows, that is not sufficient. Any use of cxx_maybe_build_cleanup() can reset the flag, which causes the check_retu

Re: [PATCH 7/9] c++, coroutines: Fix ordering of return object conversions [PR115908].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: [dcl.fct.def.coroutine]/7 says: The expression promise.get_return_object() is used to initialize the returned reference or prvalue result object of a call to a coroutine. The call to get_return_object is sequenced before the call to initial_suspend and is in

Re: [PATCH 5/9] c++, coroutines: Only allow void get_return_object if the ramp is void [PR100476].

2024-08-22 Thread Jason Merrill
On 8/22/24 11:42 AM, Jason Merrill wrote: On 8/21/24 3:10 PM, Iain Sandoe wrote: Require that the value returned by get_return_object is convertible to the ramp return.  This means that the only time we allow a void get_return_object, is when the ramp is also a void function. We diagnose this e

final: go down ASHIFT in walk_alter_subreg

2024-08-22 Thread Michael Matz
When experimenting with m68k plus LRA, one of the changes in the backend is to accept ASHIFTs (not only MULT) as scale code for address indices. When LRA is then not turned on and reload is used instead, those addresses are presented to reload, which chokes on them. While reload is going away, the change to make the

LRA: Fix setup_sp_offset

2024-08-22 Thread Michael Matz
This is part of making m68k work with LRA. See PR116429. In short: setup_sp_offset is internally inconsistent. It wants to setup the sp_offset for newly generated instructions. sp_offset for an instruction is always the state of the sp-offset right before that instruction. For that it starts at

LRA: Don't use 0 as initialization for sp_offset

2024-08-22 Thread Michael Matz
This is part of making m68k work with LRA. See PR116374. m68k has the property that sometimes the elimination offset between %sp and %argptr is zero. While setting up the elimination infrastructure, it's changes between sp_offset and previous_offset that feed into insns_with_changed_offsets that ultima

Re: [PATCH 6/9] c++, coroutines: Allow convertible get_return_on_allocation_fail [PR109682].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: We have been requiring the get_return_on_allocation_fail() call to have the same type as the ramp. This is not intended by the standard, so relax that to allow anything convertible to the ramp return. OK. PR c++/109682 gcc/cp/ChangeLog:

Re: [PATCH 5/9] c++, coroutines: Only allow void get_return_object if the ramp is void [PR100476].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: Require that the value returned by get_return_object is convertible to the ramp return. This means that the only time we allow a void get_return_object, is when the ramp is also a void function. We diagnose this early to allow us to exit the ramp build if

[PATCH ver 3] rs6000,extend and document built-ins vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros

2024-08-22 Thread Carl Love
Gcc maintainers: Version 3, fixed a few typos per Kewen's review.  Fixed the expected number of scan-assembler-times for xvtlsbb and setbc.  Retested on Power 10 LE. Version 2, based on discussion additional overloaded instances of the vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros buil

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-22 Thread Robin Dapp
> Why's the include needed? .ccs ought to include coretypes.h directly > (and get machmode.h that way, since coretypes.h include machmode.h). Ugh, that was not intentional, sometimes my auto-complete inserts such includes for no reason. I really need to disable that, thanks for pointing that out

Re: [PATCH] lra: Don't apply eliminations to allocated registers [PR116321]

2024-08-22 Thread Vladimir Makarov
On 8/22/24 04:44, Richard Sandiford wrote: The sequence of events in this PR is that: - the function has many addresses in which only a single hard base register is acceptable. Let's call the hard register H. - IRA allocates that register to one of the pseudo base registers. Let's call

Re: [PATCH ver 2] rs6000,extend and document built-ins vec_test_lsbb_all_ones and vec_test_lsbb_all_zeros

2024-08-22 Thread Carl Love
Kewen: On 8/20/24 12:56 AM, Kewen.Lin wrote: Hi Carl, on 2024/8/9 23:57, Carl Love wrote: Gcc maintainers: Version 2, based on discussion additional overloaded instances of the vec_test_lsbb_all_ones and, vec_test_lsbb_all_zeros built-ins has been added.  The additional instances are for

Re: [PATCH] Use add_name_and_src_coords_attributes in modified_type_die

2024-08-22 Thread Tom Tromey
> "Richard" == Richard Biener writes: >> While working on a patch to the Ada compiler, I found a spot in >> dwarf2out.cc that calls add_name_attribute where a call to >> add_name_and_src_coords_attributes would be better, because the latter >> respects DECL_NAMELESS. Richard> If the point is

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-22 Thread Qing Zhao
> On Aug 21, 2024, at 18:08, Bill Wendling wrote: > >>> >>> to test. >> >> For the unary operator __counted_by(PTR), “PTR” must have a counted_by >> attribute, if not, there will be a compilation time error. >> >> Then the user could write the following code: >> >> If __builtin_has_att

Re: [PATCH 4/9] c++, coroutines: Fix handling of early exceptions [PR113773].

2024-08-22 Thread Jason Merrill
On 8/21/24 3:10 PM, Iain Sandoe wrote: The responsibility for destroying part of the frame content (promise, arg copies and the frame itself) transitions from the ramp to the body of the coroutine once we reach the await_resume () for the initial suspend. We added the variable that flags the tra

Re: Re-compute TYPE_MODE and DECL_MODE while streaming in for accelerator

2024-08-22 Thread Richard Sandiford
Prathamesh Kulkarni writes: >> -Original Message- >> From: Richard Biener >> Sent: Wednesday, August 21, 2024 5:09 PM >> To: Prathamesh Kulkarni >> Cc: Richard Sandiford ; Thomas Schwinge >> ; gcc-patches@gcc.gnu.org >> Subject: RE: Re-compute TYPE_MODE and DECL_MODE while streaming in f

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-22 Thread Richard Sandiford
"Robin Dapp" writes: > diff --git a/gcc/machmode.h b/gcc/machmode.h > index c31ec2f2ebc..b3307ad9342 100644 > --- a/gcc/machmode.h > +++ b/gcc/machmode.h > @@ -20,6 +20,7 @@ along with GCC; see the file COPYING3. If not see > #ifndef HAVE_MACHINE_MODES > #define HAVE_MACHINE_MODES > > +#inclu

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-22 Thread Qing Zhao
> On Aug 21, 2024, at 17:54, Bill Wendling wrote: > >> if (__builtin_get_counted_by(p->array)) { >>size_t max_value = >> type_max(typeof(*__builtin_get_counted_by(p->array))); >>if (count > type_max) >>...fail cleanly... >>*__builtin_get_counted_by(p->ar

Re: [PATCH v1] Provide new GCC builtin __builtin_get_counted_by [PR116016]

2024-08-22 Thread Qing Zhao
Hi, Bill, Thank you for the info. > On Aug 21, 2024, at 17:36, Bill Wendling wrote: > >> >> Bill, could you please provide a little bit more info on the possibility of >> a new builtin __builtin_has_attribute() in CLANG? >> > From what I gathered, it would require some moderate surgery to

RE: Re-compute TYPE_MODE and DECL_MODE while streaming in for accelerator

2024-08-22 Thread Prathamesh Kulkarni
> -Original Message- > From: Richard Biener > Sent: Wednesday, August 21, 2024 5:09 PM > To: Prathamesh Kulkarni > Cc: Richard Sandiford ; Thomas Schwinge > ; gcc-patches@gcc.gnu.org > Subject: RE: Re-compute TYPE_MODE and DECL_MODE while streaming in for > accelerator > > External email

Re: [PATCH v2] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-22 Thread Jason Merrill
On 8/22/24 9:30 AM, Iain Sandoe wrote: Hi Jason, + tree stmt = begin_function_body (); As in the last patch, "stmt" seems an obscure name for the result of begin_function_body. done. + /* Avoid the code here attaching a location that makes the debugger jump. */ + location_t save_inpu

Re: [PATCH] optabs-query: Guard smallest_int_mode_for_size [PR115495].

2024-08-22 Thread Robin Dapp
> Indeed though that might be a larger change. I have tested the attached now, aarch64 is still running but x86 and power10 are bootstrapped and regtested, riscv regtested. Hope I didn't miss any target-specific code that I haven't tested. As the issue is only latent I verified by calling get_b

Re: [PATCH v2] c++, coroutines: Split the ramp build into a separate function.

2024-08-22 Thread Jason Merrill
On 8/22/24 7:29 AM, Iain Sandoe wrote: + tree fn_return_type = TREE_TYPE (TREE_TYPE (orig)); /* Ramp: */ + tree stmt = begin_function_body (); The name "stmt" doesn't suggest to me that it's holding the result of begin_function_body. Maybe "ramp_fnbody"? Of course, then there's some

Re: [PATCH] libstdc++: Add some missing ranges feature-test macro tests

2024-08-22 Thread Jonathan Wakely
On Thu, 22 Aug 2024 at 14:31, Patrick Palka wrote: > > Tested on x86_64-pc-linux-gnu, does this look OK for trunk and > perhaps 14? OK for trunk and gcc-14. We didn't backport the fix to move ranges::iota to , so the iota.cc test will need adjustment on the branch. > > -- >8 -- > > libstdc++-v

[PATCH] libstdc++: Add some missing ranges feature-test macro tests

2024-08-22 Thread Patrick Palka
Tested on x86_64-pc-linux-gnu, does this look OK for trunk and perhaps 14? -- >8 -- libstdc++-v3/ChangeLog: * testsuite/25_algorithms/contains/1.cc: Verify value of __cpp_lib_ranges_contains. * testsuite/25_algorithms/find_last/1.cc: Verify value of __cpp_lib_rang

[PATCH v2] c++, coroutines: Separate the analysis, ramp and outlined function synthesis.

2024-08-22 Thread Iain Sandoe
Hi Jason, >>+ tree stmt = begin_function_body (); >As in the last patch, "stmt" seems an obscure name for the result of >begin_function_body. done. >>+ /* Avoid the code here attaching a location that makes the debugger jump. >>*/ >>+ location_t save_input_loc = input_location; >>+ locati

Re: [PATCH v2] tree-optimization/116024 - match.pd: add 4 int-compare simplifications

2024-08-22 Thread Richard Biener
On Wed, 21 Aug 2024, Artemiy Volkov wrote: > Hi, > > sending a v2 of > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/659851.html after > changing variable types in all new testcases from standard to fixed-width. > > Could anyone please assist with reviewing and/or pushing to trunk/14 sin

Re: [PATCH v2] match: Fix A || B not optimized to true when !B implies A [PR114326]

2024-08-22 Thread Richard Biener
On Fri, Aug 16, 2024 at 4:24 PM Konstantinos Eleftheriou wrote: > > From: kelefth > > In expressions like (a != b || ((a ^ b) & CST0) == CST1) and > (a != b || (a ^ b) == CST), (a ^ b) is folded to false. > In the equivalent expressions (((a ^ b) & CST0) == CST1 || a != b) and > ((a ^ b) == CST,

Re: [PATCH] RISC-V: Fix vector cfi notes for stack-clash protection

2024-08-22 Thread Jeff Law
On 8/21/24 3:11 PM, Raphael Moreira Zinsly wrote: The stack-clash code is generating wrong cfi directives in riscv_v_adjust_scalable_frame because REG_CFA_DEF_CFA has a different encoding than REG_FRAME_RELATED_EXPR, this patch fixes the offset sign in prologue and starts using REG_CFA_DEF_CFA

Re: [PATCH] Handle arithmetic on eliminated address indices [PR116413]

2024-08-22 Thread Jeff Law
On 8/22/24 2:46 AM, Richard Sandiford wrote: This patch fixes gcc.c-torture/compile/opout.c for m68k with LRA enabled. The test has: ... z (a, b) { return (int) &a + (int) &b + (int) x + (int) z; } so it adds the address of two incoming arguments. This ends up being treated as an LEA in

Re: [PATCH v3 2/2] Prevent divide-by-zero

2024-08-22 Thread Richard Biener
On Thu, Aug 22, 2024 at 1:03 AM Edwin Lu wrote: > > Hi, > > Just wanted to ping this for more guidance. It's difficult for me as long as I cannot investigate this with a testcase. Can we go ahead with the other parts so the testcase can be added and the issue reproduced? Richard. > Edwin > > O

Re: [PATCH] MATCH: add abs support for half float

2024-08-22 Thread Richard Biener
On Wed, Aug 21, 2024 at 12:08 PM Kugan Vivekanandarajah wrote: > > Hi Richard, > > > On 20 Aug 2024, at 6:09 pm, Richard Biener > > wrote: > > > > On Fri, Aug 9, 2024 at 2:39 AM Kugan Vivekanandarajah > > wrote: > >> > >> Than

Re: [PATCH] ifcvt: Disallow emitting call instructions in noce_convert_multiple_sets [PR116358]

2024-08-22 Thread Jeff Law
On 8/22/24 5:04 AM, Manolis Tsamis wrote: Similar to not allowing jump instructions in the generated code, we also shouldn't allow call instructions in noce_convert_multiple_sets. In the case of PR116358 a libcall was generated from force_operand. PR middle-end/116358 gcc/ChangeLog:

[PATCH] libstdc++: Fix std::random_shuffle for low RAND_MAX [PR88935]

2024-08-22 Thread Jonathan Wakely
This is a revised version of a patch Giovanni submitted some years ago, which has been unreviewed until recently. Tested x86_64-linux. I would like to push this to trunk. -- >8 -- When RAND_MAX is small and the number of elements being shuffled is close to it, we get very uneven distributions in
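To illustrate the kind of unevenness described above, here is a small sketch. It assumes the random index is obtained by reducing a raw value in [0, rand_max] with `% n`; that reduction, and the function name modulo_bucket_counts, are assumptions made for illustration and are not taken from the patch:

```cpp
#include <vector>

// For every raw value r in [0, rand_max], count which of the n buckets
// `r % n` falls into. A uniform reduction would give equal counts.
// Intended for small rand_max values only (the loop visits each value).
std::vector<unsigned> modulo_bucket_counts(unsigned long rand_max, unsigned long n)
{
  std::vector<unsigned> counts(n, 0u);
  for (unsigned long r = 0; r <= rand_max; ++r)
    ++counts[r % n];
  return counts;
}
```

With rand_max = 32767 and n = 30000, indices 0 through 2767 are each hit by two raw values while the remaining indices are hit by one, so the low indices are twice as likely to be chosen.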

[PATCH 2/2] c++/modules: Fix include translation for already-seen headers [PR99243]

2024-08-22 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- After importing a header unit we learn about and setup any header modules that we transitively depend on. However, this causes 'set_filename' to fail an assertion if we then come across this header as an #include and attem

Re: [PATCH] fold: Fix `a * 1j` if a has side effects [PR116454]

2024-08-22 Thread Richard Biener
On Thu, Aug 22, 2024 at 1:47 PM Andrew Pinski wrote: > > The problem here was a missing save_expr around arg0 since > it is used twice, once in REALPART_EXPR and once in IMAGPART_EXPR. > This adds the save_expr and reformats the code slightly so it is a > little easier to understand. It excludes

[PATCH 1/2] c++/modules: Clean up include translation [PR110980]

2024-08-22 Thread Nathaniel Shead
Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? -- >8 -- Currently the handling of include translation is confusing to read, using a tri-state integer without much clarity on what different states mean. This patch cleans this up to use explicit enumerators indicating the differe

Re: RFH: Debugging GCC segfault with LRA-enabled SH backend

2024-08-22 Thread Richard Biener
On Thu, Aug 22, 2024 at 1:35 PM John Paul Adrian Glaubitz wrote: > > On Thu, 2024-08-22 at 13:05 +0200, Richard Biener wrote: > > > I'm not sure that bisecting works here as I suspect the issue is a result > > > of the LRA switch. > > > > For sure. Still debugging/fixing the testsuite issue will

[PATCH] fold: Fix `a * 1j` if a has side effects [PR116454]

2024-08-22 Thread Andrew Pinski
The problem here was a missing save_expr around arg0 since it is used twice, once in REALPART_EXPR and once in IMAGPART_EXPR. This adds the save_expr and reformats the code slightly so it is a little easier to understand. It excludes the case when arg0 is a COMPLEX_EXPR since in that case we'll en

[COMMITED] fix single argument static_assert

2024-08-22 Thread Marc Poulhiès
Single argument static_assert is C++17 only. libcpp/ChangeLog: * lex.cc(search_line_ssse3): fix static_assert to use 2 arguments. --- Pushed to master as obvious. Fixed the CL + added a reason in the assert. Thanks! libcpp/lex.cc | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
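As a quick illustration of the difference (a sketch, not the code from lex.cc; the helper char_size is hypothetical), the two-argument form is accepted from C++11 onward, while the message-less form needs C++17:

```cpp
#include <cstddef>

// Two-argument form: valid since C++11, so it is safe for code that
// must build with older -std= settings.
static_assert(sizeof(char) == 1, "char must be exactly one byte");

#if __cplusplus >= 201703L
// Single-argument form: only valid from C++17 onward.
static_assert(sizeof(char) == 1);
#endif

// Hypothetical helper so the snippet has something to call at run time.
constexpr std::size_t char_size() { return sizeof(char); }
```

Supplying a message string keeps the assertion usable whichever language standard the host compiler defaults to.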

Re: RFH: Debugging GCC segfault with LRA-enabled SH backend

2024-08-22 Thread John Paul Adrian Glaubitz
On Thu, 2024-08-22 at 13:05 +0200, Richard Biener wrote: > > I'm not sure that bisecting works here as I suspect the issue is a result > > of the LRA switch. > > For sure. Still debugging/fixing the testsuite issue will be much easier. > > Does a int main(){} also segfault? I can run the LRA-en

[PATCH v2] c++, coroutines: Split the ramp build into a separate function.

2024-08-22 Thread Iain Sandoe
>>+ tree fn_return_type = TREE_TYPE (TREE_TYPE (orig)); >> /* Ramp: */ >>+ tree stmt = begin_function_body (); >The name "stmt" doesn't suggest to me that it's holding the result of >begin_function_body. Maybe "ramp_fnbody"? Of course, then there's some >confusion with "ramp_body". Shou

Re: [PATCH] fix single argument static_assert

2024-08-22 Thread Alexander Monakov
On Thu, 22 Aug 2024, Marc Poulhiès wrote: > Single argument static_assert is C++17 only. > > libcpp/ChangeLog: > > * lex.cc: fix static_assert to use 2 arguments. When pushing, please fix the entry to mention the function name if that's not too much trouble: * lex.cc (search_line

Re: [PATCH] PR tree-optimization/101390: Vectorize modulo operator

2024-08-22 Thread Richard Biener
On Thu, 22 Aug 2024, Jennifer Schmitz wrote: > On 19 Aug 2024, at 21:02, Richard Sandiford wrote: > > Jennifer Schmitz writes: > >> Thanks for the comments. I updated the patch accordingly and bootstrapped > >> and test

Re: RFH: Debugging GCC segfault with LRA-enabled SH backend

2024-08-22 Thread Richard Biener
On Thu, Aug 22, 2024 at 12:54 PM John Paul Adrian Glaubitz wrote: > > Hi Richard, > > On Thu, 2024-08-22 at 12:49 +0200, Richard Biener wrote: > > If this is stage2 or stage3 it hints at a miscompile of the stage2/3 > > compiler. I'd concentrate on other > > issues first and suggest to use --disa
