Re: [PATCH] libgccjit: Add ability to get the alignment of a type

2024-06-28 Thread Iain Sandoe
Hi Folks, As noted, it seems to me that the failures here are false positives, but they still need handling. > On 29 Jun 2024, at 02:28, Iain Sandoe wrote: >> On 28 Jun 2024, at 12:50, Rainer Orth wrote: > … I am going to fix this with the obvious (provide a default init for the > vars) - later to

Re: [PATCH] Hard register asm constraint

2024-06-28 Thread Stefan Schulze Frielinghaus
On Fri, Jun 28, 2024 at 11:46:08AM +0200, Georg-Johann Lay wrote: > On 27.06.24 at 10:51, Stefan Schulze Frielinghaus wrote: > > On Thu, Jun 27, 2024 at 09:45:32AM +0200, Georg-Johann Lay wrote: > > > On 24.05.24 at 11:13 On 25.06.24 at 16:03, Paul Koning wrote: > > > > > On Jun 24, 2024, at 1:50

Re: [PATCH] RISC-V: use fclass insns to implement isfinite and isnormal builtins

2024-06-28 Thread Vineet Gupta
On 6/28/24 17:53, Vineet Gupta wrote: > Currently isfinite and isnormal use float compare instructions with fp > flags save/restored around them. Our perf team complained this could be > costly in uarch. RV Base ISA already has FCLASS.{d,s,h} instruction to > do FP compares w/o disturbing FP exc

Re: [PATCH] libgccjit: Add ability to get the alignment of a type

2024-06-28 Thread Iain Sandoe
Hi Folks, > On 28 Jun 2024, at 12:50, Rainer Orth wrote: > > David Malcolm writes: > >> On Thu, 2024-04-04 at 18:59 -0400, Antoni Boucher wrote: >>> Hi. >>> This patch adds a new API to produce an rvalue representing the >>> alignment of a type. >>> Thanks for the review. >> >> Patch looks g

Re: [PATCH] RISC-V: use fclass insns to implement isfinite and isnormal builtins

2024-06-28 Thread Andrew Waterman
+1 to any change that reduces the number of fflags accesses. On Fri, Jun 28, 2024 at 5:54 PM Vineet Gupta wrote: > > Currently isfinite and isnormal use float compare instructions with fp > flags save/restored around them. Our perf team complained this could be > costly in uarch. RV Base ISA alr

[PATCH] RISC-V: use fclass insns to implement isfinite and isnormal builtins

2024-06-28 Thread Vineet Gupta
Currently isfinite and isnormal use float compare instructions with fp flags save/restored around them. Our perf team complained this could be costly in uarch. RV Base ISA already has FCLASS.{d,s,h} instruction to do FP compares w/o disturbing FP exception flags. Coincidentally, upstream just few d
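As a rough illustration of the idea in this patch description (not the patch itself): FCLASS returns a 10-bit one-hot classification mask, so isfinite and isnormal reduce to a mask test with no FP compare and no fflags traffic. The sketch below emulates that in portable C++. The bit layout follows the RISC-V F/D extension specification; `emulated_fclass`, `isfinite_via_fclass` and `isnormal_via_fclass` are hypothetical names chosen for this example, and the emulation reports every NaN as quiet because portable code cannot distinguish signaling NaNs.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Bit positions of the 10-bit FCLASS result (RISC-V F/D extension).
enum : unsigned {
  FCLASS_NEG_INF       = 1u << 0,
  FCLASS_NEG_NORMAL    = 1u << 1,
  FCLASS_NEG_SUBNORMAL = 1u << 2,
  FCLASS_NEG_ZERO      = 1u << 3,
  FCLASS_POS_ZERO      = 1u << 4,
  FCLASS_POS_SUBNORMAL = 1u << 5,
  FCLASS_POS_NORMAL    = 1u << 6,
  FCLASS_POS_INF       = 1u << 7,
  FCLASS_SNAN          = 1u << 8,
  FCLASS_QNAN          = 1u << 9,
};

// Software stand-in for fclass.d (hypothetical helper, not GCC code).
// All NaNs are reported as quiet NaNs.
unsigned emulated_fclass(double x) {
  const bool neg = std::signbit(x);
  switch (std::fpclassify(x)) {
    case FP_INFINITE:  return neg ? FCLASS_NEG_INF : FCLASS_POS_INF;
    case FP_NORMAL:    return neg ? FCLASS_NEG_NORMAL : FCLASS_POS_NORMAL;
    case FP_SUBNORMAL: return neg ? FCLASS_NEG_SUBNORMAL : FCLASS_POS_SUBNORMAL;
    case FP_ZERO:      return neg ? FCLASS_NEG_ZERO : FCLASS_POS_ZERO;
    default:           return FCLASS_QNAN;
  }
}

// Finite means: normal, subnormal or zero, of either sign.
bool isfinite_via_fclass(double x) {
  constexpr unsigned finite_mask =
      FCLASS_NEG_NORMAL | FCLASS_NEG_SUBNORMAL | FCLASS_NEG_ZERO |
      FCLASS_POS_ZERO | FCLASS_POS_SUBNORMAL | FCLASS_POS_NORMAL;
  return (emulated_fclass(x) & finite_mask) != 0;
}

// Normal means: one of the two "normal" classes only.
bool isnormal_via_fclass(double x) {
  constexpr unsigned normal_mask = FCLASS_NEG_NORMAL | FCLASS_POS_NORMAL;
  return (emulated_fclass(x) & normal_mask) != 0;
}

// Small self-check comparing against the standard library.
void check_fclass_sketch() {
  const double samples[] = {0.0, -0.0, 1.5, -2.5,
                            std::numeric_limits<double>::denorm_min(),
                            std::numeric_limits<double>::infinity(),
                            std::numeric_limits<double>::quiet_NaN()};
  for (double v : samples) {
    assert(isfinite_via_fclass(v) == std::isfinite(v));
    assert(isnormal_via_fclass(v) == std::isnormal(v));
  }
}
```

On real hardware the classification would be a single instruction followed by an AND with a constant; the switch above exists only so the sketch runs anywhere.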

[committed] Fix mcore-elf regression after recent IRA change

2024-06-28 Thread Jeff Law
So the recent IRA change exposed a bug in the mcore backend. The mcore has a special instruction (xtrb3) which can zero extend a GPR into R1. It's useful because zextb requires a matching source/destination. Unfortunately xtrb3 modifies CC. The IRA changes twiddle register allocation such t

Re: [PATCH] Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]

2024-06-28 Thread Steve Kargl
On Fri, Jun 28, 2024 at 10:00:53PM +0200, Harald Anlauf wrote: > > the attached patch fixes an ICE occurring for ALLOCATE with SOURCE > (or MOLD) of deferred character length in the scalar case, which > looked obscure because the ICE disappears at -O1 and higher. > > The dump tree suggests that it

[PATCH] c++: DR2627, Bit-fields and narrowing conversions [PR94058]

2024-06-28 Thread Marek Polacek
Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk? -- >8 -- This DR (https://cplusplus.github.io/CWG/issues/2627.html) says that even if we are converting from an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type, it's

Re: [PATCH] c++: Relax too strict assert in stabilize_expr [PR111160]

2024-06-28 Thread Patrick Palka
On Wed, 26 Jun 2024, Simon Martin wrote: > The case in the ticket is an ICE on invalid due to an assert in > stabilize_expr, > but the underlying issue can actually trigger on this *valid* code: > > === cut here === > struct TheClass { > TheClass() {} > TheClass(volatile TheClass& t) {} >

Re: [PATCH] c++: Fix ICE locating 'this' for (not matching) template member function [PR115364]

2024-06-28 Thread Patrick Palka
On Fri, 28 Jun 2024, Simon Martin wrote: > We currently ICE when emitting the error message for this invalid code: > > === cut here === > struct foo { > template void not_const() {} > }; > void fn(const foo& obj) { > obj.not_const<5>(); > } > === cut here === > > The problem is that get_fnde

[PATCH] Fortran: fix ALLOCATE with SOURCE of deferred character length [PR114019]

2024-06-28 Thread Harald Anlauf
Dear all, the attached patch fixes an ICE occurring for ALLOCATE with SOURCE (or MOLD) of deferred character length in the scalar case, which looked obscure because the ICE disappears at -O1 and higher. The dump tree suggests that it is a wrong decl for the temporary source that was e.g.

[committed] libstdc++: Define __glibcxx_assert_fail for non-verbose build [PR115585]

2024-06-28 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. Backports needed. -- >8 -- When the library is configured with --disable-libstdcxx-verbose the assertions just abort instead of calling __glibcxx_assert_fail, and so I didn't export that function for the non-verbose build. However, that option is documented t

[committed] libstdc++: Extend std::equal memcmp optimization to std::byte [PR101485]

2024-06-28 Thread Jonathan Wakely
Tested x86_64-linux. Pushed to trunk. -- >8 -- We optimize std::equal to memcmp for integers and pointers, which means that std::byte comparisons generate bigger code than char comparisons. We can't use memcmp for arbitrary enum types, because they could have an overloaded operator== that has cu
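To make the trade-off in this commit message concrete, here is a minimal sketch of dispatching an equality loop to memcmp only for types where bytewise comparison is known to match operator==. It is not the libstdc++ implementation: `memcmp_comparable`, `equal_ranges`, `Flag` and `check_equal_sketch` are hypothetical names for this example, and it assumes C++17 (for `std::byte` and `if constexpr`).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

// Hypothetical trait: types whose equality is exactly bytewise equality.
// std::byte is an enum, but it has no overloadable surprises, so it is
// listed explicitly; arbitrary enums are deliberately excluded.
template <typename T>
struct memcmp_comparable
    : std::integral_constant<bool, std::is_integral<T>::value ||
                                   std::is_pointer<T>::value ||
                                   std::is_same<T, std::byte>::value> {};

// Compare [first, last) with the range starting at other.
template <typename T>
bool equal_ranges(const T* first, const T* last, const T* other) {
  if constexpr (memcmp_comparable<T>::value) {
    const std::size_t n = static_cast<std::size_t>(last - first);
    return n == 0 || std::memcmp(first, other, n * sizeof(T)) == 0;
  } else {
    return std::equal(first, last, other);  // honours user-defined operator==
  }
}

// An enum with a custom operator==: any two non-zero values compare equal,
// so memcmp would give the wrong answer for it.
enum class Flag : unsigned char { off = 0, on = 1, also_on = 2 };

constexpr bool operator==(Flag a, Flag b) {
  return (static_cast<unsigned>(a) != 0) == (static_cast<unsigned>(b) != 0);
}

void check_equal_sketch() {
  const std::byte a[] = {std::byte{1}, std::byte{2}};
  const std::byte b[] = {std::byte{1}, std::byte{2}};
  const std::byte c[] = {std::byte{1}, std::byte{3}};
  assert(equal_ranges(a, a + 2, b));
  assert(!equal_ranges(a, a + 2, c));

  const Flag x[] = {Flag::on};
  const Flag y[] = {Flag::also_on};
  assert(equal_ranges(x, x + 1, y));  // equal under the custom operator==
}
```

The `Flag` case shows why the fast path cannot simply be enabled for every enumeration type: the two arrays differ bytewise yet compare equal under the overloaded operator.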

Re: [PATCH 2/2] libstdc++: Do not use C++11 alignof in C++98 mode [PR104395]

2024-06-28 Thread Jonathan Wakely
Pushed to trunk. On Thu, 27 Jun 2024 at 10:01, Jonathan Wakely wrote: > > As I commented in the PR, I think it would be nice if the compiler > accepted C++11 alignof in C++98 mode when -faligned-new is used. But > even if G++ added that, we'd need Clang to use it, and then wait a few > releases f

Re: [PATCH 1/2] libstdc++: Simplify class templates

2024-06-28 Thread Jonathan Wakely
Pushed to trunk. On Thu, 27 Jun 2024 at 10:03, Jonathan Wakely wrote: > > I'm planning to push this, although arguably the first change isn't > worth doing if we can't use it everywhere. If we need to keep the old > code for EDG, maybe we should just keep using that? The new version > probably co

RE: [PATCH v6] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng (QUIC)
> > On 6/28/24 6:18 AM, Pengxuan Zheng wrote: > > > This patch improves GCC’s vectorization of __builtin_popcount for > > > aarch64 target by adding popcount patterns for vector modes besides > > > QImode, i.e., HImode, SImode and DImode. > > > > > > With this patch, we now generate the following f

[PATCH v9] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng
This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target by adding popcount patterns for vector modes besides QImode, i.e., HImode, SImode and DImode. With this patch, we now generate the following for V8HI: cnt v1.16b, v0.16b uaddlp v2.8h, v1.16b For V4HI, we gen
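For readers who want to try this, the kind of source loop the description refers to can be sketched as below. The function name `popcount_u16` is made up for this example; whether GCC actually vectorizes it into the cnt/uaddlp sequence quoted above depends on the compiler version, target and optimization flags, which this sketch does not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Per-element population count over 16-bit values (HImode elements).
// __builtin_popcount is a GCC/Clang builtin taking an unsigned int; the
// uint16_t argument is promoted, so the result is at most 16.
void popcount_u16(const std::uint16_t* in, std::uint16_t* out, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i)
    out[i] = static_cast<std::uint16_t>(__builtin_popcount(in[i]));
}

// Scalar self-check of the loop's results.
void check_popcount_sketch() {
  const std::uint16_t in[4] = {0x0000, 0x0001, 0xFFFF, 0x00F0};
  std::uint16_t out[4] = {};
  popcount_u16(in, out, 4);
  assert(out[0] == 0);
  assert(out[1] == 1);
  assert(out[2] == 16);
  assert(out[3] == 4);
}
```

Compiling a loop like this for an aarch64 target at -O3 and inspecting the assembly is one way to see which instruction sequence a given GCC build selects.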

[COMMITTED] ssa_lazy_cache takes an optional bitmap_obstack pointer.

2024-06-28 Thread Andrew MacLeod
There are times when a bitmap_obstack could be provided to the lazy cache, in which case it does not need to manage an obstack on its own. fast_vrp can have a few of these live at once, and I anticipate some changes to GORI where we may use them a bit more too, so this just provides a little

RE: [PATCH v6] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng (QUIC)
> On 6/28/24 6:18 AM, Pengxuan Zheng wrote: > > This patch improves GCC’s vectorization of __builtin_popcount for > > aarch64 target by adding popcount patterns for vector modes besides > > QImode, i.e., HImode, SImode and DImode. > > > > With this patch, we now generate the following for V8HI: > >

[PATCH v8] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng
This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target by adding popcount patterns for vector modes besides QImode, i.e., HImode, SImode and DImode. With this patch, we now generate the following for V8HI: cnt v1.16b, v0.16b uaddlp v2.8h, v1.16b For V4HI, we gen

RE: [PATCH v7] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng (QUIC)
Please ignore this patch. I accidentally added unrelated changes. I'll push a correct version shortly. Sorry for the noise. Thanks, Pengxuan > This patch improves GCC’s vectorization of __builtin_popcount for aarch64 > target by adding popcount patterns for vector modes besides QImode, i.e., > HIm

[PATCH v7] aarch64: Add vector popcount besides QImode [PR113859]

2024-06-28 Thread Pengxuan Zheng
This patch improves GCC’s vectorization of __builtin_popcount for aarch64 target by adding popcount patterns for vector modes besides QImode, i.e., HImode, SImode and DImode. With this patch, we now generate the following for V8HI: cnt v1.16b, v0.16b uaddlp v2.8h, v1.16b For V4HI, we gen

[wwwdocs, committed] git: Move current devel/omp/gcc branch to 14

2024-06-28 Thread Paul-Antoine Arras
Committed as debf3885965604c81541a549d531ec450f498058 https://gcc.gnu.org/git.html#general -- PA commit debf3885965604c81541a549d531ec450f498058 Author: Paul-Antoine Arras Date: Fri Jun 28 12:08:57 2024 +0200 git: Move current devel/omp/gcc branch to 14 diff --git htdocs/git.html htdocs/gi

Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-28 Thread Richard Sandiford
Richard Sandiford writes: > Thomas Schwinge writes: >> Hi! >> >> On 2024-06-27T23:20:18+0200, I wrote: >>> On 2024-06-27T22:27:21+0200, I wrote: On 2024-06-27T18:49:17+0200, I wrote: > On 2023-10-24T19:49:10+0100, Richard Sandiford > wrote: >> This patch adds a combine pass tha

[PATCH] i386: Cleanup tmp variable usage in ix86_expand_move

2024-06-28 Thread Uros Bizjak
Remove extra assignment, extra temp variable and variable shadowing. No functional changes intended. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_move): Remove extra assignment to tmp variable, reuse tmp variable instead of declaring new temporary variable and remove tmp

Re: [PATCH v3] Arm: Fix disassembly error in Thumb-1 relaxed load/store [PR115188]

2024-06-28 Thread Richard Earnshaw (lists)
On 27/06/2024 17:16, Wilco Dijkstra wrote: > Hi Richard, > >> Doing just this will mean that the register allocator will have to undo a >> pre/post memory operand that was accepted by the predicate (memory_operand). >> I think we really need a tighter predicate (let's call it noautoinc_mem_op)

Document 'pass_postreload' vs. 'pass_late_compilation' (was: The nvptx port [4/11+] Post-RA pipeline)

2024-06-28 Thread Thomas Schwinge
Hi! Before we start looking into enabling certain 'pass_postreload' passes for nvptx, as we've been discussing in "nvptx vs. [PATCH] Add a late-combine pass [PR106594]", let's first document the (not quite obvious)

RE: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-28 Thread Li, Pan2
Thanks Tamar and Richard for the enlightening comments. > I think you're doing the MIN_EXPR wrong - the above says MIN_EXPR > which doesn't make > sense anyway. I suspect you fail to put the MIN_EXPR to a separate statement? Makes sense, will have another try for this. > Aye, you need to emit the additional

Re: [PATCHv2 2/2] libiberty/buildargv: handle input consisting of only white space

2024-06-28 Thread Andrew Burgess
Hi, Am I OK to push these patches given the testing went OK? I'm thinking probably, but I don't want to overstep. Thanks, Andrew Andrew Burgess writes: > Jeff Law writes: > >> On 2/10/24 10:26 AM, Andrew Burgess wrote: >>> GDB makes use of the libiberty function buildargv for splitting th

RE: [PATCH v1] Match: Support imm form for unsigned scalar .SAT_ADD

2024-06-28 Thread Li, Pan2
> OK with those changes. Thanks Richard for comments, will make the changes and commit if no surprise from test suites. Pan -Original Message- From: Richard Biener Sent: Friday, June 28, 2024 9:12 PM To: Li, Pan2 Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com

[PATCH] RISC-V: Handle NULL stmt in SLP_TREE_SCALAR_STMTS

2024-06-28 Thread Richard Biener
The following starts to handle NULL elements in SLP_TREE_SCALAR_STMTS with the first candidate being the two-operator nodes where some lanes are do-not-care and also do not have a scalar stmt computing the result. I've so far whack-a-moled the vect.exp testsuite. I do plan to use NULL elements for

[PATCH][v2] RISC-V: Harden SLP reduction support wrt STMT_VINFO_REDUC_IDX

2024-06-28 Thread Richard Biener
The following makes sure that for an SLP reduction all lanes have the same STMT_VINFO_REDUC_IDX. Once we move that info and can adjust it we can implement swapping. It also makes the existing protection against operand swapping trigger for all stmts participating in a reduction, not just the fina

Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-28 Thread Richard Sandiford
Thomas Schwinge writes: > Hi! > > On 2024-06-27T23:20:18+0200, I wrote: >> On 2024-06-27T22:27:21+0200, I wrote: >>> On 2024-06-27T18:49:17+0200, I wrote: On 2023-10-24T19:49:10+0100, Richard Sandiford wrote: > This patch adds a combine pass that runs late in the pipeline. >>> >>>

RE: [PATCH v3] Vect: Support truncate after .SAT_SUB pattern in zip

2024-06-28 Thread Tamar Christina
> -Original Message- > From: Richard Biener > Sent: Friday, June 28, 2024 6:39 AM > To: Li, Pan2 > Cc: gcc-patches@gcc.gnu.org; juzhe.zh...@rivai.ai; kito.ch...@gmail.com; > jeffreya...@gmail.com; rdapp@gmail.com; Tamar Christina > > Subject: Re: [PATCH v3] Vect: Support truncate aft

Re: [PATCH] Use move-aware auto_vec in map

2024-06-28 Thread Jørgen Kvalsvik
On 6/28/24 13:55, Richard Biener wrote: On Fri, Jun 28, 2024 at 8:43 AM Jørgen Kvalsvik wrote: Using auto_vec rather than vec for means the vectors are released automatically upon return, to stop the leak. The problem seems to be that auto_vec is not really move-aware, only the specialization is.

Re: [PATCH v1] Match: Support imm form for unsigned scalar .SAT_ADD

2024-06-28 Thread Richard Biener
On Fri, Jun 28, 2024 at 5:44 AM wrote: > > From: Pan Li > > This patch would like to support the form of unsigned scalar .SAT_ADD > when one of the op is IMM. For example as below: > > Form IMM: > #define DEF_SAT_U_ADD_IMM_FMT_1(T) \ > T __attribute__((noinline)) \ > sat
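Since the quoted macro is cut off in this listing, the following is only a sketch of what an unsigned saturating add with an immediate operand computes, not the form from the patch. `sat_u_add_imm` is a hypothetical name, and the branch-style overflow check is just one of several equivalent ways to write it; which source forms GCC recognizes as .SAT_ADD is determined by the patch, not by this example.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Unsigned saturating add where one operand is a compile-time constant.
// The sum is computed modulo 2^N; if it wrapped, it is smaller than x,
// and the result is clamped to the maximum value of T.
template <typename T, T Imm>
T sat_u_add_imm(T x) {
  static_assert(std::is_unsigned<T>::value, "T must be an unsigned type");
  const T sum = static_cast<T>(x + Imm);  // cast restores wraparound for narrow T
  return sum < x ? std::numeric_limits<T>::max() : sum;
}

// Self-check covering the wrapping and non-wrapping cases.
void check_sat_add_sketch() {
  assert((sat_u_add_imm<std::uint8_t, 9>(10) == 19));
  assert((sat_u_add_imm<std::uint8_t, 9>(250) == 255));
  assert((sat_u_add_imm<std::uint32_t, 1>(0xFFFFFFFFu) == 0xFFFFFFFFu));
  assert((sat_u_add_imm<std::uint16_t, 0>(1234) == 1234));
}
```

The explicit cast matters for 8-bit and 16-bit types, because integer promotion would otherwise perform the addition in int and hide the wraparound.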

Re: [PATCH 7/8] vect: Support multiple lane-reducing operations for loop reduction [PR114440]

2024-06-28 Thread Richard Biener
On Wed, Jun 26, 2024 at 4:50 PM Feng Xue OS wrote: > > Updated the patch. > > For lane-reducing operation (dot-prod/widen-sum/sad) in loop reduction, current > vectorizer could only handle the pattern if the reduction chain does not > contain other operation, no matter the other is normal or lane-r

Handle 'NUM' in 'PUSH_INSERT_PASSES_WITHIN' (was: [PATCH 03/11] Handwritten part of conversion of passes to C++ classes)

2024-06-28 Thread Thomas Schwinge
Hi! As part of this: On 2013-07-26T11:04:33-0400, David Malcolm wrote: > This patch is the hand-written part of the conversion of passes from > C structs to C++ classes. > --- a/gcc/passes.c > +++ b/gcc/passes.c ..., we did hard-code 'PUSH_INSERT_PASSES_WITHIN(PASS)' to always refer to the fir

Re: [PATCH 4/8] vect: Determine input vectype for multiple lane-reducing

2024-06-28 Thread Richard Biener
On Wed, Jun 26, 2024 at 4:48 PM Feng Xue OS wrote: > > Updated the patches based on comments. > > The input vectype of reduction PHI statement must be determined before > vect cost computation for the reduction. Since lane-reducing operation has > different input vectype from normal one, so we ne

Re: Rewrite usage comment at the top of 'gcc/passes.def' (was: [PATCH 02/11] Generate pass-instances.def)

2024-06-28 Thread Richard Biener
On Fri, Jun 28, 2024 at 2:14 PM Thomas Schwinge wrote: > > Hi! > > On 2013-07-26T11:04:32-0400, David Malcolm wrote: > > Introduce a new gen-pass-instances.awk script, and use it at build time > > to make a pass-instances.def from passes.def. > > (The script has later been rewritten and extended,

Re: LoongArch vs. [PATCH 0/6] Add a late-combine pass

2024-06-28 Thread chenglulu
On 2024/6/28 at 8:35 PM, Xi Ruoyao wrote: On Fri, 2024-06-28 at 20:34 +0800, chenglulu wrote: On 2024/6/28 at 8:25 PM, Xi Ruoyao wrote: Hi Richard, The late combine pass has triggered some FAILs on LoongArch and I'm investigating. One of them is movcf2gr-via-fr.c. In 315r.postreload: (insn 22 7 24 2 (set

Re: LoongArch vs. [PATCH 0/6] Add a late-combine pass

2024-06-28 Thread Xi Ruoyao
On Fri, 2024-06-28 at 20:34 +0800, chenglulu wrote: > > On 2024/6/28 at 8:25 PM, Xi Ruoyao wrote: > > Hi Richard, > > > > The late combine pass has triggered some FAILs on LoongArch and I'm > > investigating. One of them is movcf2gr-via-fr.c. In > > 315r.postreload: > > > > (insn 22 7 24 2 (set (reg:F

Re: LoongArch vs. [PATCH 0/6] Add a late-combine pass

2024-06-28 Thread chenglulu
On 2024/6/28 at 8:25 PM, Xi Ruoyao wrote: Hi Richard, The late combine pass has triggered some FAILs on LoongArch and I'm investigating. One of them is movcf2gr-via-fr.c. In 315r.postreload: (insn 22 7 24 2 (set (reg:FCC 32 $f0 [87]) (reg:FCC 64 $fcc0 [87])) "../gcc/gcc/testsuite/gcc.targ

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-28 Thread Richard Sandiford
Richard Biener writes: > On Fri, Jun 28, 2024 at 2:16 PM Richard Biener > wrote: >> >> On Fri, Jun 28, 2024 at 11:06 AM Richard Biener >> wrote: >> > >> > >> > >> > > On 28.06.2024 at 10:27, Richard Sandiford >> > > wrote: >> > > >> > > Richard Biener writes: >> > >>> On Fri, Jun 28, 2024 a

LoongArch vs. [PATCH 0/6] Add a late-combine pass

2024-06-28 Thread Xi Ruoyao
Hi Richard, The late combine pass has triggered some FAILs on LoongArch and I'm investigating. One of them is movcf2gr-via-fr.c. In 315r.postreload: (insn 22 7 24 2 (set (reg:FCC 32 $f0 [87]) (reg:FCC 64 $fcc0 [87])) "../gcc/gcc/testsuite/gcc.target/loongarch/movcf2gr-via-fr.c":9:12 16

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-28 Thread Richard Biener
On Fri, Jun 28, 2024 at 2:16 PM Richard Biener wrote: > > On Fri, Jun 28, 2024 at 11:06 AM Richard Biener > wrote: > > > > > > > > > On 28.06.2024 at 10:27, Richard Sandiford > > > wrote: > > > > > > Richard Biener writes: > > >>> On Fri, Jun 28, 2024 at 8:01 AM Richard Biener > > >>> wrote

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-28 Thread Richard Biener
On Fri, Jun 28, 2024 at 11:06 AM Richard Biener wrote: > > > > > On 28.06.2024 at 10:27, Richard Sandiford > > wrote: > > > > Richard Biener writes: > >>> On Fri, Jun 28, 2024 at 8:01 AM Richard Biener > >>> wrote: > >>> > >>> On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote: > > fo

Rewrite usage comment at the top of 'gcc/passes.def' (was: [PATCH 02/11] Generate pass-instances.def)

2024-06-28 Thread Thomas Schwinge
Hi! On 2013-07-26T11:04:32-0400, David Malcolm wrote: > Introduce a new gen-pass-instances.awk script, and use it at build time > to make a pass-instances.def from passes.def. (The script has later been rewritten and extended, but the issue I'm discussing is relevant already in its original vers

Re: [PATCH] Use move-aware auto_vec in map

2024-06-28 Thread Richard Biener
On Fri, Jun 28, 2024 at 8:43 AM Jørgen Kvalsvik wrote: > > Using auto_vec rather than vec for means the vectors are released > automatically upon return, to stop the leak. The problem seems to be that > auto_vec is not really move-aware, only the specialization > is. Indeed. > This is actually Jan'

Re: [PATCH] libgccjit: Add ability to get the alignment of a type

2024-06-28 Thread Rainer Orth
David Malcolm writes: > On Thu, 2024-04-04 at 18:59 -0400, Antoni Boucher wrote: >> Hi. >> This patch adds a new API to produce an rvalue representing the >> alignment of a type. >> Thanks for the review. > > Patch looks good to me (but may need the usual ABI version updates when > merging). Th

Re: [PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-28 Thread Uros Bizjak
On Fri, Jun 28, 2024 at 1:41 PM Evgeny Karpov wrote: > > Thursday, June 27, 2024 8:13 PM > Uros Bizjak wrote: > > > > > So, there is no problem having #endif just after else. > > > > Anyway, it's your call, this is not a hill I'm willing to die on. ;) > > > > Thanks, > > Uros. > > It looks like t

[PATCH] tree-optimization/115652 - more fixing of the fix

2024-06-28 Thread Richard Biener
The following addresses the corner case of an outer loop with an empty header where we end up asking for the BB of a NULL stmt by special-casing this case. Bootstrap and regtest running on x86_64-unknown-linux-gnu, the patch fixes observed ICEs on GCN. PR tree-optimization/115652

[PATCH] i386: Fix regression after refactoring legitimize_pe_coff_symbol, ix86_GOT_alias_set and PE_COFF_LEGITIMIZE_EXTERN_DECL

2024-06-28 Thread Evgeny Karpov
Thursday, June 27, 2024 8:13 PM Uros Bizjak wrote: > > So, there is no problem having #endif just after else. > > Anyway, it's your call, this is not a hill I'm willing to die on. ;) > > Thanks, > Uros. It looks like the patch resolves 3 reported issues. Uros, I suggest merging the patch as i

RE: [RFC PATCH] cse: Add another CSE pass after split1

2024-06-28 Thread Tamar Christina
Hi, > -Original Message- > From: Palmer Dabbelt > Sent: Thursday, June 27, 2024 10:57 PM > To: gcc-patches@gcc.gnu.org > Cc: Palmer Dabbelt > Subject: [RFC PATCH] cse: Add another CSE pass after split1 > > This is really more of a question than a patch. > > Looking at PR/115687 I manag

Re: Re: [PATCH 0/2] fix RISC-V zcmp popretz [PR113715]

2024-06-28 Thread Fei Gao
On 2024-06-09 04:36  Jeff Law wrote: > > > >On 6/5/24 8:42 PM, Fei Gao wrote: > >>> But let's back up and get a good explanation of what the problem is. >>> Based on patch 2/2 it looks like we have lost an assignment to the >>> return register. >>> >>> To someone not familiar with this code, it so

Re: [PATCH v2] MIPS: Output $0 for conditional trap if !ISA_HAS_COND_TRAPI

2024-06-28 Thread Maciej W. Rozycki
On Fri, 28 Jun 2024, YunQiang Su wrote: > > > > Overall ISTM there is no need for distinct insns for ISA_HAS_COND_TRAPI > > > > and !ISA_HAS_COND_TRAPI cases each and this would better be sorted with > > > > predicates and constraints, especially as the output pattern is the same > > > > in both

[PATCH v2 8/8] libgomp: Map omp_default_mem_space to USM

2024-06-28 Thread Andrew Stubbs
When unified shared memory is required, the default memory space should also be unified. libgomp/ChangeLog: * config/linux/allocator.c (linux_memspace_alloc): Check omp_requires_mask. (linux_memspace_calloc): Likewise. (linux_memspace_free): Likewise. (linu

[PATCH v2 6/8] amdgcn: libgomp plugin USM implementation

2024-06-28 Thread Andrew Stubbs
From: Andrew Stubbs Implement the Unified Shared Memory API calls in the GCN plugin. The AMD equivalent of "Managed Memory" means registering previously allocated host memory as "coarse-grained" (whereas allocating coarse-grained memory via hsa_allocate_memory allocates device-side memory, initi

[PATCH v2 7/8] openmp, libgomp: Handle unified shared memory in omp_target_is_accessible

2024-06-28 Thread Andrew Stubbs
From: Marcel Vollweiler This patch handles Unified Shared Memory (USM) in the OpenMP runtime routine omp_target_is_accessible. libgomp/ChangeLog: * target.c (omp_target_is_accessible): Handle unified shared memory. * testsuite/libgomp.c-c++-common/target-is-accessible-1.c: Updat

[PATCH v2 5/8] amdgcn, openmp: Auto-detect USM mode and set HSA_XNACK

2024-06-28 Thread Andrew Stubbs
From: Andrew Stubbs The AMD GCN runtime must be set to the correct mode for Unified Shared Memory to work, but this is not always clear at compile and link time due to the split nature of the offload compilation pipeline. This patch sets a new attribute on OpenMP offload functions to ensure that

[PATCH v2 4/8] openmp: Use libgomp memory allocation functions with unified shared memory.

2024-06-28 Thread Andrew Stubbs
From: Hafiz Abid Qadeer This patch changes calls to malloc/free/calloc/realloc and operator new to memory allocation functions in libgomp with allocator=ompx_unified_shared_mem_alloc. This helps existing code to benefit from the unified shared memory, and is necessary to implement "requires un

[PATCH v2 2/8] openmp, nvptx: ompx_gnu_unified_shared_mem_alloc

2024-06-28 Thread Andrew Stubbs
From: Andrew Stubbs This adds support for using Cuda Managed Memory with omp_alloc. It will be used as the underpinnings for "requires unified_shared_memory" in a later patch. There are two new predefined allocators, ompx_gnu_unified_shared_mem_alloc and ompx_gnu_host_mem_alloc, plus correspond

[PATCH v2 3/8] openmp: Enable -foffload-memory=unified

2024-06-28 Thread Andrew Stubbs
From: Andrew Stubbs Ensure that "requires unified_shared_memory" plays nicely with the -foffload-memory options, and that enabling the option has the same effect as enabling USM in the code. Also adds some testcases. gcc/c/ChangeLog: * c-parser.cc (c_parser_omp_target): Add OMP

[PATCH v2 0/8] OpenMP: Unified Shared Memory via Managed Memory

2024-06-28 Thread Andrew Stubbs
These patches are an evolution of the USM portion of the patches previously posted in July 2022 (yes, it's taken a while!) https://patchwork.sourceware.org/project/gcc/list/?series=10748&state=%2A&archive=both The pinned memory portion was already posted (and partially approved already) and must

[PATCH v2 1/8] libgomp: Disentangle shared memory from managed

2024-06-28 Thread Andrew Stubbs
Some GPU compute systems allow the GPU to access host memory without much prior setup, but that's not necessarily the fast way to do it. For shared memory APUs this is almost certainly the correct choice, but for AMD there is the difference between "fine-grained" and "coarse-grained" memory, and f

Re: [PATCH] Hard register asm constraint

2024-06-28 Thread Georg-Johann Lay
On 27.06.24 at 10:51, Stefan Schulze Frielinghaus wrote: On Thu, Jun 27, 2024 at 09:45:32AM +0200, Georg-Johann Lay wrote: On 24.05.24 at 11:13 On 25.06.24 at 16:03, Paul Koning wrote: On Jun 24, 2024, at 1:50 AM, Stefan Schulze Frielinghaus wrote: On Mon, Jun 10, 2024 at 07:19:19AM +0200,

[PATCH] c++: Fix ICE locating 'this' for (not matching) template member function [PR115364]

2024-06-28 Thread Simon Martin
We currently ICE when emitting the error message for this invalid code: === cut here === struct foo { template void not_const() {} }; void fn(const foo& obj) { obj.not_const<5>(); } === cut here === The problem is that get_fndecl_argument_location assumes that it has a FUNCTION_DECL in its ha

[PATCH] Remove unused hybrid_* operators in range-ops.

2024-06-28 Thread Aldy Hernandez
Now that the dust has settled on the prange work, we can remove the hybrid operators. I will push this once tests complete. gcc/ChangeLog: * range-op-ptr.cc (class hybrid_and_operator): Remove. (class hybrid_or_operator): Same. (class hybrid_min_operator): Same. (

Re: [Patch, Fortran] 2/3 Refactor locations where _vptr is (re)set.

2024-06-28 Thread Andre Vehreschild
Hi Paul, thanks for the review. I have removed the commented assert and committed as gcc-15-1704-gaa3599a10ca What about your pr59104 patch? It is living happily in my dev-branch and making no problems. Thanks again and regards, Andre On Thu, 27 Jun 2024 07:29:40 +0100 Paul Richard Tho

Re: [PATCH 2/3] libstdc++: Optimize __uninitialized_default using memset

2024-06-28 Thread Jonathan Wakely
On Fri, 28 Jun 2024 at 07:53, Maciej Cencora wrote: > > But constexpr-ness of bit_cast has additional limitations and e.g. providing > a union as _Tp would be a hard-error. So we have two options: > - before bitcasting check if type can be bitcast-ed at compile-time, > - change the 'if constex

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-28 Thread Richard Biener
> On 28.06.2024 at 10:27, Richard Sandiford wrote: > > Richard Biener writes: >>> On Fri, Jun 28, 2024 at 8:01 AM Richard Biener >>> wrote: >>> >>> On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote: for the testcase in the PR115406, here is part of the dump. char D.48

Re: [PATCH] Fix native_encode_vector_part for itype when TYPE_PRECISION (itype) == BITS_PER_UNIT

2024-06-28 Thread Richard Sandiford
Richard Biener writes: > On Fri, Jun 28, 2024 at 8:01 AM Richard Biener > wrote: >> >> On Fri, Jun 28, 2024 at 3:15 AM liuhongt wrote: >> > >> > for the testcase in the PR115406, here is part of the dump. >> > >> > char D.4882; >> > vector(1) _1; >> > vector(1) signed char _2; >> > char

[PATCH] MIPS/testsuite: Fix umips-save-restore-1.c

2024-06-28 Thread YunQiang Su
With some recent optimizations, -O1/-O2/-O3 can achieve almost the same performance/size by stack load/store. Thus lwm/swm will save/store fewer callee-saved registers. In fact only $16 is saved with swm. To be sure that this optimization does exist, let's add 2 more function calls. So that lwm/swm can

Re: [RFC PATCH] cse: Add another CSE pass after split1

2024-06-28 Thread Oleg Endo
Hi, On Thu, 2024-06-27 at 14:56 -0700, Palmer Dabbelt wrote: > This is really more of a question than a patch. > > Looking at PR/115687 I managed to convince myself there's a general > class of problems here: splitting might produce constant subexpressions, > but as far as I can tell there's noth

Re: [PATCH] aarch64: Remove RNG and MTE from -mcpu=neoverse-v2

2024-06-28 Thread Kyrylo Tkachov
On 27 Jun 2024, at 16:58, Tamar Christina wrote: -Original Message- From: Kyrylo Tkachov Sent: Thursday, June 27, 2024 3:49 PM To: Tamar Christina Cc: gcc-patches@gcc

Re: [PATCH] jit: Ensure ssize_t is defined.

2024-06-28 Thread FX Coudert
> But isn't the bigger issue that sys/types.h isn't guaranteed to contain > a declaration of ssize_t? And that when sys/types.h isn't available > we don't get ssize_t from it either? Some targets seem to get it indirectly from stdio.h. As far as I know, darwin is the only platform broken currently