[PATCH] Match: Fold pow calls to ldexp when possible [PR57492]

2024-10-27 Thread Soumya AR
This patch transforms the following POW calls to equivalent LDEXP calls, as discussed in PR57492: powi (2.0, i) -> ldexp (1.0, i) a * powi (2.0, i) -> ldexp (a, i) 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1) pow (powof2, i) -> ldexp (1.0, i * log2 (powof2)) powof2 * pow (2, i) -> ldexp (1.0, i +
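
A minimal sketch of why the first few folds listed above preserve values, checked numerically rather than taken from the patch itself. It assumes GCC or Clang (for __builtin_powi); the function names, the exponent range and the sample multiplicands are made up for illustration, and powi (8.0, i) stands in for the pow (powof2, i) form with log2 (8) == 3. The truncated last form is not covered.

```c
#include <assert.h>
#include <math.h>

/* Returns how many sampled cases disagree between the original form
   and its ldexp replacement.  All values involved are exact powers of
   two (or exact scalings by them), so equality is expected to be exact.  */
int count_ldexp_mismatches (void)
{
  const double samples[] = { 0.0, 1.0, -3.5, 0.1, 1e10 };
  int bad = 0;

  for (int i = -20; i <= 20; i++)
    {
      double p = __builtin_powi (2.0, i);

      /* powi (2.0, i) -> ldexp (1.0, i) */
      bad += p != ldexp (1.0, i);
      /* 2.0 * powi (2.0, i) -> ldexp (1.0, i + 1) */
      bad += 2.0 * p != ldexp (1.0, i + 1);
      /* power-of-two base: exponent scaled by log2 (8) == 3 */
      bad += __builtin_powi (8.0, i) != ldexp (1.0, i * 3);

      /* a * powi (2.0, i) -> ldexp (a, i) */
      for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
        bad += samples[k] * p != ldexp (samples[k], i);
    }
  return bad;
}

/* Hypothetical self-check wrapper.  */
void check_ldexp_folds (void)
{
  assert (count_ldexp_mismatches () == 0);
}
```

Calling check_ldexp_folds () from a test driver exercises all four forms over the sampled range.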

[PATCH] Match: Optimize log (x) CMP CST and exp (x) CMP CST operations

2024-10-27 Thread Soumya AR
This patch implements transformations for the following optimizations. logN(x) CMP CST -> x CMP expN(CST) expN(x) CMP CST -> x CMP logN(CST) For example: int foo (float x) { return __builtin_logf (x) < 0.0f; } can just be: int foo (float x) { return x < 1.0f; } The patch was bootstrapped

[r15-4709 Regression] FAIL: 23_containers/vector/cons/from_range.cc -std=gnu++26 (test for excess errors) on Linux/x86_64

2024-10-27 Thread haochen.jiang
On Linux/x86_64, b281e13ecad12d07209924a7282c53be3a1c3774 is the first bad commit commit b281e13ecad12d07209924a7282c53be3a1c3774 Author: Jonathan Wakely Date: Tue Oct 8 21:15:18 2024 +0100 libstdc++: Add P1206R7 from_range members to std::vector [PR111055] caused FAIL: 23_containers/vec

[PATCH] phiopt: Move check for maybe_undef_p slightly earlier

2024-10-27 Thread Andrew Pinski
This moves the check for maybe_undef_p in match_simplify_replacement slightly earlier, before figuring out the true/false arg, using arg0/arg1 instead. In most cases this makes no difference in compile time; only when there is an undef in the args would there be a slight compile time improvement

[PATCH] vec-lowering: Fix ABSU lowering [PR111285]

2024-10-27 Thread Andrew Pinski
ABSU_EXPR lowering incorrectly used the resulting type for the new expression, but in the case of ABSU the resulting type is an unsigned type, and with it the ABSU is folded away. The fix is to use a signed type for the expression instead. Bootstrapped and tested on x86_64-linux-gnu. PR middle-end
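
A scalar analogy of the problem described, not the vector lowering code itself. It assumes ABSU means "absolute value of a signed input, produced in the corresponding unsigned type"; both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* ABSU semantics on a signed operand: the result is unsigned, so even
   INT32_MIN has a representable absolute value.  */
uint32_t absu_signed (int32_t x)
{
  return x < 0 ? 0u - (uint32_t) x : (uint32_t) x;
}

/* The same operation built in the unsigned result type: an unsigned
   value is never negative, so the "abs" degenerates to the identity.  */
uint32_t absu_in_result_type (int32_t x)
{
  uint32_t u = (uint32_t) x;
  return u;   /* nothing left to negate */
}

/* Hypothetical self-check wrapper.  */
void check_absu (void)
{
  assert (absu_signed (-5) == 5u);
  assert (absu_in_result_type (-5) == 4294967291u);
}
```

The second function shows why an operation expressed in the unsigned type loses the absolute value entirely.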

Re: [PATCH #4/7] adjust update_profile_after_ifcombine for noncontiguous ifcombine

2024-10-27 Thread Jeff Law
On 10/25/24 5:54 AM, Alexandre Oliva wrote: Prepare for ifcombining noncontiguous blocks, adding (still unused) logic to the ifcombine profile updater to handle such cases. for gcc/ChangeLog * tree-ssa-ifcombine.cc (known_succ_p): New. (update_profile_after_ifcombine): Han

Re: [PATCH] RISC-V: Fix rvv builtin function groups registration asynchronously.

2024-10-27 Thread Jeff Law
On 10/22/24 12:26 AM, KuanLin Chen wrote: Originally, cc1 registers rvv builtins with all sub vector extensions turned on, but lto does not. This makes lto use the asynchronous DECL_MD_FUNCTION_CODE from lto-objects. Example: riscv64-unknown-elf-gcc -flto gcc/testsuite/gcc.target/riscv/rvv/base/bug

Re: [PATCH] RISC-V: Remove skip of decl in registered_function.

2024-10-27 Thread Jeff Law
On 10/22/24 12:24 AM, KuanLin Chen wrote: The GTY skip makes GGC clean the registered functions wrongly in lto. Example: riscv64-unknown-elf-gcc -flto gcc/testsuite/gcc.target/riscv/rvv/base/bug-3.c -O2 -march=rv64gcv In file included from bug-3.c:2: internal compiler error: Segmentation fau

Re: [PATCH #3/7] introduce ifcombine_replace_cond

2024-10-27 Thread Jeff Law
On 10/25/24 5:52 AM, Alexandre Oliva wrote: Refactor ifcombine_ifandif, moving the common code from the various paths that apply the combined condition to a new function. for gcc/ChangeLog * tree-ssa-ifcombine.cc (ifcombine_replace_cond): Factor out of... (ifcombin

[committed] libstdc++: Fix std::vector::emplace to forward parameter

2024-10-27 Thread Jonathan Wakely
If the parameter is not lvalue-convertible to bool then the current code will fail to compile. The parameter should be forwarded to restore the original value category. libstdc++-v3/ChangeLog: * include/bits/stl_bvector.h (emplace_back, emplace): Forward parameter pack to preserve

Re: [PATCH v2 8/8] RISC-V: Add else operand to masked loads [PR115336].

2024-10-27 Thread Jeff Law
On 10/18/24 8:22 AM, Robin Dapp wrote: This patch adds else operands to masked loads. Currently the default else operand predicate accepts "undefined" (i.e. SCRATCH) as well as all-ones values. Note that this series introduces a large number of new RVV FAILs for riscv. All of them are due t

Re: [PATCH #2/7] drop redundant ifcombine_ifandif parm

2024-10-27 Thread Jeff Law
On 10/25/24 5:51 AM, Alexandre Oliva wrote: In preparation for changes that may modify both inner and outer conditions in ifcombine, drop the redundant parameter result_inv, that is always identical to inner_inv. for gcc/ChangeLog * tree-ssa-ifcombine.cc (ifcombine_ifandif): Drop r

Re: [PATCH #1/7] allow vuses in ifcombine blocks

2024-10-27 Thread Jeff Law
On 10/25/24 5:50 AM, Alexandre Oliva wrote: Disallowing vuses in blocks for ifcombine is too strict, and it prevents usefully moving fold_truth_andor into ifcombine. That tree-level folder has long ifcombined loads, absent other relevant side effects. for gcc/ChangeLog * tree-ssa

Re: [PATCH v4 2/2] RISC-V: Add testcases for unsigned .SAT_SUB form 2 with IMM = 1.

2024-10-27 Thread Jeff Law
On 10/24/24 7:22 PM, Li Xu wrote: From: xuli form2: T __attribute__((noinline)) \ sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \ { \ return x >= (T)IMM ? x - (T)IMM : 0; \ } Passed the rv64gcv regression test. Signed-off-by: Li Xu gcc/tests
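
The form 2 body quoted above can be instantiated as follows. The macro name DEF_SAT_U_SUB_IMM_FMT_2 and the choice of uint8_t and uint32_t are assumptions made for this sketch; only the function body and the name pattern come from the snippet.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper macro around the form 2 template.  */
#define DEF_SAT_U_SUB_IMM_FMT_2(T, IMM)   \
T __attribute__((noinline))               \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x)    \
{                                         \
  return x >= (T)IMM ? x - (T)IMM : 0;    \
}

/* Generates sat_u_sub_imm1_uint8_t_fmt_2 and sat_u_sub_imm1_uint32_t_fmt_2.  */
DEF_SAT_U_SUB_IMM_FMT_2 (uint8_t, 1)
DEF_SAT_U_SUB_IMM_FMT_2 (uint32_t, 1)

/* Hypothetical self-check: subtraction saturates at zero.  */
void check_sat_u_sub (void)
{
  assert (sat_u_sub_imm1_uint8_t_fmt_2 (0) == 0);
  assert (sat_u_sub_imm1_uint8_t_fmt_2 (255) == 254);
}
```

With IMM = 1 the only saturating input is x == 0; every other input simply decrements.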

Re: [PATCH 6/6] simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)

2024-10-27 Thread Jeff Law
On 10/24/24 12:24 AM, Kyrylo Tkachov wrote: On 24 Oct 2024, at 07:36, Jeff Law wrote: On 10/22/24 2:26 PM, Kyrylo Tkachov wrote: Hi all, With the recent patch to improve detection of vector rotates at the RTL level, combine now tries matching a V8HImode rotate by 8 in the example in the testcas

Re: [PATCH] xtensa: Define TARGET_DIFFERENT_ADDR_DISPLACEMENT_P target hook

2024-10-27 Thread Max Filippov
On Tue, Oct 22, 2024 at 7:31 PM Takayuki 'January June' Suwa wrote: > > In commit bc5a9dab55d13f888a3cdd150c8cf5c2244f35e0 ("gcc: xtensa: reorder > movsi_internal patterns for better code generation during LRA"), the > instruction order in "movsi_internal" MD definition was changed to make LRA >

[r15-4702 Regression] FAIL: gfortran.dg/pr115070.f90 -O (test for excess errors) on Linux/x86_64

2024-10-27 Thread haochen.jiang
On Linux/x86_64, ed8ca972f8857869d2bb4a416994bb896eb1c34e is the first bad commit commit ed8ca972f8857869d2bb4a416994bb896eb1c34e Author: Paul Thomas Date: Sun Oct 27 12:40:42 2024 + Fortran: Fix regressions with intent(out) class[PR115070, PR115348]. caused FAIL: gfortran.dg/pr11507

Re: [PATCH 4/6] aarch64: Optimize vector rotates into REV* instructions where possible

2024-10-27 Thread Kyrylo Tkachov
> On 25 Oct 2024, at 15:25, Richard Sandiford wrote: > > Kyrylo Tkachov writes: >>> On 25 Oct 2024, at 13:46, Richard Sandiford >>> wrote: >>> >>> Kyrylo Tkachov writes: Thank you for the suggestions! I’m trying them out now. >> + if (rotamnt % BITS_PER_UNIT != 0) >> +

[PATCH 6/6] simplify-rtx: Simplify ROTATE:HI (X:HI, 8) into BSWAP:HI (X)

2024-10-27 Thread Kyrylo Tkachov
Hi all, With the recent patch to improve detection of vector rotates at the RTL level, combine now tries matching a V8HImode rotate by 8 in the example in the testcase. We can teach AArch64 to emit a REV16 instruction for such a rotate but really this operation corresponds to the RTL code BSWAP, for which
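
The equivalence behind the subject line can be checked exhaustively on scalars. A sketch only: rotate_hi_8 and the counting helper are hypothetical names, and __builtin_bswap16 is the GCC/Clang builtin.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 16-bit value left by 8 bits.  */
uint16_t rotate_hi_8 (uint16_t x)
{
  return (uint16_t) ((x << 8) | (x >> 8));
}

/* Compares the rotate against a byte swap for every 16-bit value.  */
int count_bswap16_mismatches (void)
{
  int bad = 0;

  for (uint32_t v = 0; v <= 0xFFFF; v++)
    bad += rotate_hi_8 ((uint16_t) v) != __builtin_bswap16 ((uint16_t) v);
  return bad;
}

/* Hypothetical self-check wrapper.  */
void check_bswap16 (void)
{
  assert (count_bswap16_mismatches () == 0);
}
```

A 16-bit value has only two bytes, so rotating by 8 in either direction exchanges them, which is exactly a byte swap.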

[PATCH 4/6] expmed, aarch64: Optimize vector rotates as vector permutes where possible

2024-10-27 Thread Kyrylo Tkachov
Hi all, Some vector rotate operations can be implemented in a single instruction rather than using the fallback SHL+USRA sequence. In particular, when the rotate amount is half the bitwidth of the element we can use a REV64,REV32,REV16 instruction. More generally, rotates by a byte amount can be i
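
A scalar, per-element sketch of the idea that a rotate by a whole number of bytes is just a byte rearrangement. It does not model the REV instructions or the vector permute; the function names and the 32-bit element size are assumptions for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Ordinary rotate left of a 32-bit element.  */
uint32_t rotl32 (uint32_t x, unsigned n)
{
  n &= 31;
  return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate left by 8 * k bits by moving whole bytes: byte j of the input
   (counting from the least significant) lands in byte (j + k) % 4.  */
uint32_t rotl32_by_bytes (uint32_t x, unsigned k)
{
  uint32_t r = 0;

  for (unsigned j = 0; j < 4; j++)
    {
      uint32_t byte = (x >> (8 * j)) & 0xFF;
      r |= byte << (8 * ((j + k) % 4));
    }
  return r;
}

/* Hypothetical self-check: both routes agree for every byte amount.  */
void check_byte_rotates (void)
{
  for (unsigned k = 0; k < 4; k++)
    assert (rotl32_by_bytes (0x11223344u, k) == rotl32 (0x11223344u, 8 * k));
}
```

With k == 2 the rotate amount is half the element width and the two 16-bit halves trade places, which is the case the message singles out.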

[PATCH 5/6] aarch64: Emit XAR for vector rotates where possible

2024-10-27 Thread Kyrylo Tkachov
Hi all, We can make use of the integrated rotate step of the XAR instruction to implement most vector integer rotates, as long as we zero out one of the input registers for it. This allows for a lower-latency sequence than the fallback SHL+USRA, especially when we can hoist the zeroing operation awa

[PATCH 3/6] PR 117048: aarch64: Add define_insn_and_split for vector ROTATE

2024-10-27 Thread Kyrylo Tkachov
The ultimate goal in this PR is to match the XAR pattern that is represented as a (ROTATE (XOR X Y) VCST) from the ACLE intrinsics code in the testcase. The first blocker for this was the missing recognition of ROTATE in simplify-rtx, which is fixed in the previous patch. The next problem is that o

[PATCH 1/6] PR 117048: simplify-rtx: Simplify (X << C1) [+,^] (X >> C2) into ROTATE

2024-10-27 Thread Kyrylo Tkachov
Hi all, simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when C1 + C2 == mode-width. But the transformation is also valid for PLUS and XOR. Indeed GIMPLE can also do the fold. Let's teach RTL to do it too. The motivating testcase for this is in AArch64 intrinsics: uint64x2_
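
Why PLUS and XOR work as well as IOR: when C1 + C2 equals the mode width, the two shifted halves have no set bits in common, so adding or xoring them gives the same result as oring them. A small scalar check, with a hypothetical function name and arbitrary sample values:

```c
#include <assert.h>
#include <stdint.h>

/* Counts cases where the PLUS or XOR form differs from the IOR form
   for 32-bit values and every split with c1 + c2 == 32.  */
int count_rotate_form_mismatches (void)
{
  const uint32_t samples[] = { 0u, 1u, 0x80000000u, 0xDEADBEEFu, 0xFFFFFFFFu };
  int bad = 0;

  for (unsigned c1 = 1; c1 < 32; c1++)
    {
      unsigned c2 = 32 - c1;

      for (unsigned k = 0; k < sizeof samples / sizeof samples[0]; k++)
        {
          uint32_t hi = samples[k] << c1;   /* low c1 bits are zero */
          uint32_t lo = samples[k] >> c2;   /* only low c1 bits can be set */
          uint32_t rot = hi | lo;           /* the rotate by c1 */

          bad += (hi + lo) != rot;
          bad += (hi ^ lo) != rot;
        }
    }
  return bad;
}

/* Hypothetical self-check wrapper.  */
void check_rotate_forms (void)
{
  assert (count_rotate_form_mismatches () == 0);
}
```

Because the operands are disjoint, the addition never carries, which is what makes the PLUS form safe.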

[PATCH 2/6] aarch64: Use canonical RTL representation for SVE2 XAR and extend it to fixed-width modes

2024-10-27 Thread Kyrylo Tkachov
Hi all, The MD pattern for the XAR instruction in SVE2 is currently expressed with non-canonical RTL by using a ROTATERT code with a constant rotate amount. Fix it by using the left ROTATE code. This necessitates adjusting the rotate amount during expand. Additionally, as the SVE2 XAR instructi
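
The snippet does not show how the rotate amount is adjusted. Presumably it relies on the usual identity that a right rotate by n equals a left rotate by width minus n, sketched here on 64-bit scalars. The function names are hypothetical, and XAR is modelled simply as an XOR followed by a rotate.

```c
#include <assert.h>
#include <stdint.h>

/* Rotates for 0 < n < 64.  */
uint64_t rotl64 (uint64_t x, unsigned n) { return (x << n) | (x >> (64 - n)); }
uint64_t rotr64 (uint64_t x, unsigned n) { return (x >> n) | (x << (64 - n)); }

/* XOR-and-rotate written with a right rotate ...  */
uint64_t xar_rotatert (uint64_t a, uint64_t b, unsigned n)
{
  return rotr64 (a ^ b, n);
}

/* ... and with a left rotate using the adjusted amount.  */
uint64_t xar_rotate (uint64_t a, uint64_t b, unsigned n)
{
  return rotl64 (a ^ b, 64 - n);
}

/* Counts disagreements over every amount from 1 to 63.  */
int count_xar_mismatches (void)
{
  const uint64_t a = 0x0123456789ABCDEFull, b = 0xF0F0F0F00F0F0F0Full;
  int bad = 0;

  for (unsigned n = 1; n < 64; n++)
    bad += xar_rotatert (a, b, n) != xar_rotate (a, b, n);
  return bad;
}

/* Hypothetical self-check wrapper.  */
void check_xar (void)
{
  assert (count_xar_mismatches () == 0);
}
```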

Re: [PATCH] Add COBOL to gcc (was: Add 'cobol' to Makefile.def)

2024-10-27 Thread James K. Lowden
On Wed, 23 Oct 2024 15:12:19 +0200 Richard Biener wrote: > The rest of the changes look OK to me. Below is a revised patch incorporating recent feedback. Changes: * remove blank lines at EOF * add gcc/cobol/lang.opt.urls * simplify gcc/cobol/config-lang.in (and FE requires C++) * add stu

[PATCH v2] Fix MV clones can not redirect to specific target on some targets

2024-10-27 Thread Yangyu Chen
Following the implementation of commit b8ce8129a5 ("Redirect call within specific target attribute among MV clones (PR ipa/82625)"), we can now optimize calls by invoking a versioned function callee from a caller that shares the same target attribute. However, on targets that define TARGET_HAS_FMV_

[PATCH] Fix MV clones can not redirect to specific target on some targets

2024-10-27 Thread Yangyu Chen
Following the implementation of commit b8ce8129a5 ("Redirect call within specific target attribute among MV clones (PR ipa/82625)"), we can now optimize calls by invoking a versioned function callee from a caller that shares the same target attribute. However, on targets that define TARGET_HAS_FMV_

[patch, Fortran] Introduce unsigned versions of MASKL and MASKR

2024-10-27 Thread Thomas Koenig
Hello world, MASKR and MASKL are obvious candidates for unsigned, too; in the previous version of the doc patch, I had promised that these would take unsigned arguments in the future. What I had in mind was they could take an unsigned argument and return an unsigned result. Thinking about this a

[Patch, fortran] [13-15 regressions] PR115070 & 115348

2024-10-27 Thread Paul Richard Thomas
Pushed as 'obvious' in commit r15-4702. This patch has been on my tree since July so I thought to get it out of the way before it died of bit-rot. Will backport in a week. Fortran: Fix regressions with intent(out) class[PR115070, PR115348]. 2024-10-27 Paul Thomas gcc/fortran PR fortran/115070

Re: [PATCH v2] testsuite: Sanitize pacbti test cases for Cortex-M

2024-10-27 Thread Torbjorn SVENSSON
On 2024-10-25 12:30, Richard Earnshaw (lists) wrote: On 14/10/2024 13:23, Christophe Lyon wrote: On 10/13/24 19:50, Torbjörn SVENSSON wrote: Ok for trunk and releases/gcc-14? Changes since v1: - Dropped changes to dg- instructions. These will be addressed in a separate set of patches la

Re: [pushed] doc, fortran: Add a missing menu item.

2024-10-27 Thread Iain Sandoe
> On 27 Oct 2024, at 08:08, Thomas Koenig wrote: > > Am 27.10.24 um 00:15 schrieb Iain Sandoe: >> Tested on x86_64-darwin21 and linux, with makeinfo 6.7 pushed to trunk, >> thanks > For the record, makeinfo 6.8 did not show this as an error. Hmm that’s maybe a regression in texinfo 6.8 then,

Re: counted_by attribute and type compatibility

2024-10-27 Thread Martin Uecker
Am Freitag, dem 25.10.2024 um 14:03 + schrieb Qing Zhao: > > > On Oct 25, 2024, at 08:13, Martin Uecker wrote: > > > > > > I agree, and error makes sense. What worries me a little bit > > > > is tying this to a semantic change in type compatibility. > > > > > > > > typedef struct foo { int

Re: [pushed] doc, fortran: Add a missing menu item.

2024-10-27 Thread Thomas Koenig
Am 27.10.24 um 00:15 schrieb Iain Sandoe: Tested on x86_64-darwin21 and linux, with makeinfo 6.7 pushed to trunk, thanks Thanks! For the record, makeinfo 6.8 did not show this as an error. Best regards Thomas

[PATCH v2] [aarch64] Fix function multiversioning dispatcher link error with LTO

2024-10-27 Thread Yangyu Chen
We forgot to apply DECL_EXTERNAL to __init_cpu_features_resolver decl. When building with LTO, the linker cannot find the __init_cpu_features_resolver.lto_priv* symbol, causing the link error. This patch fixes this by adding DECL_EXTERNAL to the decl. To avoid used but never defined warning fo