This patch transforms the following POW calls to equivalent LDEXP calls, as
discussed in PR57492:
powi (2.0, i) -> ldexp (1.0, i)
a * powi (2.0, i) -> ldexp (a, i)
2.0 * powi (2.0, i) -> ldexp (1.0, i + 1)
pow (powof2, i) -> ldexp (1.0, i * log2 (powof2))
powof2 * pow (2, i) -> ldexp (1.0, i +
This patch implements transformations for the following optimizations.
logN(x) CMP CST -> x CMP expN(CST)
expN(x) CMP CST -> x CMP logN(CST)
For example:
int
foo (float x)
{
  return __builtin_logf (x) < 0.0f;
}
can just be:
int
foo (float x)
{
  return x < 1.0f;
}
The patch was bootstrapped
On Linux/x86_64,
b281e13ecad12d07209924a7282c53be3a1c3774 is the first bad commit
commit b281e13ecad12d07209924a7282c53be3a1c3774
Author: Jonathan Wakely
Date: Tue Oct 8 21:15:18 2024 +0100
libstdc++: Add P1206R7 from_range members to std::vector [PR111055]
caused
FAIL: 23_containers/vec
This moves the check for maybe_undef_p in match_simplify_replacement
slightly earlier before figuring out the true/false arg using arg0/arg1
instead.
In most cases this makes no difference in compile time; only when
there is an undef in the args is there a slight compile-time
improvement
ABSU_EXPR lowering incorrectly used the resulting type
for the new expression but in the case of ABSU the resulting
type is an unsigned type and with ABSU is folded away. The fix
is to use a signed type for the expression instead.
Bootstrapped and tested on x86_64-linux-gnu.
PR middle-end
On 10/25/24 5:54 AM, Alexandre Oliva wrote:
Prepare for ifcombining noncontiguous blocks, adding (still unused)
logic to the ifcombine profile updater to handle such cases.
for gcc/ChangeLog
* tree-ssa-ifcombine.cc (known_succ_p): New.
(update_profile_after_ifcombine): Han
On 10/22/24 12:26 AM, KuanLin Chen wrote:
Originally, cc1 registers the RVV builtins with all vector sub-extensions
turned on, but lto does not. This makes lto use an out-of-sync
DECL_MD_FUNCTION_CODE from the LTO objects.
Example:
riscv64-unknown-elf-gcc -flto gcc/testsuite/gcc.target/riscv/rvv/base/bug
On 10/22/24 12:24 AM, KuanLin Chen wrote:
The GTY skip causes GGC to wrongly collect the registered functions in lto.
Example:
riscv64-unknown-elf-gcc -flto gcc/testsuite/gcc.target/riscv/rvv/base/bug-3.c
-O2 -march=rv64gcv
In file included from bug-3.c:2: internal compiler error: Segmentation fau
On 10/25/24 5:52 AM, Alexandre Oliva wrote:
Refactor ifcombine_ifandif, moving the common code from the various
paths that apply the combined condition to a new function.
for gcc/ChangeLog
* tree-ssa-ifcombine.cc (ifcombine_replace_cond): Factor out
of...
(ifcombin
If the parameter is not lvalue-convertible to bool then the current code
will fail to compile. The parameter should be forwarded to restore the
original value category.
libstdc++-v3/ChangeLog:
* include/bits/stl_bvector.h (emplace_back, emplace): Forward
parameter pack to preserve
On 10/18/24 8:22 AM, Robin Dapp wrote:
This patch adds else operands to masked loads. Currently the default
else operand predicate accepts "undefined" (i.e. SCRATCH) as well as
all-ones values.
Note that this series introduces a large number of new RVV FAILs for
riscv. All of them are due t
On 10/25/24 5:51 AM, Alexandre Oliva wrote:
In preparation for changes that may modify both inner and outer
conditions in ifcombine, drop the redundant parameter result_inv, that
is always identical to inner_inv.
for gcc/ChangeLog
* tree-ssa-ifcombine.cc (ifcombine_ifandif): Drop r
On 10/25/24 5:50 AM, Alexandre Oliva wrote:
Disallowing vuses in blocks for ifcombine is too strict, and it
prevents usefully moving fold_truth_andor into ifcombine. That
tree-level folder has long ifcombined loads, absent other relevant
side effects.
for gcc/ChangeLog
* tree-ssa
On 10/24/24 7:22 PM, Li Xu wrote:
From: xuli
form2:
T __attribute__((noinline)) \
sat_u_sub_imm##IMM##_##T##_fmt_2 (T x) \
{ \
  return x >= (T)IMM ? x - (T)IMM : 0; \
}
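As a rough illustration of what form2 computes, here is a minimal sketch assuming T = uint8_t and IMM = 10. The function names below are invented for this example and are not taken from the testsuite; the reference version simply widens to int and clamps at zero.

```c
#include <assert.h>   /* for assert in callers of these helpers */
#include <stdint.h>

/* Hypothetical instantiation of the form2 shape with T = uint8_t and
   IMM = 10.  The name is made up for this sketch.  */
static uint8_t
sat_u_sub_imm10_u8 (uint8_t x)
{
  return x >= (uint8_t) 10 ? x - (uint8_t) 10 : 0;
}

/* Reference: subtract in a wider signed type and clamp at zero.  */
static uint8_t
sat_u_sub_ref (uint8_t x, uint8_t imm)
{
  int wide = (int) x - (int) imm;
  return wide < 0 ? 0 : (uint8_t) wide;
}

/* Return 1 if both versions agree over the whole uint8_t range.  */
static int
sat_u_sub_forms_agree (void)
{
  for (unsigned v = 0; v <= UINT8_MAX; v++)
    if (sat_u_sub_imm10_u8 ((uint8_t) v) != sat_u_sub_ref ((uint8_t) v, 10))
      return 0;
  return 1;
}
```

Calling sat_u_sub_forms_agree () walks all 256 inputs, so it also covers the boundary case x == IMM, where the result is 0.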
Passed the rv64gcv regression test.
Signed-off-by: Li Xu
gcc/tests
On 10/24/24 12:24 AM, Kyrylo Tkachov wrote:
On 24 Oct 2024, at 07:36, Jeff Law wrote:
On 10/22/24 2:26 PM, Kyrylo Tkachov wrote:
Hi all,
With the recent patch to improve detection of vector rotates at the RTL level,
combine now tries matching a V8HImode rotate by 8 in the example in the
testcas
On Tue, Oct 22, 2024 at 7:31 PM Takayuki 'January June' Suwa
wrote:
>
> In commit bc5a9dab55d13f888a3cdd150c8cf5c2244f35e0 ("gcc: xtensa: reorder
> movsi_internal patterns for better code generation during LRA"), the
> instruction order in "movsi_internal" MD definition was changed to make LRA
>
On Linux/x86_64,
ed8ca972f8857869d2bb4a416994bb896eb1c34e is the first bad commit
commit ed8ca972f8857869d2bb4a416994bb896eb1c34e
Author: Paul Thomas
Date: Sun Oct 27 12:40:42 2024 +
Fortran: Fix regressions with intent(out) class[PR115070, PR115348].
caused
FAIL: gfortran.dg/pr11507
> On 25 Oct 2024, at 15:25, Richard Sandiford wrote:
>
> Kyrylo Tkachov writes:
>>> On 25 Oct 2024, at 13:46, Richard Sandiford
>>> wrote:
>>>
>>> Kyrylo Tkachov writes:
Thank you for the suggestions! I’m trying them out now.
>> + if (rotamnt % BITS_PER_UNIT != 0)
>> +
Hi all,
With the recent patch to improve detection of vector rotates at the RTL level,
combine now tries matching a V8HImode rotate by 8 in the example in the
testcase. We can teach AArch64 to emit a REV16 instruction for such a rotate
but really this operation corresponds to the RTL code BSWAP, for which
Hi all,
Some vector rotate operations can be implemented in a single instruction
rather than using the fallback SHL+USRA sequence.
In particular, when the rotate amount is half the bitwidth of the element
we can use a REV64,REV32,REV16 instruction.
More generally, rotates by a byte amount can be i
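The half-width case can be checked on scalars. The sketch below is only a per-element model, not AArch64 code: it assumes a 16-bit element rotated by 8 and compares that against a plain byte swap, which is what the message associates with REV16. All helper names are invented for the example.

```c
#include <assert.h>   /* for assert in callers of these helpers */
#include <stdint.h>

/* Rotate a 16-bit value left by n bits; requires 0 < n < 16.  */
static uint16_t
rotl16 (uint16_t x, unsigned n)
{
  return (uint16_t) (((unsigned) x << n) | ((unsigned) x >> (16 - n)));
}

/* Swap the two bytes of a 16-bit value.  */
static uint16_t
swap_bytes16 (uint16_t x)
{
  return (uint16_t) (((x & 0x00ffu) << 8) | ((x & 0xff00u) >> 8));
}

/* Return 1 if rotating by half the width equals a byte swap for every
   16-bit value.  */
static int
rot8_matches_byteswap (void)
{
  for (unsigned v = 0; v <= UINT16_MAX; v++)
    if (rotl16 ((uint16_t) v, 8) != swap_bytes16 ((uint16_t) v))
      return 0;
  return 1;
}
```

For example, rotl16 (0x1234, 8) and swap_bytes16 (0x1234) both give 0x3412.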
Hi all,
We can make use of the integrated rotate step of the XAR instruction
to implement most vector integer rotates, as long as we zero out one
of the input registers for it. This allows for a lower-latency sequence
than the fallback SHL+USRA, especially when we can hoist the zeroing operation
awa
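A scalar model may help show why zeroing one input is enough. The sketch assumes XAR behaves per 64-bit element as a rotate right of the XOR of its two inputs; the function names are hypothetical and the code says nothing about latency or register allocation.

```c
#include <assert.h>   /* for assert in callers of these helpers */
#include <stdint.h>

/* Plain rotate left of a 64-bit value; requires 0 < n < 64.  */
static uint64_t
rotl64 (uint64_t x, unsigned n)
{
  return (x << n) | (x >> (64 - n));
}

/* Assumed per-element model of XAR: XOR the inputs, then rotate right
   by imm; requires 0 < imm < 64.  */
static uint64_t
xar64_model (uint64_t a, uint64_t b, unsigned imm)
{
  uint64_t t = a ^ b;
  return (t >> imm) | (t << (64 - imm));
}

/* Rotate left by n using the XAR model with a zeroed second input.
   A left rotate by n is a right rotate by 64 - n.  */
static uint64_t
rotl64_via_xar (uint64_t x, unsigned n)
{
  return xar64_model (x, 0, 64 - n);
}
```

With the second input zero the XOR is a no-op, so only the rotate step remains.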
The ultimate goal in this PR is to match the XAR pattern that is represented
as a (ROTATE (XOR X Y) VCST) from the ACLE intrinsics code in the testcase.
The first blocker for this was the missing recognition of ROTATE in
simplify-rtx, which is fixed in the previous patch.
The next problem is that o
Hi all,
simplify-rtx can transform (X << C1) | (X >> C2) into ROTATE (X, C1) when
C1 + C2 == mode-width. But the transformation is also valid for PLUS and XOR.
Indeed GIMPLE can also do the fold. Let's teach RTL to do it too.
The motivating testcase for this is in AArch64 intrinsics:
uint64x2_
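The reason IOR, PLUS and XOR are interchangeable here is that when C1 + C2 equals the width, the two shifted halves occupy disjoint bit positions, so no carries or cancellations can occur. A minimal scalar sketch on uint32_t, with helper names invented for illustration:

```c
#include <assert.h>   /* for assert in callers of these helpers */
#include <stdint.h>

/* Rotate left by c1 written with IOR; requires 0 < c1 < 32.  */
static uint32_t
rotl32_ior (uint32_t x, unsigned c1)
{
  return (x << c1) | (x >> (32 - c1));
}

/* Same shifts combined with PLUS.  */
static uint32_t
rotl32_plus (uint32_t x, unsigned c1)
{
  return (x << c1) + (x >> (32 - c1));
}

/* Same shifts combined with XOR.  */
static uint32_t
rotl32_xor (uint32_t x, unsigned c1)
{
  return (x << c1) ^ (x >> (32 - c1));
}

/* Return 1 if all three forms agree for every valid shift amount.  */
static int
rotl32_forms_agree (uint32_t x)
{
  for (unsigned c1 = 1; c1 < 32; c1++)
    {
      uint32_t expected = rotl32_ior (x, c1);
      if (rotl32_plus (x, c1) != expected || rotl32_xor (x, c1) != expected)
        return 0;
    }
  return 1;
}
```

For instance, all three forms turn 0x12345678 with c1 = 8 into 0x34567812.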
Hi all,
The MD pattern for the XAR instruction in SVE2 is currently expressed with
non-canonical RTL by using a ROTATERT code with a constant rotate amount.
Fix it by using the left ROTATE code. This necessitates adjusting the rotate
amount during expand.
Additionally, as the SVE2 XAR instructi
On Wed, 23 Oct 2024 15:12:19 +0200
Richard Biener wrote:
> The rest of the changes look OK to me.
Below is a revised patch incorporating recent feedback. Changes:
* remove blank lines at EOF
* add gcc/cobol/lang.opt.urls
* simplify gcc/cobol/config-lang.in (and FE requires C++)
* add stu
Following the implementation of commit b8ce8129a5 ("Redirect call
within specific target attribute among MV clones (PR ipa/82625)"),
we can now optimize calls by invoking a versioned function callee
from a caller that shares the same target attribute. However, on
targets that define TARGET_HAS_FMV_
Following the implementation of commit b8ce8129a5 ("Redirect call
within specific target attribute among MV clones (PR ipa/82625)"),
we can now optimize calls by invoking a versioned function callee
from a caller that shares the same target attribute. However, on
targets that define TARGET_HAS_FMV_
Hello world,
MASKR and MASKL are obvious candidates for unsigned, too; in the
previous version of the doc patch, I had promised that these would
take unsigned arguments in the future. What I had in mind was
they could take an unsigned argument and return an unsigned result.
Thinking about this a
Pushed as 'obvious' in commit r15-4702. This patch has been on my tree
since July so I thought to get it out of the way before it died of bit-rot.
Will backport in a week.
Fortran: Fix regressions with intent(out) class[PR115070, PR115348].
2024-10-27 Paul Thomas
gcc/fortran
PR fortran/115070
On 2024-10-25 12:30, Richard Earnshaw (lists) wrote:
On 14/10/2024 13:23, Christophe Lyon wrote:
On 10/13/24 19:50, Torbjörn SVENSSON wrote:
Ok for trunk and releases/gcc-14?
Changes since v1:
- Dropped changes to dg- instructions. These will be addressed in a separate
set of patches la
> On 27 Oct 2024, at 08:08, Thomas Koenig wrote:
>
> On 27.10.24 at 00:15, Iain Sandoe wrote:
>> Tested on x86_64-darwin21 and linux, with makeinfo 6.7 pushed to trunk,
>> thanks
> For the record, makeinfo 6.8 did not show this as an error.
Hmm that’s maybe a regression in texinfo 6.8 then,
On Friday, 25.10.2024 at 14:03 +, Qing Zhao wrote:
>
> > On Oct 25, 2024, at 08:13, Martin Uecker wrote:
> >
> > > > I agree, and error makes sense. What worries me a little bit
> > > > is tying this to a semantic change in type compatibility.
> > > >
> > > > typedef struct foo { int
On 27.10.24 at 00:15, Iain Sandoe wrote:
Tested on x86_64-darwin21 and linux, with makeinfo 6.7 pushed to trunk,
thanks
Thanks!
For the record, makeinfo 6.8 did not show this as an error.
Best regards
Thomas
We forgot to apply DECL_EXTERNAL to __init_cpu_features_resolver decl. When
building with LTO, the linker cannot find the
__init_cpu_features_resolver.lto_priv* symbol, causing the link error.
This patch fixes this by adding DECL_EXTERNAL to the decl. To avoid used but
never defined warning fo