[x86_64 PATCH] Support read-modify-write memory operands in STV.

2024-08-31 Thread Roger Sayle
xmm0 vmovdqa %xmm0, m(%rip) ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-31 Roger Sayle gcc/ChangeLog * config/i386/i386-feature

[x86_64 PATCH] Update STV's gains for TImode arithmetic right shifts on AVX2.

2024-08-24 Thread Roger Sayle
ithout --target_board=unix{-m32} with no new failures. No new testcase (yet) as the code for both the vector and scalar forms of the above function are still suboptimal so code generation is in flux, but this improvement should be a step in the right direction. Ok for mainline? 2024-08-24 Roger Sayle

[x86_64 PATCH] Support wide immediate constants in STV.

2024-08-15 Thread Roger Sayle
instruction. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New

[x86 PATCH] Improve split of *extendv2di2_highpart_stv_noavx512vl.

2024-08-15 Thread Roger Sayle
which applies when not performing the above optimization, i.e. on TARGET_XOP. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-15 Roger Sayle Uros B

RE: [PATCH] Re-add calling emit_clobber in lower-subreg.cc's resolve_simple_move.

2024-08-13 Thread Roger Sayle
Hi Xianmiao, I have no objection to reverting that original patch, if it was indeed made obsolete by later changes to the i386 backend. The theory at the time was that it was possible for backends to define mov instructions that emitted clobbers if necessary, but it's very difficult for a backen

[x86 PATCH] PR target/116275: Handle STV of *extenddi2_doubleword_highpart

2024-08-11 Thread Roger Sayle
h and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-11 Roger Sayle gcc/ChangeLog PR target/116275 * config/i386/i386.md (*extendv2di2_highpart_stv_noavx512vl): New define_insn_and_split to handle the STV conversion of the DImode pa

[x86 PATCH] Tweak ix86_mode_can_transfer_bits to restore bootstrap on RHEL.

2024-08-08 Thread Roger Sayle
DFmode being "non-literal types in constant expressions". This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, with no new failures. Ok for mainline? 2024-08-08 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_mode_can_transfer_bit

RE: [x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-07 Thread Roger Sayle
e has been committed as obvious. Sorry again for the inconvenience. Tested on x86_64-pc-linux-gnu with RUNTESTFLAGS="dg.exp=sse2-pr85572-1.C". 2024-08-07 Roger Sayle gcc/testsuite/ChangeLog * g++.dg/other/sse2-pr85572-1.C: Update expected output after my recent patc

[x86_64 PATCH] Support memory destinations and wide immediate constants in STV.

2024-08-05 Thread Roger Sayle
-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-features.cc (timode_immed_const_gain): New function to determine the gain/cost on a CONST_

[x86_64 PATCH] Refactor V2DI arithmetic right shift expansion for STV.

2024-08-05 Thread Roger Sayle
ficial). This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-08-05 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_v2di_ashiftrt): New

[PATCH] PR tree-optimization/57371: Optimize (float)i == 16777222.0f sometimes.

2024-07-28 Thread Roger Sayle
make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? If the testcases need to be tweaked for non-IEEE targets (the transformations themselves should be portable to VAX and IBM floating point formats) hopefully that can be done as follow-up patches

[nvptx PATCH] Implement isfinite and isnormal optabs in nvptx.md.

2024-07-27 Thread Roger Sayle
/pipermail/gcc-patches/2024-July/657881.html [which I'm sad to see is taking a while to review/get approved]. Ok for mainline? 2024-07-27 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.md (UNSPEC_COPYSIGN): No longer required. (UNSPEC_ISFINITE): New UNSPEC. (

[match.pd PATCH] Fold ctz(-x) as ctz(x).

2024-07-23 Thread Roger Sayle
with no new failures. Ok for mainline? 2024-07-23 Roger Sayle gcc/ChangeLog * match.pd (ctz (-X) => ctz (X)): New simplification. gcc/testsuite/ChangeLog * gcc.dg/fold-ctz-1.c: New test case. Thanks in advance, Roger -- diff --git a/gcc/match.pd b/gcc/match.pd index 6818856..
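The identity behind this simplification can be checked with a small standalone sketch. This is not the match.pd pattern itself; the helper names ctz32, neg32 and check_ctz_neg are hypothetical, and negation is done in unsigned arithmetic so the sketch avoids signed-overflow undefined behaviour. Two's complement negation leaves the lowest set bit in place, so the trailing-zero count should be unchanged.

```c
#include <assert.h>
#include <stdint.h>

/* Count trailing zero bits of a nonzero 32-bit value (hypothetical helper). */
static unsigned ctz32(uint32_t x)
{
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Two's complement negation, performed on an unsigned type so that it
   wraps instead of overflowing (hypothetical helper). */
static uint32_t neg32(uint32_t x)
{
    return (uint32_t)(0u - x);
}

/* Check ctz(-x) == ctz(x) over a range of nonzero inputs. */
static void check_ctz_neg(void)
{
    for (uint32_t x = 1; x < 100000u; x++)
        assert(ctz32(neg32(x)) == ctz32(x));
    /* The most negative value negates to itself. */
    assert(ctz32(neg32(0x80000000u)) == 31);
}
```

Zero is excluded here because the trailing-zero count of zero is not well defined for this helper.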

[testsuite PATCH] Robustify lib/g++.exp

2024-07-22 Thread Roger Sayle
's no harm in (also) confirming that it exists in g++_include_flags. This patch has been tested on x86_64-pc-linux-gnu (where it allows a cross-compiler to arc-linux to produce g++ compilation results). Ok for mainline? 2024-07-22 Roger Sayle gcc/testsuite/ChangeLog * lib/g++.

[ARC PATCH] Improve performance of SImode right shifts (take #2)

2024-07-22 Thread Roger Sayle
opsys, is anyone able to test these changes? Thanks in advance. 2024-07-22 Roger Sayle gcc/ChangeLog * config/arc/arc-protos.h (output_rlc_loop): Prototype here. (arc_split_rlc): Prototype here. * config/arc/arc.cc (output_rlc_loop): Output a zero-overhead loop o

[PATCH] Implement a -ftrapping-math/-fsignaling-nans TODO in match.pd.

2024-07-17 Thread Roger Sayle
e? 2024-07-17 Roger Sayle gcc/ChangeLog * match.pd ((FTYPE) N CMP CST): Only worry about exceptions with flag_trapping_math, and about signaling NaNs with HONOR_SNANS. gcc/testsuite/ChangeLog * c-c++-common/pr57371-4.c: Update comment. * c-c++-common/pr57371-5

RE: [PATCH] Use foreach, not lmap, for tcl <= 8.5 compat

2024-07-16 Thread Roger Sayle
Hi Jørgen, Awesome. Very many thanks for the speedy fix. Roger -- > -Original Message- > From: Jørgen Kvalsvik > Sent: 14 July 2024 20:46 > To: gcc-patches@gcc.gnu.org > Cc: jeffreya...@gmail.com; ro...@nextmovesoftware.com; Jørgen Kvalsvik > > Subject: [PATCH] Use foreach, not lmap,

Re: [pushed] Add function filtering to gcov

2024-07-14 Thread Roger Sayle
I’m seeing (dejagnu) testsuite problems from this (recent) patch. Running /home/roger/GCC/patchem/gcc/testsuite/gcc.misc-tests/gcov.exp ... ERROR: (DejaGnu) proc "lmap key { snd } { if { $key in $seen } continue set key }" does not exist. The error code is NONE The info on th

[x86 PATCH] Tweak i386-expand.cc to restore bootstrap on RHEL.

2024-07-14 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures (from this change). Ok for mainline? 2024-07-14 Roger Sayle * config/i386/i386-expand.cc (ix86_expand_fp_absneg_operator): Use E_?Fmode enumeration constants in switch statement.

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition (take #2)

2024-07-14 Thread Roger Sayle
tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-14 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2) to

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements (take #2)

2024-07-11 Thread Roger Sayle
line? 2024-07-11 Roger Sayle Hongtao Liu gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_le

[ARC PATCH] Improve performance of SImode right shifts.

2024-07-11 Thread Roger Sayle
ns@16 cycles This patch has been minimally tested by building a cross-compiler to arc-linux hosted on x86_64-pc-linux-gnu where there are no new failures from "make -k check" in the compile-only tests. Ok for mainline (after 3rd-party testing)? 2024-07-11 Roger Sayle gcc/ChangeLog

[nvptx PATCH] Implement rtx_costs target hook for nvptx backend.

2024-07-11 Thread Roger Sayle
s 4.123190 seconds So about a 3.7x performance improvement. This patch has been tested with make and make -k check for nvptx-none hosted on x86_64-pc-linux-gnu with no new failures. Ok for mainline? 2024-07-11 Roger Sayle gcc/ChangeLog * config/nvptx/nvptx.cc (nvptx_rtx_size_costs): New f

[match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition.

2024-07-09 Thread Roger Sayle
t This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-09 Roger Sayle gcc/ChangeLog PR tree-optimization/114661 * match.pd ((X*C1)|(X*C2

[x86 SSE PATCH] Some AVX512 ternlog expansion refinements.

2024-07-07 Thread Roger Sayle
} with no new failures. Ok for mainline? 2024-07-07 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Use CONST_VECTOR_P instead of comparison against GET_CODE. (ix86_gen_bcst_mem): Likewise. (ix86_ternlog_leaf_p): Likewise

[x86 SSE PATCH] PR target/115751: Avoid force_reg in ix86_expand_ternlog.

2024-07-04 Thread Roger Sayle
This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-04 Roger Sayle gcc/ChangeLog PR target/115751 * config/i386/i386-expand.cc (ix86_expand_t

[x86 PATCH] Add additional variant of bswaphisi2_lowpart peephole2.

2024-07-01 Thread Roger Sayle
$8, %di jmp ext This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-07-01 Roger Sayle gcc/ChangeLog * config/i386/i386.md (bswaphisi2_lowpa

[x86 SSE PATCH] Remove legacy ternlog patterns from sse.md

2024-06-30 Thread Roger Sayle
hange. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-30 Roger Sayle gcc/ChangeLog * config/i386/sse.md (*vmov_constm1_pternlog_false_dep):

RE: [x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-30 Thread Roger Sayle
Hi Uros, > On Sat, Jun 29, 2024 at 6:21 PM Roger Sayle > wrote: > > A common idiom for implementing an integer division that rounds > > upwards is to write (x + y - 1) / y. Conveniently on x86, the two > > additions to form the numerator can be performed by a single
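The round-up idiom quoted above can be sketched in C as follows. This is only an illustration of the source-level idiom, not of the peephole2 or the lea it produces; the name div_round_up is hypothetical, and the sketch assumes y is nonzero and that x + y - 1 does not wrap.

```c
#include <assert.h>

/* Integer division rounding upwards, written as (x + y - 1) / y.
   Assumes y != 0 and that x + y - 1 does not wrap around. */
static unsigned div_round_up(unsigned x, unsigned y)
{
    return (x + y - 1) / y;
}

/* A few spot checks of the idiom. */
static void check_div_round_up(void)
{
    assert(div_round_up(7, 2) == 4);   /* 3.5 rounds up to 4 */
    assert(div_round_up(8, 2) == 4);   /* exact quotients are unchanged */
    assert(div_round_up(0, 5) == 0);
    assert(div_round_up(1, 5) == 1);
}
```

The two additions forming the numerator are what the quoted message says can be combined into a single instruction on x86.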

[testsuite PATCH] Fix -m32 gcc.target/i386/pr102464-vrndscaleph.c on RedHat.

2024-06-30 Thread Roger Sayle
ound is to define __NO_MATH_INLINES before #include (or alternatively use __builtin_floor, __builtin_ceil, etc.). This patch has been tested on x86_64-pc-linux-gnu with make -k check, with and without --target_board=unix{-m32}. Ok for mainline? 2024-06-30 Roger Sayle gcc/testsuite/ChangeLog

[x86 PATCH]: Additional peephole2 to use lea in round-up integer division.

2024-06-29 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-29 Roger Sayle gcc/ChangeLog * config/i386/i386.md (peephole2): Transform two consecutive additions into a 3-component lea if !TARGET

RE: nvptx vs. [PATCH] Add a late-combine pass [PR106594]

2024-06-27 Thread Roger Sayle
.@ventanamicro.com; rdapp@gmail.com; gcc-patches@gcc.gnu.org; > Tom de Vries ; Roger Sayle > Subject: Re: nvptx vs. [PATCH] Add a late-combine pass [PR106594] > > Hi! > > On 2024-06-27T22:27:21+0200, I wrote: > > On 2024-06-27T18:49:17+0200, I wrote: > >> On 2023-10-

[x86 PATCH] Handle sign_extend like zero_extend in *concatditi3_[346]

2024-06-27 Thread Roger Sayle
with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386.md (*concat3_3): Change zero_extend to any_extend in first operand to left shift by mode precision. (*concat3_4): Likewise.

[x86 SSE PATCH] Some additional ternlog refinements.

2024-06-27 Thread Roger Sayle
ently use decimal. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-27 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_ternlo

[ARC PATCH] Improved SImode conditional moves (improves DImode shifts).

2024-06-22 Thread Roger Sayle
sue is also described at https://github.com/foss-for-synopsys-dwc-arc-processors/gcc/issues/110 Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's and/or Jeff's testing? 20

[PATCH v2] PR tree-opt/113673: Avoid load merging when potentially trapping.

2024-06-21 Thread Roger Sayle
ke bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-21 Roger Sayle Richard Biener gcc/ChangeLog PR tree-optimization/113673 * gimple-ssa-store-merging.cc (find_bswap_or_nop_lo

[x86 PATCH] Allow all register_operand SUBREGs in x86_ternlog_idx.

2024-06-18 Thread Roger Sayle
ode V4SF. This patch allows the recently added ternlog_operand to accept this case. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-18 Roger Sayle gcc/C

[x86 PATCH] More use of m{32,64}bcst addressing modes with ternlog.

2024-06-12 Thread Roger Sayle
ret // 1 = 42 total This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-12 Roger Sayle gcc/ChangeLog * config/i386/i38

[x86 PATCH] PR target/115397: AVX512 ternlog vs. -m32 -fPIC constant pool.

2024-06-10 Thread Roger Sayle
x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-06-10 Roger Sayle gcc/ChangeLog PR target/115397 * config/i386/i386-expand.cc (ix86_expand_te

[analyzer PATCH] Restore bootstrap with g++ 4.8.

2024-06-07 Thread Roger Sayle
using "scl enable devtoolset-10") as host compilers. Ok for mainline? 2024-06-07 Roger Sayle gcc/analyzer/ChangeLog * constraint-manager.cc (equiv_class::make_dump_widget): Use std::move to return a std::unique_ptr. (bounded_ranges_constraint::make_dump_wi

[x86 PATCH] PR target/115351: RTX costs for *concatditi3 and *insvti_highpart.

2024-06-07 Thread Roger Sayle
e? 2024-06-07 Roger Sayle gcc/ChangeLog PR target/115351 * config/i386/i386.cc (ix86_rtx_costs): Provide estimates for the *concatditi3 and *insvti_highpart patterns, about two insns. gcc/testsuite/ChangeLog PR target/115351 * g++.target/i386/pr1153

[x86_64 PATCH] Correct insn_cost of movabsq.

2024-05-22 Thread Roger Sayle
e -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * config/i386/i386.cc (ix86_rtx_costs) : A CONST_INT that isn't x86_64_immediate_operand requires an extra (expensive) movabsq in

[PATCH] Avoid ICE in except.cc on targets that don't support exceptions.

2024-05-22 Thread Roger Sayle
This patch has been tested on nvptx-none hosted on x86_64-pc-linux-gnu with no new failures in the testsuite, and ~220 fewer FAILs. Ok for mainline? 2024-05-22 Roger Sayle gcc/ChangeLog * except.cc (output_function_exception_table): Move call to get_personality

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md (v2)

2024-05-17 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-17 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand

[x86 SSE] Improve handling of ternlog instructions in i386/sse.md

2024-05-12 Thread Roger Sayle
inux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-12 Roger Sayle gcc/ChangeLog PR target/115021 * config/i386/i386-expand.cc (ix86_expand_args_builtin): Call fixup_modeless_co

Re: [x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-10 Thread Roger Sayle
his weekend. Thanks again, Roger > From: Hongtao Liu > On Fri, May 10, 2024 at 6:26 AM Roger Sayle > wrote: > > > > > > The following one line patch improves the code generated for V8QI and > > V4QI shifts when AVX512BW and AVX512VL functionality is available.

[x86 PATCH] Improve V[48]QI shifts on AVX512

2024-05-09 Thread Roger Sayle
ch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-05-09 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_expand_vecop_qihi_partial): Don

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Thu, May 2, 2024 at 11:34 AM Roger Sayle > wrote: > > > > > > > From: Richard Biener On Fri, Apr 26, > > > 2024 at 10:19 AM Roger Sayle > > > wrote: > > > > > > > > This patch address

RE: [PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-05-02 Thread Roger Sayle
> From: Richard Biener > On Fri, Apr 26, 2024 at 10:19 AM Roger Sayle > wrote: > > > > This patch addresses PR middle-end/111701 where optimization of > > signbit(x*x) using tree_nonnegative_p incorrectly eliminates a > > floating point multiplication whe

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
> On Tue, Apr 30, 2024 at 10:23 AM Roger Sayle > wrote: > > Hi Richard, > > Thanks for looking into this. > > > > It’s not the call to size_binop_loc (for CEIL_DIV_EXPR) that's > > problematic, but the call to fold_convert_loc (loc, size_type_node,

RE: [C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-30 Thread Roger Sayle
which does more of a tree traversal checking error_operand_p within the unary and binary operators of an expression tree. Please let me know what you think/recommend. Best regards, Roger -- > -Original Message- > From: Richard Biener > Sent: 30 April 2024 08:38 > To: Roger Sayle >

[C PATCH] PR c/109618: ICE-after-error from error_mark_node.

2024-04-29 Thread Roger Sayle
ng away) a CEIL_DIV_EXPR in the common case that "char" is a single-byte. The current code relies on the middle-end's tree folding to recognize that CEIL_DIV_EXPR of integer_one_node is a no-op, that can be optimized away. Ok for mainline? 2024-04-30 Roger Sayle gcc/c-family/Chan

[PATCH] PR tree-opt/113673: Avoid load merging from potentially trapping additions.

2024-04-28 Thread Roger Sayle
updating the CFG is a part of the compiler that I'm less familiar with. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-28 Roger Sayle gcc/ChangeL

[PATCH] PR middle-end/111701: signbit(x*x) vs -fsignaling-nans

2024-04-26 Thread Roger Sayle
c-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-04-26 Roger Sayle gcc/ChangeLog PR middle-end/111701 * fold-const.cc (tree_binary_nonnegative_warnv_p) : Split handling of flo

[PATCH] PR target/114187: Fix ?Fmode SUBREG simplification in simplify_subreg.

2024-03-03 Thread Roger Sayle
added/modified potentially contributed to this lapse. Using lowpart_subreg should avoid/reduce confusion in future. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for ma

[x86_64 PATCH] PR target/113690: Fix-up MULT REG_EQUAL notes in STV.

2024-02-04 Thread Roger Sayle
64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-02-05 Roger Sayle gcc/ChangeLog PR target/113690 * config/i386/i386-features.cc (timode_convert_cst): New helper functi

[tree-ssa PATCH] PR target/113560: Enhance is_widening_mult_rhs_p.

2024-01-29 Thread Roger Sayle
ootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-01-30 Roger Sayle gcc/ChangeLog PR target/113560 * tree-ssa-math-opts.cc (is_widening_mult_rhs_p): Use range information via tree_non_zero_bits to check i

[libatomic PATCH] PR other/113336: Fix libatomic testsuite regressions on ARM.

2024-01-28 Thread Roger Sayle
This patch is a revised version of the fix for PR other/113336. This patch has been tested on arm-linux-gnueabihf with --with-arch=armv6 with make bootstrap and make -k check where it fixes all of the FAILs in libatomic. Ok for mainline? 2024-01-28 Roger Sayle Victor Do

[middle-end PATCH] Constant fold {-1,-1} << 1 in simplify-rtx.cc

2024-01-26 Thread Roger Sayle
n now checks that VEC_SELECT or some funky (future) rtx_code doesn't cause problems. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline (in stage 1)? 2024-01-26 Roger Sa

RE: [x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-25 Thread Roger Sayle
no new failures. Ok for mainline (in stage 1)? 2024-01-25 Roger Sayle Hongtao Liu gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcast_map_simode_t): New type for table below. (ix86_vec

RE: [middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-19 Thread Roger Sayle
-level might lead to a code quality regression, if RTL expansion doesn't know to lower it back to use PLUS on those targets with lea but without rotate. > From: Richard Biener > Sent: 19 January 2024 11:04 > On Thu, Jan 18, 2024 at 8:55 PM Roger Sayle > wrote: > > > > T

[middle-end PATCH] Prefer PLUS over IOR in RTL expansion of multi-word shifts/rotates.

2024-01-18 Thread Roger Sayle
add2 r1,r2,r1 j_s.d [blink] add2 r0,r3,r0 This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-18 Roger Sayle gcc/ChangeLog

[x86 PATCH] PR target/106060: Improved SSE vector constant materialization.

2024-01-16 Thread Roger Sayle
gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-16 Roger Sayle gcc/ChangeLog PR target/106060 * config/i386/i386-expand.cc (enum ix86_vec_bcast_alg): New. (struct ix86_vec_bcas

[PATCH] PR rtl-optimization/111267: Improved forward propagation.

2024-01-15 Thread Roger Sayle
%xmm2, %xmm1 setnb %al ret .L6: xorl %eax, %eax ret This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Additionally, it also resolves the FAIL for gcc.target/

[PATCH/RFC] Add --with-dwarf4 configure option.

2024-01-14 Thread Roger Sayle
. do the right thing. In fact, I'd originally misread the documentation and assumed --with-dwarf4 was already supported. 2024-01-14 Roger Sayle gcc/ChangeLog * configure.ac: Add a with --with dwarf4 option. * configure: Regenerate. * confi

RE: [libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-11 Thread Roger Sayle
ibatomic doesn't have a (multi-threaded) run-time test to search for race conditions, and confirm its implementations are correctly serializing. Please let me know what you think. Best regards, Roger -- > -Original Message- > From: Richard Earnshaw > Sent: 10 January 2024 15:34

[libatomic PATCH] Fix testsuite regressions on ARM [raspberry pi].

2024-01-08 Thread Roger Sayle
unresolved testcases]. If this looks like the correct fix, I'm not confident with rebuilding Makefile.in with the correct version of automake, so I'd very much appreciate it if someone/the reviewer/maintainer could please check this in for me. Thanks in advance. 2024-01-08 Roger Sayle

RE: [x86_64 PATCH] PR target/112992: Optimize mode for broadcast of constants.

2024-01-06 Thread Roger Sayle
pr102021.c: Likewise. * gcc.target/i386/pr90773-17.c: Likewise. Thanks in advance. Roger -- > -Original Message- > From: Hongtao Liu > Sent: 02 January 2024 05:40 > To: Roger Sayle > Cc: gcc-patches@gcc.gnu.org; Uros Bizjak > Subject: Re: [x86_64 PATCH] PR target/112992: Opti

[x86 PATCH] PR target/113231: Improved costs in Scalar-To-Vector (STV) pass.

2024-01-06 Thread Roger Sayle
6_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2024-01-06 Roger Sayle gcc/ChangeLog PR target/113231 * config/i386/i386-features.cc (compute_convert_gain): Include the ove

[middle-end PATCH take #2] Only call targetm.truly_noop_truncation for truncations.

2023-12-31 Thread Roger Sayle
h has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? Hopefully this revision tests cleanly on the linaro.org CI pipeline. 2023-12-31 Roger Sayle gcc/ChangeLog * combine

RE: [x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-31 Thread Roger Sayle
Hi Uros, > From: Uros Bizjak > Sent: 28 December 2023 10:33 > On Fri, Dec 22, 2023 at 11:14 AM Roger Sayle > wrote: > > > > This patch resolves the failure of pr43644-2.c in the testsuite, a > > code quality test I added back in July, that started failing as the

RE: [PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
Hi Jeff, Thanks for the speedy review. > On 12/28/23 07:59, Roger Sayle wrote: > > This patch fixes PR rtl-optimization/104914 by tweaking/improving the > > way that fields are written into a pseudo register that needs to be > > kept sign extended. > Well, I think "

[PATCH] MIPS: Implement TARGET_INSN_COSTS

2023-12-28 Thread Roger Sayle
The current (default) behavior is that when the target doesn't define TARGET_INSN_COST the middle-end uses the backend's TARGET_RTX_COSTS, so multiplications are slower than additions, but about the same size when optimizing for size (with -Os or -Oz). All of this gets disabled with your

[middle-end PATCH] Only call targetm.truly_noop_truncation for truncations.

2023-12-28 Thread Roger Sayle
ddle-end that rely on the default behaviour of silently returning true for any (invalid) input. These are fixed below. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainli

[PATCH] Improved RTL expansion of field assignments into promoted registers.

2023-12-28 Thread Roger Sayle
erate much better code for the above test case. Ok for mainline? 2023-12-28 Roger Sayle gcc/ChangeLog PR rtl-optimization/104914 * expr.cc (expand_assignment): When target is SUBREG_PROMOTED_VAR_P a sign or zero extension is only required if the modified f

RE: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-24 Thread Roger Sayle
> > > What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't > > > actually a truncation! The output precision is first, the input > > > precision is second. The docs explicitly state the output precision > > > should be smaller than the input precision (which makes sense for > > > trunc

RE: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-24 Thread Roger Sayle
> What's exceedingly weird is T_N_T_M_P (DImode, SImode) isn't actually a > truncation! The output precision is first, the input precision is second. > The docs > explicitly state the output precision should be smaller than the input > precision > (which makes sense for truncation). > > That

RE: Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle
> There's a PR in Bugzilla around this representational issue on MIPS, but I can't find > it straight away. Found it. It's PR rtl-optimization/104914, where we've already discussed this in comments #15 and #16. > -Original Message- > From: Roger Sayle

Re: [PATCH v3] EXPR: Emit an truncate if 31+ bits polluted for SImode

2023-12-23 Thread Roger Sayle
Hi YunQiang (and Jeff), > MIPS claims TRULY_NOOP_TRUNCATION_MODES_P (DImode, SImode)) == true > based on that the hard register is always sign-extended, but here > the hard register is polluted by zero_extract. I suspect that the bug here is that the MIPS backend shouldn't be returning true for

[ARC PATCH] Table-driven ashlsi implementation for better code/rtx_costs.

2023-12-23 Thread Roger Sayle
j_s [blink] Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's and/or Jeff's testing? [Thanks again to Jeff for finding the typo in my last ARC patch] 2023-12-23 Roger Sa

[x86_64 PATCH] PR target/112992: Optimize mode for broadcast of constants.

2023-12-22 Thread Roger Sayle
rap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-12-21 Roger Sayle gcc/ChangeLog PR target/112992 * config/i386/i386-expand.cc (ix86_convert_const_wide_int_to_broadcast)

[x86_PATCH] peephole2 to resolve failure of gcc.target/i386/pr43644-2.c

2023-12-22 Thread Roger Sayle
, %rdx ret which I believe is optimal. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-12-21 Roger Sayle gcc/ChangeLog PR target/43644

[x86 PATCH] Improved TImode (128-bit) integer constants on x86_64.

2023-12-18 Thread Roger Sayle
oard=unix{-m32}, and with/without -march=cascadelake with no new failures. Ok for mainline? 2023-12-18 Roger Sayle gcc/ChangeLog * config/i386/i386-expand.cc (ix86_convert_const_wide_int_to_broadcast): Remove static. (ix86_expand_move): Don't attempt t

[PING] PR112380: Defend against CLOBBERs in RTX expressions in combine.cc

2023-12-10 Thread Roger Sayle
to continue exploring alternate simplifications would also lead to better code generation, but I've not been able to find any examples on x86_64. This patch has been retested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no ne

RE: [ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-07 Thread Roger Sayle
ombine doesn't (normally) like turning two instructions into three. Fingers-crossed the attached patch works better on the nightly testers. Thanks in advance, Roger -- > -Original Message- > From: Jeff Law > Sent: 07 December 2023 14:47 > To: Roger Sayle ; gcc-patches@gcc.gn

[ARC PATCH] Add *extvsi_n_0 define_insn_and_split for PR 110717.

2023-12-05 Thread Roger Sayle
ons from make -k check. Ok for mainline if this passes Claudiu's nightly testing? 2023-12-05 Roger Sayle gcc/ChangeLog * config/arc/arc.md (*extvsi_n_0): New define_insn_and_split to implement SImode sign extract using an AND, XOR and MINUS sequence. gcc/testsuite/C
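For readers unfamiliar with the AND, XOR and MINUS sequence mentioned in this ChangeLog entry, here is a minimal C sketch of the well-known bit trick for sign-extending the low n bits of a 32-bit value. It is an assumption that this matches the shape of the split in the patch, since the RTL is not shown in the snippet; the name sign_extract_low is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low n bits of x (1 <= n <= 32) using AND, XOR and MINUS:
   mask off the field, flip its sign bit, then subtract the sign bit.
   The final conversion to int32_t relies on the usual modular behaviour
   of two's complement implementations. */
static int32_t sign_extract_low(uint32_t x, unsigned n)
{
    uint32_t mask = (n == 32) ? 0xffffffffu : ((1u << n) - 1u);
    uint32_t sign = 1u << (n - 1);
    uint32_t r = ((x & mask) ^ sign) - sign;
    return (int32_t)r;
}

/* Spot checks for an 8-bit field and the full-width case. */
static void check_sign_extract(void)
{
    assert(sign_extract_low(0xffu, 8) == -1);
    assert(sign_extract_low(0x7fu, 8) == 127);
    assert(sign_extract_low(0x01234580u, 8) == -128);
    assert(sign_extract_low(0xffffffffu, 32) == -1);
}
```

The appeal of the sequence is that it needs no shifts, which suits targets where shifts by a variable or large amount are comparatively expensive.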

[PATCH] Workaround array_slice constructor portability issues (with older g++).

2023-12-03 Thread Roger Sayle
draws attention to the problem and restores bootstrap whilst better approaches are investigated. For example, an ARRAY_SLICE(table) macro might be appropriate if there isn't an easy/portable template resolution solution. Thoughts? 2023-12-03 Roger Sayle gcc/c-family/ChangeLog

[RISC-V PATCH] Improve style to work around PR 60994 in host compiler.

2023-12-01 Thread Roger Sayle
-linux-gnu using g++ 4.8.5 as the host compiler. Ok for mainline? 2023-12-01 Roger Sayle gcc/ChangeLog * config/riscv/riscv-vsetvl.cc (csetvl_info::parse_insn): Rename local variable from demand_flags to dflags, to avoid conflicting with (enumeration) type of the same name.

[PATCH] PR112380: Defend against CLOBBERs in RTX expressions in combine.cc

2023-11-12 Thread Roger Sayle
ing through the fall-out sufficient for x86_64 to bootstrap and regression test without new failures. This patch has been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check, both with and without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-11-12

[x86 PATCH] Improve reg pressure of double-word right-shift then truncate.

2023-11-12 Thread Roger Sayle
-m32} with no new failures. Ok for mainline? 2023-11-12 Roger Sayle gcc/ChangeLog * config/i386/i386.md (3_doubleword_lowpart): New define_insn_and_split to optimize register usage of doubleword right shifts followed by truncation. Thanks in advance, Roger -- diff --

[ARC PATCH] Consistent use of whitespace in assembler templates.

2023-11-06 Thread Roger Sayle
scan-assembler needed to be updated to use \s+ instead of testing for a TAB or a space explicitly. Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's nightly testing? 2023-11-06 Roger Sa

[ARC PATCH] Improved DImode rotates and right shifts by one bit.

2023-11-06 Thread Roger Sayle
better. Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's nightly testing? 2023-11-06 Roger Sayle gcc/ChangeLog * config/arc/arc.md (UNSPEC_ARC_CC_NEZ): New UNSPEC that

[ARC PATCH] Provide a TARGET_FOLD_BUILTIN target hook.

2023-11-03 Thread Roger Sayle
nightly testing? 2023-11-03 Roger Sayle gcc/ChangeLog * config/arc/arc.cc (TARGET_FOLD_BUILTIN): Define to arc_fold_builtin. (arc_fold_builtin): New function. Convert ARC_BUILTIN_SWAP into a rotate. Evaluate ARC_BUILTIN_NORM and ARC_BUILTIN_NORMW of con

[AVR PATCH] Improvements to SImode and PSImode shifts by constants.

2023-11-02 Thread Roger Sayle
hout a simulator, where the compile-only tests in the gcc testsuite show no regressions. If someone could test this more thoroughly that would be great. 2023-11-02 Roger Sayle gcc/ChangeLog * config/avr/avr.cc (ashlqi3_out): Fix indentation whitespace. (ashlhi

[AVR PATCH] Optimize (X>>C)&1 for C in [1, 4, 8, 16, 24] in *insv.any_shift..

2023-11-02 Thread Roger Sayle
avr-elf hosted on x86_64, without a simulator, where the compile-only tests in the gcc testsuite show no regressions. If someone could test this more thoroughly that would be great. 2023-11-02 Roger Sayle gcc/ChangeLog * config/avr/avr.md (*insv.any_shift.): Optimize special

RE: [x86_64 PATCH] PR target/110551: Tweak mulx register allocation using peephole2.

2023-11-01 Thread Roger Sayle
Hi Uros, > From: Uros Bizjak > Sent: 01 November 2023 10:05 > Subject: Re: [x86_64 PATCH] PR target/110551: Tweak mulx register allocation > using peephole2. > > On Mon, Oct 30, 2023 at 6:27 PM Roger Sayle > wrote: > > > > > > This patch is a follow-u

[x86_64 PATCH] PR target/110551: Tweak mulx register allocation using peephole2.

2023-10-30 Thread Roger Sayle
nd without --target_board=unix{-m32} with no new failures. Ok for mainline? 2023-10-30 Roger Sayle gcc/ChangeLog PR target/110551 * config/i386/i386.md (*bmi2_umul3_1): Tidy condition as operands[2] with predicate register_operand must be !MEM_P. (peephole2): Optimi

RE: [ARC PATCH] Improve DImode left shift by a single bit.

2023-10-30 Thread Roger Sayle
Hi Jeff, > From: Jeff Law > Sent: 30 October 2023 15:09 > Subject: Re: [ARC PATCH] Improve DImode left shift by a single bit. > > On 10/28/23 07:05, Roger Sayle wrote: > > > > This patch improves the code generated for X << 1 (and for X + X) when > >

[ARC PATCH] Improved ARC rtx_costs/insn_cost for SHIFTs and ROTATEs.

2023-10-29 Thread Roger Sayle
only ~6 cycles, for the shorter shifts by 3 and sign extension. Tested with a cross-compiler to arc-linux hosted on x86_64, with no new (compile-only) regressions from make -k check. Ok for mainline if this passes Claudiu's nightly testing? 2023-10-29 Roger Sayle gcc/ChangeLog *

[ARC PATCH] Convert (signed<<31)>>31 to -(signed&1) without barrel shifter.

2023-10-28 Thread Roger Sayle
check. Ok for mainline if this passes Claudiu's nightly testing? 2023-10-28 Roger Sayle gcc/ChangeLog PR middle-end/101955 * config/arc/arc.md (*extvsi_1_0): New define_insn_and_split to convert sign extract of the least significant bit into an AND $1 t
