[committed] i386: Handle CONST_WIDE_INT in output_pic_addr_const [PR111340]

2023-09-11 Thread Uros Bizjak via Gcc-patches
PR target/111340 gcc/ChangeLog: * config/i386/i386.cc (output_pic_addr_const): Handle CONST_WIDE_INT. Call output_addr_const for CASE_CONST_SCALAR_INT. gcc/testsuite/ChangeLog: * gcc.target/i386/pr111340.c: New test. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32

Re: [PATCH 01/13] [APX EGPR] middle-end: Add insn argument to base_reg_class

2023-09-06 Thread Uros Bizjak via Gcc-patches
On Wed, Sep 6, 2023 at 9:43 PM Vladimir Makarov wrote: > > > On 9/1/23 05:07, Hongyu Wang wrote: > > Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at 18:16: > >> On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote: > >>> From: Kong Lingling > >>> >

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-04 Thread Uros Bizjak via Gcc-patches
On Mon, Sep 4, 2023 at 2:28 AM Hongtao Liu wrote: > > > > > > > I think there should be some constraint which explicitly has all > > > > > > > the 32 > > > > > > > GPRs, like there is one for just all 16 GPRs (h), so that > > > > > > > regardless of > > > > > > > -mapx-inline-asm-use-gpr32 one

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-01 Thread Uros Bizjak via Gcc-patches
On Fri, Sep 1, 2023 at 12:36 PM Hongtao Liu wrote: > > On Fri, Sep 1, 2023 at 5:38 PM Uros Bizjak via Gcc-patches > wrote: > > > > On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang wrote: > > > > > > Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at > > > 18:

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-09-01 Thread Uros Bizjak via Gcc-patches
On Fri, Sep 1, 2023 at 11:10 AM Hongyu Wang wrote: > > Uros Bizjak via Gcc-patches wrote on Thu, Aug 31, 2023 at 18:01: > > > > On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches > > wrote: > > > > > > On Thu, Aug 31, 2023 at 04:20:17PM +0800

Re: [PATCH 01/13] [APX EGPR] middle-end: Add insn argument to base_reg_class

2023-08-31 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote: > > From: Kong Lingling > > Current reload infrastructure does not support selective base_reg_class > for backend insn. Add insn argument to base_reg_class for > lra/reload usage. I don't think this is the correct approach. Ideally, a memory co

Re: [PATCH 09/13] [APX EGPR] Handle legacy insn that only support GPR16 (1/5)

2023-08-31 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 31, 2023 at 10:20 AM Hongyu Wang wrote: > > From: Kong Lingling > > These legacy insn in opcode map0/1 only support GPR16, > and do not have vex/evex counterpart, directly adjust constraints and > add gpr32 attr to patterns. > > insn list: > 1. xsave/xsave64, xrstor/xrstor64 > 2. xsav

Re: [PATCH 06/13] [APX EGPR] Map reg/mem constraints in inline asm to non-EGPR constraint.

2023-08-31 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 31, 2023 at 11:18 AM Jakub Jelinek via Gcc-patches wrote: > > On Thu, Aug 31, 2023 at 04:20:17PM +0800, Hongyu Wang via Gcc-patches wrote: > > From: Kong Lingling > > > > In inline asm, we do not know if the insn can use EGPR, so disable EGPR > > usage by default from mapping the comm

[PATCH] fortran: Rename TRUE/FALSE to true/false in *.cc files

2023-08-25 Thread Uros Bizjak via Gcc-patches
gcc/fortran/ChangeLog: * match.cc (gfc_match_equivalence): Rename TRUE/FALSE to true/false. * module.cc (check_access): Ditto. * primary.cc (match_real_constant): Ditto. * trans-array.cc (gfc_trans_allocate_array_storage): Ditto. (get_array_ctor_strlen): Ditto. * trans-comm

[committed] treewide: Rename TRUE/FALSE to true/false in *.cc files

2023-08-25 Thread Uros Bizjak via Gcc-patches
gcc/c-family/ChangeLog: * c-format.cc (read_any_format_width): Rename TRUE/FALSE to true/false. gcc/ChangeLog: * caller-save.cc (new_saved_hard_reg): Rename TRUE/FALSE to true/false. (setup_save_areas): Ditto. * gcc.cc (set_collect_gcc_options): Ditto. (driver::build_

[committed] i386: Optimize pinsrq of 0 with index 1 into movq [PR94866]

2023-08-24 Thread Uros Bizjak via Gcc-patches
Add new pattern involving vec_merge RTX that is produced by combine from the combination of sse4_1_pinsrq and *movdi_internal: 7: r86:DI=0 8: r85:V2DI=vec_merge(vec_duplicate(r86:DI),r87:V2DI,0x2) REG_DEAD r87:V2DI REG_DEAD r86:DI Successfully matched this instruction: (set (re

Re: [PATCH 6/12] i386: Enable _BitInt on x86-64 [PR102989]

2023-08-23 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 9, 2023 at 8:19 PM Jakub Jelinek wrote: > > Hi! > > The following patch enables _BitInt support on x86-64, the only > target which has _BitInt specified in psABI. > > 2023-08-09 Jakub Jelinek > > PR c/102989 > * config/i386/i386.cc (classify_argument): Handle BITINT_

[committed] i386: Fix register spill failure with concat RTX [PR111010]

2023-08-23 Thread Uros Bizjak via Gcc-patches
Disable (=&r,m,m) alternative for 32-bit targets. The combination of two memory operands (possibly with complex addressing mode), early clobbered output, frame pointer and PIC registers uses too many registers on a register constrained 32-bit target. Also merge two similar patterns using DWIH mode

[committed] i386: Micro-optimize ix86_expand_sse_extend

2023-08-20 Thread Uros Bizjak via Gcc-patches
Partial vector src is forced to a register as ops[1], we can use it instead of SRC in the call to ix86_expand_sse_cmp. This change avoids forcing operand[1] to a register in sign/zero-extend expanders. gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_sse_extend): Use ops[1] inste

[committed]: i386: Use PUNPCKL?? to implement vector extend and zero_extend for TARGET_SSE2 [PR111023]

2023-08-18 Thread Uros Bizjak via Gcc-patches
Implement vector extend and zero_extend functionality for TARGET_SSE2 using PUNPCKL?? family of instructions. The code for e.g. zero-extend from V2SI to V2DImode improves from: movd %xmm0, %edx pshufd $85, %xmm0, %xmm0 movd %xmm0, %eax movq %rdx, (%rdi)
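A minimal sketch of the zero-extend idea, written in portable C rather than SSE2 intrinsics: interleaving the low elements of a vector with zeros (what a PUNPCKL?? instruction does when its second operand is an all-zero register) produces the zero-extended elements on a little-endian lane layout. The helper names are illustrative assumptions and do not come from the patch.

```c
#include <stdint.h>

/* Model of punpckldq: interleave the two low 32-bit lanes of A and B
   into four 32-bit lanes {a0, b0, a1, b1}.  */
static void
interleave_low_32 (const uint32_t a[2], const uint32_t b[2], uint32_t out[4])
{
  out[0] = a[0];
  out[1] = b[0];
  out[2] = a[1];
  out[3] = b[1];
}

/* Zero-extend two 32-bit elements to two 64-bit elements by interleaving
   with zeros, then reading each lane pair as one little-endian 64-bit
   lane.  Hypothetical helper, for illustration only.  */
static void
zero_extend_v2si (const uint32_t src[2], uint64_t dst[2])
{
  const uint32_t zero[2] = { 0, 0 };
  uint32_t lanes[4];

  interleave_low_32 (src, zero, lanes);
  for (int i = 0; i < 2; i++)
    dst[i] = (uint64_t) lanes[2 * i] | ((uint64_t) lanes[2 * i + 1] << 32);
}
```

The point of the sketch is that no element has to leave the vector register file: the widening is a single interleave with a zero vector.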

Re: [PATCH] Generate vmovapd instead of vmovsd for moving DFmode between SSE_REGS.

2023-08-14 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 14, 2023 at 4:46 AM liuhongt via Gcc-patches wrote: > > vmovapd can enable register renaming and have same code size as > vmovsd. Similar for vmovsh vs vmovaps, vmovaps is 1 byte less than > vmovsh. > > When TARGET_AVX512VL is not available, still generate > vmovsd/vmovss/vmovsh to avo

Re: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-10 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 10, 2023 at 9:40 AM Richard Biener wrote: > > On Thu, Aug 10, 2023 at 3:13 AM liuhongt wrote: > > > > Currently we have 3 different independent tunes for gather > > "use_gather,use_gather_2parts,use_gather_4parts", > > similar for scatter, there're > > "use_scatter,use_scatter_2parts,

Re: [PATCH] Support -m[no-]gather -m[no-]scatter to enable/disable vectorization for all gather/scatter instructions.

2023-08-09 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 10, 2023 at 3:13 AM liuhongt wrote: > > Currently we have 3 different independent tunes for gather > "use_gather,use_gather_2parts,use_gather_4parts", > similar for scatter, there're > "use_scatter,use_scatter_2parts,use_scatter_4parts" > > The patch support 2 standardizing options to

Re: [PATCH] i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]

2023-08-09 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 10, 2023 at 2:49 AM liuhongt wrote: > > Also add ix86_partial_vec_fp_math to the condition of V2HF/V4HF named > patterns in order to avoid generation of partial vector V8HFmode > trapping instructions. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,} > Ok for trunk? > > gcc/

Re: [PATCH] i386: Clear upper bits of XMM register for V4HFmode/V2HFmode operations [PR110762]

2023-08-09 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 7, 2023 at 1:20 PM Richard Biener wrote: > > Please also note the RFC patch [1] that relaxes clears for V2SFmode > > with -fno-trapping-math. The patched compiler will then emit the same > > code as clang does for -O2. Which raises another question - should gcc > > default to -fno-tra

Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 9, 2023 at 8:38 AM Uros Bizjak wrote: > > On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao wrote: > > > > > > > > > -Original Message- > > > From: Uros Bizjak > > > Sent: Wednesday, August 9, 2023 2:33 PM > > > To: Liu, Hongtao > > > Cc: gcc-patches@gcc.gnu.org > > > Subject: Re:

Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 9, 2023 at 8:37 AM Liu, Hongtao wrote: > > > > > -Original Message- > > From: Uros Bizjak > > Sent: Wednesday, August 9, 2023 2:33 PM > > To: Liu, Hongtao > > Cc: gcc-patches@gcc.gnu.org > > Subject: Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy > > Bridge. > >

Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 9, 2023 at 3:48 AM liuhongt wrote: > > > Please rather do it in a more self-descriptive way, as proposed in the > > attached patch. You won't need a comment then. > > > > Adjusted in V2 patch. > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is > supported via EAX. > > Intel

Re: [PATCH V2] [X86] Workaround possible CPUID bug in Sandy Bridge.

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 9, 2023 at 3:48 AM liuhongt wrote: > > > Please rather do it in a more self-descriptive way, as proposed in the > > attached patch. You won't need a comment then. > > > > Adjusted in V2 patch. > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is > supported via EAX. > > Intel

[committed] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-08-08 Thread Uros Bizjak via Gcc-patches
Also introduce -m[no-]partial-vector-fp-math option to disable trapping V2SF named patterns in order to avoid generation of partial vector V4SFmode trapping instructions. The new option is enabled by default, because even with sanitization, a small but consistent speed up of 2 to 3% with Polyhedro

Re: [PATCH] [X86] Workaround possible CPUID bug in Sandy Bridge.

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 8, 2023 at 9:58 AM liuhongt wrote: > > Don't access leaf 7 subleaf 1 unless subleaf 0 says it is > supported via EAX. > > Intel documentation says invalid subleaves return 0. We had been > relying on that behavior instead of checking the max subleaf number. > > It appears that some Sand
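The guard described in the quoted patch can be sketched as follows. This is a hedged illustration, not the code from the patch: `struct cpuid_regs`, `cpuid_fn`, `read_leaf7_subleaf1` and `fake_buggy_query` are hypothetical names, and the CPUID query is passed in as a callback so the sketch stays portable. It relies only on the documented fact that CPUID leaf 7 subleaf 0 reports the highest supported subleaf number in EAX.

```c
#include <stdint.h>

/* Hypothetical container for the four registers CPUID returns.  */
struct cpuid_regs { uint32_t eax, ebx, ecx, edx; };

/* Hypothetical query callback: fills OUT for (leaf, subleaf).  */
typedef void (*cpuid_fn) (uint32_t leaf, uint32_t subleaf,
                          struct cpuid_regs *out);

/* Read leaf 7 subleaf 1 only when subleaf 0 reports (in EAX) that a
   subleaf >= 1 exists; otherwise report all-zero feature bits.  */
static struct cpuid_regs
read_leaf7_subleaf1 (cpuid_fn query)
{
  struct cpuid_regs sub0 = { 0, 0, 0, 0 }, sub1 = { 0, 0, 0, 0 };

  query (7, 0, &sub0);
  if (sub0.eax >= 1)
    query (7, 1, &sub1);
  return sub1;
}

/* Hypothetical stand-in for a CPU that reports no subleaf beyond 0
   but returns nonzero junk when subleaf 1 is queried anyway.  */
static void
fake_buggy_query (uint32_t leaf, uint32_t subleaf, struct cpuid_regs *out)
{
  out->eax = out->ebx = out->ecx = out->edx = 0;
  if (leaf == 7 && subleaf == 1)
    out->eax = out->edx = 0xffffffffu;
}
```

With the guard in place, the junk returned by the simulated CPU for the unsupported subleaf is never read, so no feature bits are misdetected.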

Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 8, 2023 at 12:08 PM Richard Biener wrote: > > > > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF > > > > > > named patterns in order to avoid generation of partial vector > > > > > > V4SFmode > > > > > > trapping instructions. > > > > > > > > > > > > The new

Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-08-08 Thread Uros Bizjak via Gcc-patches
On Tue, Aug 8, 2023 at 10:07 AM Richard Biener wrote: > > On Mon, 7 Aug 2023, Uros Bizjak wrote: > > > On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote: > > > > > > On Sun, 30 Jul 2023, Uros Bizjak wrote: > > > > > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF > >

Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-08-07 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote: > > On Sun, 30 Jul 2023, Uros Bizjak wrote: > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF > > named patterns in order to avoid generation of partial vector V4SFmode > > trapping instructions. > > > > The new option

Re: PR target/107671: Make more use of btl/btq on x86_64.

2023-08-07 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 7, 2023 at 9:37 AM Roger Sayle wrote: > > > This patch is a partial solution to PR target/107671, updating Uros' > patch from comment #4, to catch both bit set (setc) and bit not set > (setnc) cases from the code in comment #2, when compiled on x86_64. > Unfortunately, this is a partia

Re: [PATCH] i386: Clear upper bits of XMM register for V4HFmode/V2HFmode operations [PR110762]

2023-08-07 Thread Uros Bizjak via Gcc-patches
On Mon, Aug 7, 2023 at 10:57 AM liuhongt wrote: > > Similar like r14-2786-gade30fad6669e5, the patch is for V4HF/V2HFmode. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/110762 > * config/i386/mmx.md (3): Changed from

Re: [x86 PATCH] Split SUBREGs of SSE vector registers into vec_select insns.

2023-08-03 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 3, 2023 at 9:10 AM Roger Sayle wrote: > > > This patch is the final piece in the series to improve the ABI issues > affecting PR 88873. The previous patches tackled inserting DFmode > values into V2DFmode registers, by introducing insvti_{low,high}part > patterns. This patch improves

Re: [x86 PATCH] PR target/110792: Early clobber issues with rot32di2_doubleword.

2023-08-02 Thread Uros Bizjak via Gcc-patches
On Thu, Aug 3, 2023 at 12:18 AM Roger Sayle wrote: > > > This patch is a conservative fix for PR target/110792, a wrong-code > regression affecting doubleword rotations by BITS_PER_WORD, which > effectively swaps the highpart and lowpart words, when the source to be > rotated resides in memory. Th
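To illustrate the statement that a doubleword rotation by BITS_PER_WORD swaps the highpart and lowpart words, here is a small C sketch. It assumes a 32-bit word, so the double-word is 64 bits wide; the function names are illustrative and are not from the patch.

```c
#include <stdint.h>

/* Rotate a 64-bit double-word left by 32, i.e. by the width of one
   32-bit word.  */
static uint64_t
rotate_by_word (uint64_t x)
{
  return (x << 32) | (x >> 32);
}

/* Swap the high and low 32-bit words explicitly.  */
static uint64_t
swap_words (uint64_t x)
{
  uint32_t lo = (uint32_t) x;
  uint32_t hi = (uint32_t) (x >> 32);

  return ((uint64_t) lo << 32) | hi;
}
```

Both functions compute the same value for every input, which is why the rotation can be implemented as two word moves with the destinations exchanged.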

Re: [PATCH] Optimize vlddqu + inserti128 to vbroadcasti128

2023-08-01 Thread Uros Bizjak via Gcc-patches
On Wed, Aug 2, 2023 at 3:33 AM liuhongt wrote: > > In [1], I propose a patch to generate vmovdqu for all vlddqu intrinsics > after AVX2, it's rejected as > > The instruction is reachable only as __builtin_ia32_lddqu* (aka > > _mm_lddqu_si*), so it was chosen by the programmer for a reason. I > > t

Re: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-07-31 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 31, 2023 at 11:40 AM Richard Biener wrote: > > On Sun, 30 Jul 2023, Uros Bizjak wrote: > > > Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF > > named patterns in order to avoid generation of partial vector V4SFmode > > trapping instructions. > > > > The new option

[RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]

2023-07-30 Thread Uros Bizjak via Gcc-patches
Also introduce -m[no-]mmxfp-with-sse option to disable trapping V2SF named patterns in order to avoid generation of partial vector V4SFmode trapping instructions. The new option is enabled by default, because even with sanitization, a small but consistent speed up of 2 to 3% with Polyhedron capaci

[committed] testsuite: Fix gfortran.dg/ieee/comparisons_3.F90 testsuite failures

2023-07-26 Thread Uros Bizjak via Gcc-patches
The testcase should use dg-additional-options instead of dg-options to not overwrite default compile flags that include path for finding the IEEE modules. gcc/testsuite/ChangeLog: * gfortran.dg/ieee/comparisons_3.F90: Use dg-additional-options instead of dg-options. Tested on x86_64-linu

[committed] i386: Clear upper half of XMM register for V2SFmode operations [PR110762]

2023-07-26 Thread Uros Bizjak via Gcc-patches
Clear the upper half of a V4SFmode operand register in front of all potentially trapping instructions. The testcase: --cut here-- typedef float v2sf __attribute__((vector_size(8))); typedef float v4sf __attribute__((vector_size(16))); v2sf test(v4sf x, v4sf y) { v2sf x2, y2; x2 = __builtin_s

Re: [x86 PATCH] Don't use insvti_{high, low}part with -O0 (for compile-time).

2023-07-22 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 22, 2023 at 4:17 PM Roger Sayle wrote: > > > This patch attempts to help with PR rtl-optimization/110587, a regression > of -O0 compile time for the pathological pr28071.c. My recent patch helps > a bit, but hasn't returned -O0 compile-time to where it was before my > ix86_expand_move

Re: [x86 PATCH] Use QImode for offsets in zero_extract/sign_extract in i386.md

2023-07-22 Thread Uros Bizjak via Gcc-patches
On Sat, Jul 22, 2023 at 5:37 PM Roger Sayle wrote: > > > As suggested by Uros, this patch changes the ZERO_EXTRACTs and SIGN_EXTRACTs > in i386.md to consistently use QImode for bit offsets (i.e. third and fourth > operands), matching the use of QImode for bit counts in shifts and rotates. > > The

[committed] i386: Double-word sign-extension missed-optimization [PR110717]

2023-07-20 Thread Uros Bizjak via Gcc-patches
When sign-extending the value in a double-word register pair using shift and ashiftrt sequence with the same count immediate value less than word width, there is no need to shift the lower word of the value. The sign-extension could be limited to the upper word, but we uselessly shift the lower wor
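The observation can be checked with a short C sketch. It models a double-word as a 64-bit value made of two 32-bit words and assumes, as GCC implements it, that right-shifting a negative signed value is an arithmetic shift and that out-of-range unsigned-to-signed conversion wraps. The function names are illustrative and do not appear in the patch.

```c
#include <stdint.h>

/* Reference: shift the whole 64-bit double-word left, then arithmetic
   right, by the same count N (0 < N < 32).  */
static int64_t
sext_full (int64_t x, unsigned n)
{
  return (int64_t) ((uint64_t) x << n) >> n;
}

/* Same result while shifting only the upper 32-bit word; the lower
   word is passed through unchanged.  */
static int64_t
sext_high_only (int64_t x, unsigned n)
{
  uint32_t lo = (uint32_t) x;
  int32_t hi = (int32_t) ((uint64_t) x >> 32);

  hi = (int32_t) ((uint32_t) hi << n) >> n;
  return (int64_t) (((uint64_t) (uint32_t) hi << 32) | lo);
}
```

Because the count is smaller than the word width, every bit of the lower word returns to its original position after the two shifts, so only the upper word needs the shift pair.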

Re: [PATCH] Optimize vlddqu to vmovdqu for TARGET_AVX

2023-07-20 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 20, 2023 at 9:35 AM liuhongt wrote: > > For Intel processors, after TARGET_AVX, vmovdqu is optimized as fast > as vlddqu, UNSPEC_LDDQU can be removed to enable more optimizations. > Can someone confirm this with AMD folks? > If AMD doesn't like such optimization, I'll put my optimizati

Re: [x86_64 PATCH] More TImode parameter passing improvements.

2023-07-20 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 20, 2023 at 9:44 AM Roger Sayle wrote: > > > Hi Uros, > > > From: Uros Bizjak > > Sent: 20 July 2023 07:50 > > > > On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle > > wrote: > > > > > > This patch is the next piece of a solution to the x86_64 ABI issues in > > > PR 88873. This splits t

Re: [x86_64 PATCH] More TImode parameter passing improvements.

2023-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 19, 2023 at 10:07 PM Roger Sayle wrote: > > > This patch is the next piece of a solution to the x86_64 ABI issues in > PR 88873. This splits the *concat3_3 define_insn_and_split > into two patterns, a TARGET_64BIT *concatditi3_3 and a !TARGET_64BIT > *concatsidi3_3. This allows us to

Re: [GCC 13 PATCH] PR target/109973: CCZmode and CCCmode variants of [v]ptest.

2023-07-19 Thread Uros Bizjak via Gcc-patches
On Wed, Jul 19, 2023 at 2:21 PM Richard Biener wrote: > > On Sun, Jun 11, 2023 at 12:55 AM Roger Sayle > wrote: > > > > > > This is a backport of the fixes for PR target/109973 and PR target/110083. > > > > This backport to the releases/gcc-13 branch has been tested on > > x86_64-pc-linux-gnu wi

[committed] dwarf2: Change return type of predicate functions from int to bool

2023-07-18 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * dwarf2asm.cc: Change FALSE to false. * dwarf2cfi.cc (execute_dwarf2_frame): Change return type to void. * dwarf2out.cc (matches_main_base): Change return type from int to bool. Change "l

[committed] combine: Change return type of predicate functions from int to bool

2023-07-17 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * combine.cc (struct reg_stat_type): Change last_set_invalid to bool. (cant_combine_insn_p): Change return type from int to bool and adjust function body accordingly. (can_combine_p): Ditto

Re: [PATCH 1/2] [i386] Support type _Float16/__bf16 independent of SSE2.

2023-07-17 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 17, 2023 at 10:28 AM Hongtao Liu wrote: > > I'd like to ping for this patch (only patch 1/2, for patch 2/2, I > think that may not be necessary). > > On Mon, May 15, 2023 at 9:20 AM Hongtao Liu wrote: > > > > ping. > > > > On Fri, Apr 21, 2023 at 9:55 PM liuhongt wrote: > > > > > > >

Re: [PATCH] Add peephole to eliminate redundant comparison after cmpccxadd.

2023-07-17 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 17, 2023 at 8:44 AM Hongtao Liu wrote: > > Ping. > > On Tue, Jul 11, 2023 at 5:16 PM liuhongt via Gcc-patches > wrote: > > > > Similar like we did for CMPXCHG, but extended to all > > ix86_comparison_int_operator since CMPCCXADD set EFLAGS exactly same > > as CMP. > > > > When operand

Re: [PATCH] x86: replace "extendhfdf2" expander

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 11:44 AM Jan Beulich wrote: > > The corresponding insn serves this purpose quite fine, and leads to > slightly less (generated) code. All we need is the insn to not have a > leading * in its name, while retaining that * for "extendhfsf2". > Introduce a mode attribute in exc

Re: [x86 PATCH] PR target/110588: Add *bt_setncqi_2 to generate btl

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 11:27 AM Roger Sayle wrote: > > > > From: Uros Bizjak > > Sent: 13 July 2023 19:21 > > > > On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle > > wrote: > > > > > > This patch resolves PR target/110588 to catch another case in combine > > > where the i386 backend should be gener

Re: [PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 10:53 AM Richard Biener wrote: > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote: > > > > > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > > > > > cprop1 pass does not consider paradoxical subreg and for (insn 22) > >

Re: [PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 10:31 AM Richard Biener wrote: > > On Fri, 14 Jul 2023, Uros Bizjak wrote: > > > cprop1 pass does not consider paradoxical subreg and for (insn 22) claims > > that it equals 8 elements of HImode by setting REG_EQUAL note: > > > > (insn 21 19 22 4 (set (reg:V4QI 98) > >

Re: [x86_64 PATCH] Improved insv of DImode/DFmode {high,low}parts into TImode.

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 13, 2023 at 6:45 PM Roger Sayle wrote: > > > This is the next piece towards a fix for (the x86_64 ABI issues affecting) > PR 88873. This patch generalizes the recent tweak to ix86_expand_move > for setting the highpart of a TImode reg from a DImode source using > *insvti_highpart_1, t

Re: [PATCH] i386: Auto vectorize usdot_prod, udot_prod with AVXVNNIINT16 instruction.

2023-07-14 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 14, 2023 at 8:24 AM Haochen Jiang wrote: > > Hi all, > > This patch aims to auto vectorize usdot_prod and udot_prod with newly > introduced AVX-VNNI-INT16. > > Also I refined the redundant mode iterator in the patch. > > Regtested on x86_64-pc-linux-gnu. Ok for trunk after AVX-VNNI-INT

[PATCH] cprop: Do not set REG_EQUAL note when simplifying paradoxical subreg [PR110206]

2023-07-13 Thread Uros Bizjak via Gcc-patches
cprop1 pass does not consider paradoxical subreg and for (insn 22) claims that it equals 8 elements of HImode by setting REG_EQUAL note: (insn 21 19 22 4 (set (reg:V4QI 98) (mem/u/c:V4QI (symbol_ref/u:DI ("*.LC1") [flags 0x2]) [0 S4 A32])) "pr110206.c":12:42 1530 {*movv4qi_internal} (

Re: [x86 PATCH] PR target/110588: Add *bt_setncqi_2 to generate btl

2023-07-13 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 13, 2023 at 7:10 PM Roger Sayle wrote: > > > This patch resolves PR target/110588 to catch another case in combine > where the i386 backend should be generating a btl instruction. This adds > another define_insn_and_split to recognize the RTL representation for this > case. > > I also

[committed] alpha: Fix computation mode in alpha_emit_set_long_const [PR106966]

2023-07-13 Thread Uros Bizjak via Gcc-patches
PR target/106966 gcc/ChangeLog: * config/alpha/alpha.cc (alpha_emit_set_long_const): Always use DImode when constructing long const. gcc/testsuite/ChangeLog: * gcc.target/alpha/pr106966.c: New test. Bootstrapped and regression tested by Matthias on alpha-linux-gnu. Uros. diff

[committed] IRA+LRA: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * ira.cc (equiv_init_varies_p): Change return type from int to bool and adjust function body accordingly. (equiv_init_movable_p): Ditto. (memref_used_between_p): Ditto. * lra-constraints.cc (valid_address_p): Ditto. Bootstrapped and regression tested on x86_64-l

[committed] ifcvt: Change return type of predicate functions from int to bool

2023-07-12 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * ifcvt.cc (cond_exec_changed_p): Change variable to bool. (last_active_insn): Change "skip_use_p" function argument to bool. (noce_operand_ok): Change return type from int to bool. (find_c

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Uros Bizjak via Gcc-patches
>> On Mon, Jul 10, 2023 at 11:47 AM Richard Biener > > >> wrote: > > >> > > > >> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: > > >> > > > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-12 Thread Uros Bizjak via Gcc-patches
>> > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: > >> > > > >> > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener > >> > > wrote: > >> > > > > >> > > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak vi

Re: [x86 PATCH] Fix FAIL of gcc.target/i386/pr91681-1.c

2023-07-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 11, 2023 at 10:07 PM Roger Sayle wrote: > > > The recent change in TImode parameter passing on x86_64 results in the > FAIL of pr91681-1.c. The issue is that with the extra flexibility, > the combine pass is now spoilt for choice between using either the > *add3_doubleword_concat or t

Re: [x86 PATCH] PR target/110598: Fix rega = 0; rega ^= rega regression.

2023-07-12 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 11, 2023 at 9:07 PM Roger Sayle wrote: > > > This patch fixes the regression PR target/110598 caused by my recent > addition of a peephole2. The intention of that optimization was to > simplify zeroing a register, followed by an IOR, XOR or PLUS operation > on it into a move, or as de

[committed] cfg+gcse: Change return type of predicate functions from int to bool

2023-07-11 Thread Uros Bizjak via Gcc-patches
Also change some internal variables from int to bool. gcc/ChangeLog: * cfghooks.cc (verify_flow_info): Change "err" variable to bool. * cfghooks.h (struct cfg_hooks): Change return type of verify_flow_info from integer to bool. * cfgrtl.cc (can_delete_note_p): Change return type f

[committed] reorg: Change return type of predicate functions from int to bool

2023-07-10 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * reorg.cc (stop_search_p): Change return type from int to bool and adjust function body accordingly. (resource_conflicts_p): Ditto. (insn_references_resource_p): Change return type from in

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-10 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 10, 2023 at 11:47 AM Richard Biener wrote: > > On Mon, Jul 10, 2023 at 11:26 AM Uros Bizjak wrote: > > > > On Mon, Jul 10, 2023 at 11:17 AM Richard Biener > > wrote: > > > > > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches >

Re: [PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-10 Thread Uros Bizjak via Gcc-patches
On Mon, Jul 10, 2023 at 11:17 AM Richard Biener wrote: > > On Sun, Jul 9, 2023 at 10:53 AM Uros Bizjak via Gcc-patches > wrote: > > > > As shown in the PR, simplify_gen_subreg call in simplify_replace_fn_rtx: > > > > (gdb) list > > 469 if (code =

Re: [X86 PATCH] Add new insvti_lowpart_1 and insvdi_lowpart_1 patterns.

2023-07-09 Thread Uros Bizjak via Gcc-patches
On Sun, Jul 9, 2023 at 11:30 PM Roger Sayle wrote: > > > This patch implements another of Uros' suggestions, to investigate a > insvti_lowpart_1 pattern to improve TImode parameter passing on x86_64. > In PR 88873, the RTL the middle-end expands for passing V2DF in TImode > is subtly different fro

Re: [x86 PATCH] Add AVX512 support for STV of SI/DImode rotation by constant.

2023-07-09 Thread Uros Bizjak via Gcc-patches
On Sun, Jul 9, 2023 at 10:35 PM Roger Sayle wrote: > > > Following Uros' suggestion, this patch adds support for AVX512VL's > vpro[lr][dq] instructions to the recently added scalar-to-vector (STV) > enhancements to handle DImode and SImode rotations by a constant. > > For the test cases: > > unsig

[PATCH] simplify-rtx: Fix invalid simplification with paradoxical subregs [PR110206]

2023-07-09 Thread Uros Bizjak via Gcc-patches
As shown in the PR, simplify_gen_subreg call in simplify_replace_fn_rtx: (gdb) list 469 if (code == SUBREG) 470 { 471 op0 = simplify_replace_fn_rtx (SUBREG_REG (x), old_rtx, fn, data); 472 if (op0 == SUBREG_REG (x)) 473 return x; 47

[committed] cprop: Change return type of predicate functions from int to bool

2023-07-08 Thread Uros Bizjak via Gcc-patches
Also change some internal variables from int to bool. gcc/ChangeLog: * cprop.cc (reg_available_p): Change return type from int to bool. (reg_not_set_p): Ditto. (try_replace_reg): Ditto. Change "success" variable to bool. (cprop_jump): Change return type from int to void and a

[committed] gcse: Change return type of predicate functions from int to bool

2023-07-08 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function arguments from int to bool. gcc/ChangeLog: * gcse.cc (expr_equiv_p): Change return type from int to bool. (oprs_unchanged_p): Change return type from int to void and adjust function body accordingly. (oprs_anticipatable_p): Ditto.

Re: [PATCH V2] [x86] Add pre_reload splitter to detect fp min/max pattern.

2023-07-06 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 7, 2023 at 7:31 AM liuhongt wrote: > > > Please split the above pattern into two, one emitting UNSPEC_IEEE_MAX > > and the other emitting UNSPEC_IEEE_MIN. > Splitted. > > > The test involves blendv instruction, which is SSE4.1, so it is > > pointless to test it without -msse4.1. Please

Re: [x86_64 PATCH] Improve __int128 argument passing (in ix86_expand_move).

2023-07-06 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 3:48 PM Roger Sayle wrote: > > > On Thu, Jul 6, 2023 at 2:04 PM Roger Sayle > > wrote: > > > > > > > > > Passing 128-bit integer (TImode) parameters on x86_64 can sometimes > > > result in surprising code. Consider the example below (from PR 43644): > > > > > > __uint128 f

Re: [x86_64 PATCH] Improve __int128 argument passing (in ix86_expand_move).

2023-07-06 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 2:04 PM Roger Sayle wrote: > > > Passing 128-bit integer (TImode) parameters on x86_64 can sometimes > result in surprising code. Consider the example below (from PR 43644): > > __uint128 foo(__uint128 x, unsigned long long y) { > return x+y; > } > > which currently resul

Re: [PATCH] i386: Update document for inlining rules

2023-07-06 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 8:39 AM Hongyu Wang wrote: > > Hi, > > This is a follow-up patch for > https://gcc.gnu.org/pipermail/gcc-patches/2023-July/623525.html > that updates document about x86 inlining rules. > > Ok for trunk? > > gcc/ChangeLog: > > * doc/extend.texi: Move x86 inlining rule

Re: [PATCH 1/2] [x86] Add pre_reload splitter to detect fp min/max pattern.

2023-07-05 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 3:20 AM liuhongt wrote: > > We have ix86_expand_sse_fp_minmax to detect min/max semantics, but > it requires rtx_equal_p for cmp_op0/cmp_op1 and if_true/if_false, for > the testcase in the PR, there's an extra move from cmp_op0 to if_true, > and it failed ix86_expand_sse_fp_m

Re: [PATCH 2/2] Adjust rtx_cost for DF/SFmode AND/IOR/XOR/ANDN operations.

2023-07-05 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 3:20 AM liuhongt wrote: > > They should have same cost as vector mode since both generate > pand/pandn/pxor/por instruction. > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > * config/i386/i386.cc (ix86_rtx_costs): Ad

Re: [PATCH] Disparage slightly for the alternative which move DFmode between SSE_REGS and GENERAL_REGS.

2023-07-05 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 6, 2023 at 3:14 AM liuhongt wrote: > > For testcase > > void __cond_swap(double* __x, double* __y) { > bool __r = (*__x < *__y); > auto __tmp = __r ? *__x : *__y; > *__y = __r ? *__y : *__x; > *__x = __tmp; > } > > GCC-14 with -O2 and -march=x86-64 options generates the followi

[committed] sched: Change return type of predicate functions from int to bool

2023-07-05 Thread Uros Bizjak via Gcc-patches
Also change some internal variables to bool. gcc/ChangeLog: * sched-int.h (struct haifa_sched_info): Change can_schedule_ready_p, schedule_more_p and contributes_to_priority indirect function type from int to bool. (no_real_insns_p): Change return type from int to bool. (cont

Re: [PATCH V2] i386: Inline function with default arch/tune to caller

2023-07-04 Thread Uros Bizjak via Gcc-patches
le description to the new subsubsection? > > > Looking at the above, perhaps inlining of different arches can also be > > forced with always_inline? This would allow developers some control of > > inlining, and would not be surprising. > > If so, I'd like to add the a

Re: [PATCH V2] i386: Inline function with default arch/tune to caller

2023-07-03 Thread Uros Bizjak via Gcc-patches
On Tue, Jul 4, 2023 at 5:12 AM Hongyu Wang wrote: > > Hi, > > For function with different target attributes, current logic rejects to > inline the callee when any arch or tune is mismatched. Relax the > condition to allow callee with default arch/tune to be inlined. > > Bootstrapped/regtested on x8

[committed] tree+ggc: Change return type of predicate functions from int to bool

2023-07-03 Thread Uros Bizjak via Gcc-patches
Also change internal variable from int to bool. gcc/ChangeLog: * tree.h (tree_int_cst_equal): Change return type from int to bool. (operand_equal_for_phi_arg_p): Ditto. (tree_map_base_marked_p): Ditto. * tree.cc (contains_placeholder_p): Update function body for bool return ty

[committed] fold-const+optabs: Change return type of predicate functions from int to bool

2023-06-30 Thread Uros Bizjak via Gcc-patches
Also change some internal variables and function argument from int to bool. gcc/ChangeLog: * fold-const.h (multiple_of_p): Change return type from int to bool. * fold-const.cc (split_tree): Change negl_p, neg_litp_p, neg_conp_p and neg_var_p variables to bool. (const_binop): Chang

Re: [x86 PATCH] Add STV support for DImode and SImode rotations by constant.

2023-06-30 Thread Uros Bizjak via Gcc-patches
On Fri, Jun 30, 2023 at 9:29 AM Roger Sayle wrote: > > > This patch implements scalar-to-vector (STV) support for DImode and SImode > rotations by constant bit counts. Scalar rotations are almost always > optimal on x86, requiring only one or two instructions, but it is also > possible to impleme

[committed] cselib+expr+bitmap: Change return type of predicate functions from int to bool

2023-06-29 Thread Uros Bizjak via Gcc-patches
gcc/ChangeLog: * cselib.h (rtx_equal_for_cselib_1): Change return type from int to bool. (references_value_p): Ditto. (rtx_equal_for_cselib_p): Ditto. * expr.h (can_store_by_pieces): Ditto. (try_casesi): Ditto. (try_tablejump): Ditto. (safe_from_p): Ditto. * sbi

[committed] final+varasm: Change return type of predicate functions from int to bool

2023-06-28 Thread Uros Bizjak via Gcc-patches
Also change some internal variables to bool and change return type of compute_alignments to void. gcc/ChangeLog: * output.h (leaf_function_p): Change return type from int to bool. (final_forward_branch_p): Ditto. (only_leaf_regs_used): Ditto. (maybe_assemble_visibility): Ditto.

Re: [PATCH] i386: Relax inline requirement for functions with different target attrs

2023-06-28 Thread Uros Bizjak via Gcc-patches
On the x86, the inliner does not inline a function that has different > > target options than the caller, unless the callee has a subset of the > > target options of the caller. For example a function declared with > > target("sse3") can inline a function with target("

Re: [PATCH] i386: Relax inline requirement for functions with different target attrs

2023-06-27 Thread Uros Bizjak via Gcc-patches
se3 implies -msse2. --/q-- I don't think arch=skylake can be considered as a subset of arch=icelake-server. I agree that the compiler should reject functions with different PVW. This is also in accordance with the documentation. Uros. > > Uros Bizjak via Gcc-patches 于2023年6月27日周二 17:1

Re: [x86 PATCH] Add cbranchti4 pattern to i386.md (for -m32 compare_by_pieces).

2023-06-27 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 7:22 PM Roger Sayle wrote: > > > This patch fixes some very odd (unanticipated) code generation by > compare_by_pieces with -m32 -mavx, since the recent addition of the > cbranchoi4 pattern. The issue is that cbranchoi4 is available with > TARGET_AVX, but cbranchti4 is cur

Re: [x86 PATCH] Fix FAIL of gcc.target/i386/pr78794.c on ia32.

2023-06-27 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 8:40 PM Roger Sayle wrote: > > > This patch fixes the FAIL of gcc.target/i386/pr78794.c on ia32, which > is caused by minor STV rtx_cost differences with -march=silvermont. > It turns out that generic tuning results in pandn, but the lack of > accurate parameterization for

Re: [PATCH] i386: Relax inline requirement for functions with different target attrs

2023-06-27 Thread Uros Bizjak via Gcc-patches
On Mon, Jun 26, 2023 at 4:36 AM Hongyu Wang wrote: > > Hi, > > For function with different target attributes, current logic rejects to > inline the callee when any arch or tune is mismatched. Relax the > condition to honor just prefer_vector_width_type and other flags that > may cause safety issue

Re: [PATCH 2/2] Make option mvzeroupper independent of optimization level.

2023-06-26 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 8:09 AM Hongtao Liu wrote: > > On Tue, Jun 27, 2023 at 2:05 PM Uros Bizjak wrote: > > > > On Tue, Jun 27, 2023 at 7:55 AM liuhongt wrote: > > > > > > pass_insert_vzeroupper is under condition > > > > > > TARGET_AVX && TARGET_VZEROUPPER > > > && flag_expensive_optimization

Re: [PATCH 1/2] Don't issue vzeroupper for vzeroupper call_insn.

2023-06-26 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 8:08 AM Hongtao Liu wrote: > > On Tue, Jun 27, 2023 at 2:05 PM Uros Bizjak wrote: > > > > On Tue, Jun 27, 2023 at 7:55 AM liuhongt wrote: > > > > > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > > > Ok for trunk? > > > > > > gcc/ChangeLog: > > > > > >

Re: [PATCH 2/2] Make option mvzeroupper independent of optimization level.

2023-06-26 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 7:55 AM liuhongt wrote: > > pass_insert_vzeroupper is under condition > > TARGET_AVX && TARGET_VZEROUPPER > && flag_expensive_optimizations && !optimize_size > > But the document of mvzeroupper doesn't mention the insertion > required -O2 and above, it may confuse users whe

Re: [PATCH 1/2] Don't issue vzeroupper for vzeroupper call_insn.

2023-06-26 Thread Uros Bizjak via Gcc-patches
On Tue, Jun 27, 2023 at 7:55 AM liuhongt wrote: > > Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}. > Ok for trunk? > > gcc/ChangeLog: > > PR target/82735 > * config/i386/i386.cc (ix86_avx_u128_mode_needed): Don't emit > vzeroupper for vzeroupper call_insn. > > gc

Re: [PATCH] i386: Sync tune_string with arch_string for target attribute arch=*

2023-06-25 Thread Uros Bizjak via Gcc-patches
On Mon, Jun 26, 2023 at 4:31 AM Hongyu Wang wrote: > > Hi, > > For function with target attribute arch=*, current logic will set its > tune to -mtune from command line so all target_clones will get same > tuning flags which would affect the performance for each clone. Override > tune with arch if

Re: [x86_PATCH] New *ashl_doubleword_highpart define_insn_and_split.

2023-06-25 Thread Uros Bizjak via Gcc-patches
On Sat, Jun 24, 2023 at 8:04 PM Roger Sayle wrote: > > > This patch contains a pair of (related) optimizations in i386.md that > allow us to generate better code for the example below (this is a step > towards fixing a bugzilla PR, but I've forgotten the number). > > __int128 foo64(__int128 x, lon

[committed] function: Change return type of predicate function from int to bool

2023-06-21 Thread Uros Bizjak via Gcc-patches
Also change some internal variables to bool and some functions to void. gcc/ChangeLog: * function.h (emit_initial_value_sets): Change return type from int to void. (aggregate_value_p): Change return type from int to bool. (prologue_contains): Ditto. (epilogue_contains): Ditto.
