[PATCH 37/62] AVX512FP16: Add vcvtsh2ss/vcvtsh2sd/vcvtss2sh/vcvtsd2sh.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_cvtsh_ss): New intrinsic. (_mm_mask_cvtsh_ss): Likewise. (_mm_maskz_cvtsh_ss): Likewise. (_mm_cvtsh_sd): Likewise. (_mm_mask_cvtsh_sd): Likewise. (_mm_maskz_cvtsh_sd): Likewise. (_m

[PATCH 40/62] AVX512FP16: Add vfmaddsub[132, 213, 231]ph/vfmsubadd[132, 213, 231]ph.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_fmaddsub_ph): New intrinsic. (_mm512_mask_fmaddsub_ph): Likewise. (_mm512_mask3_fmaddsub_ph): Likewise. (_mm512_maskz_fmaddsub_ph): Likewise. (_mm512_fmaddsub_round_ph): Likewise. (_mm51

[PATCH 39/62] AVX512FP16: Add intrinsics for casting between vector float16 and vector float32/float64/integer.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_undefined_ph): New intrinsic. (_mm256_undefined_ph): Likewise. (_mm512_undefined_ph): Likewise. (_mm_cvtsh_h): Likewise. (_mm256_cvtsh_h): Likewise. (_mm512_cvtsh_h): Likewise. (_mm

[PATCH 41/62] AVX512FP16: Add testcase for vfmaddsub[132, 213, 231]ph/vfmsubadd[132, 213, 231]ph.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddsubXXXph-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddsubXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubaddXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubaddXXXph-1b.c: Ditto. *

[PATCH 42/62] AVX512FP16: Add FP16 fma instructions.

2021-07-01 Thread liuhongt via Gcc-patches
Add vfmadd[132,213,231]ph/vfnmadd[132,213,231]ph/vfmsub[132,213,231]ph/ vfnmsub[132,213,231]ph. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_mask_fmadd_ph): New intrinsic. (_mm512_mask3_fmadd_ph): Likewise. (_mm512_maskz_fmadd_ph): Likewise. (_mm

[PATCH 43/62] AVX512FP16: Add testcase for fma instructions

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddXXXph-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddXXXph-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXph-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXph-1b.c: Ditto. * gcc.target/i

[PATCH 44/62] AVX512FP16: Add scalar/vector bitwise operations, including

2021-07-01 Thread liuhongt via Gcc-patches
From: "H.J. Lu" 1. FP16 vector xor/ior/and/andnot/abs/neg 2. FP16 scalar abs/neg/copysign/xorsign gcc/ChangeLog: * config/i386/i386-expand.c (ix86_expand_fp_absneg_operator): Handle HFmode. (ix86_expand_copysign): Ditto. (ix86_expand_xorsign): Ditto. * co

[PATCH 45/62] AVX512FP16: Add testcase for fp16 bitwise operations.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-neg-1a.c: New test. * gcc.target/i386/avx512fp16-neg-1b.c: Ditto. * gcc.target/i386/avx512fp16-scalar-bitwise-1a.c: Ditto. * gcc.target/i386/avx512fp16-scalar-bitwise-1b.c: Ditto. * gcc.target/i386/avx512

[PATCH 46/62] AVX512FP16: Enable FP16 mask load/store.

2021-07-01 Thread liuhongt via Gcc-patches
From: "H.J. Lu" gcc/ChangeLog: * config/i386/sse.md (avx512fmaskmodelower): Extend to support HF modes. (maskload): Ditto. (maskstore): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-xorsign-1.c: New test. --- gcc/config/i386/sse.md

[PATCH 47/62] AVX512FP16: Add scalar fma instructions.

2021-07-01 Thread liuhongt via Gcc-patches
Add vfmadd[132,213,231]sh/vfnmadd[132,213,231]sh/ vfmsub[132,213,231]sh/vfnmsub[132,213,231]sh. gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_fmadd_sh): New intrinsic. (_mm_mask_fmadd_sh): Likewise. (_mm_mask3_fmadd_sh): Likewise. (_mm_maskz_fmadd_sh

[PATCH 48/62] AVX512FP16: Add testcase for scalar FMA instructions.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfmaddXXXsh-1a.c: New test. * gcc.target/i386/avx512fp16-vfmaddXXXsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfmsubXXXsh-1b.c: Ditto. * gcc.target/i

[PATCH 50/62] AVX512FP16: Add testcases for vfcmaddcph/vfmaddcph/vfcmulcph/vfmulcph.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-helper.h (init_src): Adjust init value. (NET_CMASK): New net mask for complex input. * gcc.target/i386/avx512fp16-vfcmaddcph-1a.c: New test. * gcc.target/i386/avx512fp16-vfcmaddcph-1b.c: Ditto. *

[PATCH 49/62] AVX512FP16: Add vfcmaddcph/vfmaddcph/vfcmulcph/vfmulcph

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm512_fcmadd_pch): New intrinsic. (_mm512_mask_fcmadd_pch): Likewise. (_mm512_mask3_fcmadd_pch): Likewise. (_mm512_maskz_fcmadd_pch): Likewise. (_mm512_fmadd_pch): Likewise. (_mm512_mask_fmadd

[PATCH 53/62] AVX512FP16: Add expander for sqrthf2.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/i386-features.c (i386-features.c): Handle E_HFmode. * config/i386/i386.md (sqrthf2): New expander. (*sqrt2_sse): Extend to MODEFH. * config/i386/sse.md (*_vmsqrt2): Extend to VFH_128. gcc/testsuite/ChangeLog:

[PATCH 51/62] AVX512FP16: Add vfcmaddcsh/vfmaddcsh/vfcmulcsh/vfmulcsh.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_mm_mask_fcmadd_sch): New intrinsic. (_mm_mask3_fcmadd_sch): Likewise. (_mm_maskz_fcmadd_sch): Likewise. (_mm_fcmadd_sch): Likewise. (_mm_mask_fmadd_sch): Likewise. (_mm_mask3_fmadd_sch): Likew

[PATCH 52/62] AVX512FP16: Add testcases for vfcmaddcsh/vfmaddcsh/vfcmulcsh/vfmulcsh.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-vfcmaddcsh-1a.c: New test. * gcc.target/i386/avx512fp16-vfcmaddcsh-1b.c: Ditto. * gcc.target/i386/avx512fp16-vfcmulcsh-1a.c: Ditto. * gcc.target/i386/avx512fp16-vfcmulcsh-1b.c: Ditto. * gcc.target/i386/av

[PATCH 54/62] AVX512FP16: Add expander for ceil/floor/trunc/roundeven.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/i386.md (hf2): New expander. (sse4_1_round2): Extend from MODEF to MODEFH. * config/i386/sse.md (*sse4_1_round): Extend from VF_128 to VFH_128. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-builtin-round-1.c: New test.

[PATCH 55/62] AVX512FP16: Add expander for cstorehf4.

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/i386.md (cstore4): Extend from MODEF to MODEFH. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-builtin-fpcompare-1.c: New test. * gcc.target/i386/avx512fp16-builtin-fpcompare-2.c: New test. --- gcc/config/i386/i386.md

[PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16).

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/i386.md (*sqrthf2): New define_insn. * config/i386/sse.md (*avx512fp16_vmsqrthf2): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-builtin-sqrt-2.c: New test. --- gcc/config/i386/i386.md| 1

[PATCH 57/62] AVX512FP16: Add expander for fmahf4

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/sse.md (FMAMODEM): Extend to handle FP16. (VFH_SF_AVX512VL): Extend to handle HFmode. (VF_SF_AVX512VL): Deleted. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-fma-1.c: New test. * gcc.target/i386/avx512fp16vl-fma-1.c: N

[PATCH 58/62] AVX512FP16: Optimize for code like (_Float16) __builtin_ceif ((float) f16).

2021-07-01 Thread liuhongt via Gcc-patches
gcc/ChangeLog: * config/i386/i386.md (*avx512fp16_1_roundhf2): New define_insn. * config/i386/sse.md (*avx512fp16_1_roundhf): New define_insn. gcc/testsuite/ChangeLog: * gcc.target/i386/avx512fp16-builtin-round-2.c: New test. --- gcc/config/i386/i386.md

[PATCH 59/62] AVX512FP16: Support load/store/abs intrinsics.

2021-07-01 Thread liuhongt via Gcc-patches
From: dianhong xu gcc/ChangeLog: * config/i386/avx512fp16intrin.h (__m512h_u, __m256h_u, __m128h_u): New typedef. (_mm512_load_ph): New intrinsic. (_mm256_load_ph): Ditto. (_mm_load_ph): Ditto. (_mm512_loadu_ph): Ditto. (_mm256_loadu_ph): D

[PATCH 61/62] AVX512FP16: Add complex conjugation intrinsic instructions.

2021-07-01 Thread liuhongt via Gcc-patches
From: dianhong xu gcc/ChangeLog: * config/i386/avx512fp16intrin.h: Add new intrinsics. (_mm512_conj_pch): New intrinsic. (_mm512_mask_conj_pch): Ditto. (_mm512_maskz_conj_pch): Ditto. * config/i386/avx512fp16vlintrin.h: Add new intrinsics. (_mm256_

[PATCH 62/62] AVX512FP16: Add permutation and mask blend intrinsics.

2021-07-01 Thread liuhongt via Gcc-patches
From: dianhong xu gcc/ChangeLog: * config/i386/avx512fp16intrin.h: (_mm512_mask_blend_ph): New intrinsic. (_mm512_permutex2var_ph): Ditto. (_mm512_permutexvar_ph): Ditto. * config/i386/avx512fp16vlintrin.h: (_mm256_mask_blend_ph): New intrinsic.

[PATCH 60/62] AVX512FP16: Add reduce operators(add/mul/min/max).

2021-07-01 Thread liuhongt via Gcc-patches
From: dianhong xu gcc/ChangeLog: * config/i386/avx512fp16intrin.h (_MM512_REDUCE_OP): New macro. (_mm512_reduce_add_ph): New intrinsic. (_mm512_reduce_mul_ph): Ditto. (_mm512_reduce_min_ph): Ditto. (_mm512_reduce_max_ph): Ditto. * config/i386/avx512

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-07-01 Thread Richard Biener via Gcc-patches
On Wed, Jun 30, 2021 at 9:15 PM Qing Zhao via Gcc-patches wrote: > > > > > On Jun 30, 2021, at 1:59 PM, Richard Biener wrote: > > > > On June 30, 2021 8:07:43 PM GMT+02:00, Qing Zhao > > wrote: > >> > >> > >>> On Jun 30, 2021, at 12:36 PM, Richard Biener > >> wrote: > >>> > >>> On June 30, 202

RE: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on inverted operands

2021-07-01 Thread Tamar Christina via Gcc-patches
> -Original Message- > From: Richard Sandiford > Sent: Wednesday, June 30, 2021 6:55 PM > To: Tamar Christina > Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw > ; Marcus Shawcroft > ; Kyrylo Tkachov > Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on > inverted

[committed] openmp - Fix up && and || reductions [PR94366]

2021-07-01 Thread Jakub Jelinek via Gcc-patches
Hi! As the testcase shows, the special treatment of && and || reduction combiners where we expand them as omp_out = (omp_out != 0) && (omp_in != 0) (or with ||) is not needed just for &&/|| on floating point or complex types, but for all &&/|| reductions - when expanded as omp_out = omp_out && omp
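A minimal sketch of the kind of reduction this fix concerns, assuming an integer `&&` reduction. The function name `all_nonzero` is hypothetical and is not taken from the testcase. Per the message, the combiner is expanded as omp_out = (omp_out != 0) && (omp_in != 0), so integer partial results such as 2 and 4 combine to 1.

```c
#include <stddef.h>

/* Hypothetical helper (not from the patch): returns 1 if every element
   of a[0..n-1] is nonzero, 0 otherwise.  When built without -fopenmp
   the pragma is skipped and the loop runs serially, giving the same
   result.  */
int all_nonzero(const int *a, size_t n)
{
    int ok = 1;  /* 1 is the identity value for an && reduction */
#ifdef _OPENMP
#pragma omp parallel for reduction(&&:ok)
#endif
    for (size_t i = 0; i < n; ++i)
        ok = ok && a[i];
    return ok;
}
```

Calling `all_nonzero` on an array such as {2, 4, 8} should yield 1, and on an array containing a 0 it should yield 0, whether or not OpenMP is enabled.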

Re: [PATCH] Analyze niter for until-wrap condition [PR101145]

2021-07-01 Thread Bin.Cheng via Gcc-patches
On Thu, Jul 1, 2021 at 10:06 AM Jiufu Guo via Gcc-patches wrote: > > For code like: > unsigned foo(unsigned val, unsigned start) > { > unsigned cnt = 0; > for (unsigned i = start; i > val; ++i) > cnt++; > return cnt; > } > > The number of iterations should be about UINT_MAX - start. > >
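The iteration count quoted above can be checked with a small sketch. `foo` is the loop from the message; `foo_closed_form` is a hypothetical helper added here for illustration and is not the patch's implementation. It assumes unsigned wrap-around: the loop exits only when `++i` wraps to 0.

```c
#include <limits.h>

/* The loop quoted in the message: it keeps running while i > val and
   stops only once ++i wraps around to 0 (0 > val is never true).  */
unsigned foo(unsigned val, unsigned start)
{
    unsigned cnt = 0;
    for (unsigned i = start; i > val; ++i)
        cnt++;
    return cnt;
}

/* Hypothetical closed form (an assumption for illustration): if
   start > val, every value in [start, UINT_MAX] is also > val, so the
   loop body runs UINT_MAX - start + 1 times; otherwise it never runs.  */
unsigned foo_closed_form(unsigned val, unsigned start)
{
    return start > val ? UINT_MAX - start + 1u : 0u;
}
```

Under these assumptions the exact count is UINT_MAX - start + 1 when start > val, which matches the "about UINT_MAX - start" estimate in the message.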

[PATCH] dwarf2out: Handle COMPOUND_LITERAL_EXPR in loc_list_from_tree_1 [PR101266]

2021-07-01 Thread Jakub Jelinek via Gcc-patches
Hi! In this case dwarf2out_decl is called from the FEs with GENERIC but not yet gimplified expressions in it. As loc_list_from_tree_1 has an exhaustive list of tree codes it wants to handle and for checking asserts no other codes makes it in, we should handle even GENERIC trees that shouldn't be

Re: [PATCH, Fortran] set version field in CFI_cdesc_t to CFI_VERSION

2021-07-01 Thread Tobias Burnus
On 01.07.21 08:00, Sandra Loosemore wrote: This patch fixes the failures in interoperability/fc-descriptor-8.f90 in my just-posted TS 29113 testsuite: https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574115.html The problem here is that the routines that copy between GFC and CFI descriptors t

Re: [PATCH] dwarf2out: Handle COMPOUND_LITERAL_EXPR in loc_list_from_tree_1 [PR101266]

2021-07-01 Thread Richard Biener
On Thu, 1 Jul 2021, Jakub Jelinek wrote: > Hi! > > In this case dwarf2out_decl is called from the FEs with GENERIC but not > yet gimplified expressions in it. > > As loc_list_from_tree_1 has an exhaustive list of tree codes it wants to > handle and for checking asserts no other codes makes it in

[PATCH] Fix typo in standard pattern name of trunc2.

2021-07-01 Thread liuhongt via Gcc-patches
Bootstrapped and regtested on x86_64-linux-gnu{-m32,}. Pushed to trunk as obvious fix. gcc/ChangeLog * config/i386/sse.md (trunc2): Refined to .. (trunc2): this. --- gcc/config/i386/sse.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/gcc/config/i386/

[PATCH] tree-optimization/101178 - handle VEC_PERM in SLP permute propagation

2021-07-01 Thread Richard Biener
This adds handling of VEC_PERM nodes to SLP permute propagation. Previously VEC_PERM acted as forced materialization of incoming permutes since it is a good place to do that (with the constraint of those only appearing for two-operator nodes). The following patch, in addition to supporting (but no

[PATCH] [i386] Clear odata for aes(enc|dec)(wide)?kl intrinsics

2021-07-01 Thread Hongyu Wang via Gcc-patches
For Keylocker aesenc/aesdec intrinsics, the current implementation moves idata to odata unconditionally, which causes a safety issue when the instruction meets a runtime error. So we add a branch to clear odata when ZF is set after instruction execution. gcc/ChangeLog: * config/i386/i386-expand.

Re: [PATCH 3/4] remove %K from error() calls in the aarch64/arm back ends (PR 98512)

2021-07-01 Thread Christophe LYON via Gcc-patches
On 30/06/2021 21:56, Martin Sebor via Gcc-patches wrote: On 6/11/21 8:46 AM, Martin Sebor wrote: On 6/11/21 3:58 AM, Richard Sandiford wrote: Martin Sebor via Gcc-patches writes: diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7b37e1b602c..7cdc824730c 100644 --- a/gcc/config/

Re: [PATCH] [i386] Clear odata for aes(enc|dec)(wide)?kl intrinsics

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 3:51 PM Hongyu Wang wrote: > > For Keylocker aesenc/aesdec intrinsics, current implementation > moves idata to odata unconditionally, which causes safety issue when > the instruction meets runtime error. So we add a branch to clear > odata when ZF is set after instruction ex

Re: [PATCH] mips: add MSA vec_cmp and vec_cmpu expand pattern [PR101132]

2021-07-01 Thread Xi Ruoyao via Gcc-patches
Ping. On Mon, 2021-06-21 at 21:42 +0800, Xi Ruoyao wrote: > Middle-end started to emit vec_cmp and vec_cmpu since GCC 11, causing > ICE on MIPS with MSA enabled.  Add the pattern to prevent it. > > Bootstrapped and regression tested on mips64el-linux-gnu. > Ok for trunk? > > gcc/ > > *

Re: [PATCH] [i386] Clear odata for aes(enc|dec)(wide)?kl intrinsics

2021-07-01 Thread Hongyu Wang via Gcc-patches
> Change some keylocker insn to Keylocker aesenc/aesdec in comments. > others LGTM. Changed. Forgot to mention bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Attached file is the patch I'm going to check in. Hongtao Liu via Gcc-patches wrote on Thu, Jul 1, 2021 at 4:07 PM: > > On Thu, Jul 1,

Re: [PATCH v6 2/2] x86: Add vec_duplicate expander

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > Add vec_duplicate expander for SSE2 if we can move from GPR to SSE > register directly. > > * config/i386/i386-expand.c (ix86_expand_vector_init_duplicate): > Make it global. > * config/i386/i386-protos.h (ix86_expand_vecto

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > operands to vector broadcast from an integer with AVX. > 2. Add ix86_gen_scratch_sse_rtx to return a scratch SSE register which > won't increase stack alignment requirement

Re: [PATCH][RFC]AArch64 SVE: Fix multiple comparison masks on inverted operands

2021-07-01 Thread Richard Sandiford via Gcc-patches
Tamar Christina writes: >> -Original Message- >> From: Richard Sandiford >> Sent: Wednesday, June 30, 2021 6:55 PM >> To: Tamar Christina >> Cc: gcc-patches@gcc.gnu.org; nd ; Richard Earnshaw >> ; Marcus Shawcroft >> ; Kyrylo Tkachov >> Subject: Re: [PATCH][RFC]AArch64 SVE: Fix multiple

[PATCH] i386: Add integer nabs instructions [PR101044]

2021-07-01 Thread Uros Bizjak via Gcc-patches
The patch adds integer nabs "(NEG (ABS (...)))" instructions, adds STV conversion and adjusts STV cost calculations accordingly. When CMOV instruction is used to implement abs, the sign is determined from the preceding operand negation, and CMOVS is used to select between negated and non-negated v
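For reference, a minimal sketch of the source-level operation that "(NEG (ABS (...)))" describes. The helper name `nabs` is hypothetical, and the select form below is an assumption chosen for portability; it is not the code generated by the patch.

```c
/* Hypothetical helper: negated absolute value, i.e. (NEG (ABS x)).
   Written as a select so it is well defined for every int, including
   INT_MIN, where abs() itself would overflow.  */
int nabs(int x)
{
    return x < 0 ? x : -x;
}
```

The result is never positive, so unlike `abs` there is no input for which the operation overflows.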

Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-01 Thread Matthias Kretz
On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote: > On 6/22/21 4:01 PM, Matthias Kretz wrote: > > On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote: > >> For alias templates, you probably want the attribute only on the > >> templated class, not on the instantiations. > > > > Oh

Re: [PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16).

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 9:20 AM liuhongt via Gcc-patches wrote: How does this look on GIMPLE and why's it not better handled there? Richard. > gcc/ChangeLog: > > * config/i386/i386.md (*sqrthf2): New define_insn. > * config/i386/sse.md > (*avx512fp16_vmsqrthf2): >

Re: [PATCH 58/62] AVX512FP16: Optimize for code like (_Float16) __builtin_ceif ((float) f16).

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 9:22 AM liuhongt via Gcc-patches wrote: > > gcc/ChangeLog: Same question. There's maybe no direct optab for ceil but the foldings could emit .CEIL () internal fns based on availability. > * config/i386/i386.md (*avx512fp16_1_roundhf2): New define_insn. > *

Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-07-01 Thread Trevor Saunders
On Wed, Jun 30, 2021 at 11:13:23AM -0400, David Malcolm wrote: > On Wed, 2021-06-30 at 01:35 -0400, Trevor Saunders wrote: > > This makes it possible to assert if input_location is used during the > > lifetime > > of a scope.  This will allow us to find places that currently use it > > within a > >

Re: [PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16).

2021-07-01 Thread Hongtao Liu via Gcc-patches
On Thu, Jul 1, 2021 at 5:51 PM Richard Biener via Gcc-patches wrote: > > On Thu, Jul 1, 2021 at 9:20 AM liuhongt via Gcc-patches > wrote: > > How does this look on GIMPLE and why's it not better handled there? Do you mean in match.pd? I'll try that. The C++ FE doesn't support _Float16, and the plac

Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-07-01 Thread Trevor Saunders
On Wed, Jun 30, 2021 at 09:09:33PM +0200, Richard Biener wrote: > On June 30, 2021 2:33:30 PM GMT+02:00, Trevor Saunders > wrote: > >On Wed, Jun 30, 2021 at 11:00:37AM +0200, Richard Biener wrote: > >> On Wed, Jun 30, 2021 at 7:37 AM Trevor Saunders > > wrote: > >> > > >> > This makes it possible

[PATCH] tree-optimization/101278 - handle self-use in DSE analysis

2021-07-01 Thread Richard Biener
DSE store classification short-cuts the to-be classified stmt itself from chaining but fails to first check whether the store uses itself which can be the case when it is a call with the LHS also passed by value as argument. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed. 2021-07-01
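A minimal sketch of the statement shape described above, assuming a small aggregate passed and returned by value. The names `vec4`, `add_one` and `bump` are hypothetical and this is not the PR's testcase.

```c
/* Hypothetical aggregate and function, for illustration only.  */
struct vec4 { int v[4]; };

static struct vec4 add_one(struct vec4 in)
{
    for (int i = 0; i < 4; ++i)
        in.v[i] += 1;
    return in;
}

/* The store to *p is performed by a call that also reads *p (passed
   by value), so the statement uses the very memory it stores to.  */
void bump(struct vec4 *p)
{
    *p = add_one(*p);
}
```

Because the call reads `*p` before the result is stored back, an earlier store to `*p` is still needed and must not be classified as dead on the basis of this statement alone.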

[PATCH] tree-optimization/100778 - fix placement of trapping vectorized ops

2021-07-01 Thread Richard Biener
This avoids placing possibly trapping vectorized operations where the corresponding scalar operation was possibly not executed. Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk so far. 2021-01-07 Richard Biener PR tree-optimization/100778 * tree-vect-slp.c

Re: [ARM] PR98435: Missed optimization in expanding vector constructor

2021-07-01 Thread Prathamesh Kulkarni via Gcc-patches
On Wed, 30 Jun 2021 at 20:51, Christophe LYON wrote: > > > On 29/06/2021 12:46, Prathamesh Kulkarni wrote: > > On Mon, 28 Jun 2021 at 14:48, Christophe LYON > > wrote: > >> > >> On 28/06/2021 10:40, Kyrylo Tkachov via Gcc-patches wrote: > -Original Message- > From: Prathamesh Ku

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Uros Bizjak via Gcc-patches
[Sorry for double post, gcc-patches address was wrong in original post] On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > Hi: > AVX512FP16 is disclosed, refer to [1]. > There're 100+ instructions for AVX512FP16, 67 gcc patches, for the > convenience of review, we divide the 67 patches into

Re: [PATCH] Analyze niter for until-wrap condition [PR101145]

2021-07-01 Thread guojiufu via Gcc-patches
On 2021-07-01 15:22, Bin.Cheng wrote: On Thu, Jul 1, 2021 at 10:06 AM Jiufu Guo via Gcc-patches wrote: For code like: unsigned foo(unsigned val, unsigned start) { unsigned cnt = 0; for (unsigned i = start; i > val; ++i) cnt++; return cnt; } The number of iterations should be about U

[Patch]MAINTAINERS - Add myself for write after approval

2021-07-01 Thread Ankur Saini via Gcc-patches
Hi, I added myself to the MAINTAINERS file under Write After Approval Thanks - Ankur === 2021-07-01 Ankur Saini * MAINTAINERS: Add myself for write after approval. --- diff --git a/MAINTAINERS b/MAINTAINERS index 468d83f708e..4d6ac9c5765 100644 --- a/MAINTAINERS +++ b

[PATCH] tree-optimization/101280 - revise interchange fix for PR101173

2021-07-01 Thread Richard Biener
The following revises the original fix for PR101173 to correctly check for a reversed dependence rather than disallowing a zero distance. It also adds a check from TSVC which asks for this kind of interchange (but with a valid dependence). Bootstrapped and tested on x86_64-unknown-linux-gnu, push

Re: [PATCH PR100740]Fix overflow check in simplifying exit cond comparing two IVs.

2021-07-01 Thread Richard Biener via Gcc-patches
On Mon, Jun 7, 2021 at 4:35 PM Richard Biener wrote: > > On Sun, Jun 6, 2021 at 12:01 PM Bin.Cheng wrote: > > > > On Wed, Jun 2, 2021 at 3:28 PM Richard Biener via Gcc-patches > > wrote: > > > > > > On Tue, Jun 1, 2021 at 4:00 PM bin.cheng via Gcc-patches > > > wrote: > > > > > > > > Hi, > > >

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-01 Thread Martin Liška
On 6/30/21 5:43 PM, Eli Zaretskii wrote: Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org From: Martin Liška Date: Wed, 30 Jun 2021 16:04:32 +0200 Thanks, but does that mean @var will no longer stand out in the produced Info format? That'd be sub-optimal, I think, becaus

Re: [PATCH] Analyze niter for until-wrap condition [PR101145]

2021-07-01 Thread Richard Biener
On Thu, 1 Jul 2021, Jiufu Guo wrote: > For code like: > unsigned foo(unsigned val, unsigned start) > { > unsigned cnt = 0; > for (unsigned i = start; i > val; ++i) > cnt++; > return cnt; > } > > The number of iterations should be about UINT_MAX - start. For unsigned foo(unsigned val,

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread H.J. Lu via Gcc-patches
On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote: > > [Sorry for double post, gcc-patches address was wrong in original post] > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > Hi: > > AVX512FP16 is disclosed, refer to [1]. > > There're 100+ instructions for AVX512FP16, 67 gcc patches

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread H.J. Lu via Gcc-patches
Hi Uros, On Thu, Jul 1, 2021 at 1:32 AM Hongtao Liu wrote: > > On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > > operands to vector broadcast from an integer with AVX. > > 2. Add ix86_gen_scratch_sse_rtx to return a

Re: [PATCH 56/62] AVX512FP16: Optimize (_Float16) sqrtf ((float) f16) to sqrtf16 (f16).

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 12:18 PM Hongtao Liu wrote: > > On Thu, Jul 1, 2021 at 5:51 PM Richard Biener via Gcc-patches > wrote: > > > > On Thu, Jul 1, 2021 at 9:20 AM liuhongt via Gcc-patches > > wrote: > > > > How does this look on GIMPLE and why's it not better handled there? > Do you mean in m

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-01 Thread Martin Liška
On 6/30/21 3:09 PM, Eli Zaretskii wrote: Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org From: Martin Liška Date: Wed, 30 Jun 2021 12:11:03 +0200 (Admittedly, Emacs by default hides some of the text of a cross-reference, but not hiding them in this case produces an even

Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 12:24 PM Trevor Saunders wrote: > > On Wed, Jun 30, 2021 at 09:09:33PM +0200, Richard Biener wrote: > > On June 30, 2021 2:33:30 PM GMT+02:00, Trevor Saunders > > wrote: > > >On Wed, Jun 30, 2021 at 11:00:37AM +0200, Richard Biener wrote: > > >> On Wed, Jun 30, 2021 at 7:3

Re: [PATCH 2/4] allow poisoning input_location in ranges it should not be used

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 12:16 PM Trevor Saunders wrote: > > On Wed, Jun 30, 2021 at 11:13:23AM -0400, David Malcolm wrote: > > On Wed, 2021-06-30 at 01:35 -0400, Trevor Saunders wrote: > > > This makes it possible to assert if input_location is used during the > > > lifetime > > > of a scope. This

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 2:41 PM H.J. Lu via Gcc-patches wrote: > > On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote: > > > > [Sorry for double post, gcc-patches address was wrong in original post] > > > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > > > Hi: > > > AVX512FP16 is disclos

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 1, 2021 at 2:40 PM H.J. Lu wrote: > > On Thu, Jul 1, 2021 at 4:10 AM Uros Bizjak wrote: > > > > [Sorry for double post, gcc-patches address was wrong in original post] > > > > On Thu, Jul 1, 2021 at 7:48 AM liuhongt wrote: > > > > > > Hi: > > > AVX512FP16 is disclosed, refer to [1]

Re: [PATCH 0/2] Initial support for AVX512FP16

2021-07-01 Thread Jakub Jelinek via Gcc-patches
On Thu, Jul 01, 2021 at 02:58:01PM +0200, Richard Biener wrote: > > The main issue is complex _Float16 functions in libgcc. If _Float16 doesn't > > require -mavx512fp16, we need to compile complex _Float16 functions in > > libgcc without -mavx512fp16. Complex _Float16 performance is very > > impo

[PATCH] Change the type of predicates to bool.

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Wed, Jun 30, 2021 at 12:50 PM Richard Biener wrote: > > On Wed, Jun 30, 2021 at 10:47 AM Uros Bizjak via Gcc-patches > wrote: > > > > This RFC patch changes the type of predicates to bool. However, some > > of the targets (e.g. x86) use indirect functions to call the > > predicates, so without

Re: [PATCH] Change the type of predicates to bool.

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 3:07 PM Uros Bizjak wrote: > > On Wed, Jun 30, 2021 at 12:50 PM Richard Biener > wrote: > > > > On Wed, Jun 30, 2021 at 10:47 AM Uros Bizjak via Gcc-patches > > wrote: > > > > > > This RFC patch changes the type of predicates to bool. However, some > > > of the targets (e.

Re: [PATCH] Optimize macro: make it more predictable

2021-07-01 Thread Martin Liška
On 10/23/20 1:47 PM, Martin Liška wrote: Hey. Hello. I deferred the patch for GCC 12. Since then I have worked with the options code and feel more familiar with the option handling. So ... This is a follow-up of the discussion that happened in thread about no_stack_protector attribute: https://gcc

[RS6000] Adjust testcases for power10 instructions

2021-07-01 Thread Alan Modra via Gcc-patches
Bootstrapped and regression tested powerpc64le-linux power9 and power10. OK for mainline? * lib/target-supports.exp (check_effective_target_has_arch_pwr10): New. * gcc.dg/pr56727-2.c, gcc.dg/vect/costmodel/ppc/costmodel-bb-slp-9a-pr63175.c, gcc.target/powerpc/fold-

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-01 Thread Eli Zaretskii via Gcc-patches
> Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org > From: Martin Liška > Date: Thu, 1 Jul 2021 14:44:10 +0200 > > > It helps some, but not all of the issues disappear. For example, > > stuff like this is still hard to read: > > > >To select this standard in GCC, use o

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-07-01 Thread Qing Zhao via Gcc-patches
> On Jul 1, 2021, at 1:48 AM, Richard Biener wrote: > > On Wed, Jun 30, 2021 at 9:15 PM Qing Zhao via Gcc-patches > wrote: >> >> >> >>> On Jun 30, 2021, at 1:59 PM, Richard Biener wrote: >>> >>> On June 30, 2021 8:07:43 PM GMT+02:00, Qing Zhao >>> wrote: > On Jun 30, 202

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-07-01 Thread Richard Biener via Gcc-patches
On Thu, Jul 1, 2021 at 3:45 PM Qing Zhao wrote: > > > > > On Jul 1, 2021, at 1:48 AM, Richard Biener > > wrote: > > > > On Wed, Jun 30, 2021 at 9:15 PM Qing Zhao via Gcc-patches > > wrote: > >> > >> > >> > >>> On Jun 30, 2021, at 1:59 PM, Richard Biener wrote: > >>> > >>> On June 30, 2021 8:07

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-07-01 Thread Richard Sandiford via Gcc-patches
Qing Zhao writes: >> On Jul 1, 2021, at 1:48 AM, Richard Biener >> wrote: >> >> On Wed, Jun 30, 2021 at 9:15 PM Qing Zhao via Gcc-patches >> wrote: >>> >>> >>> On Jun 30, 2021, at 1:59 PM, Richard Biener wrote: On June 30, 2021 8:07:43 PM GMT+02:00, Qing Zhao wrote:

Re: [PATCH v6 1/2] x86: Convert CONST_WIDE_INT/CONST_VECTOR to broadcast

2021-07-01 Thread Uros Bizjak via Gcc-patches
On Thu, Jul 1, 2021 at 2:42 PM H.J. Lu wrote: > > Hi Uros, > > On Thu, Jul 1, 2021 at 1:32 AM Hongtao Liu wrote: > > > > On Tue, Jun 29, 2021 at 6:16 AM H.J. Lu wrote: > > > > > > 1. Update move expanders to convert the CONST_WIDE_INT and CONST_VECTOR > > > operands to vector broadcast from an i

Re: [PATCH] Analyze niter for until-wrap condition [PR101145]

2021-07-01 Thread guojiufu via Gcc-patches
On 2021-07-01 20:35, Richard Biener wrote: On Thu, 1 Jul 2021, Jiufu Guo wrote: For code like: unsigned foo(unsigned val, unsigned start) { unsigned cnt = 0; for (unsigned i = start; i > val; ++i) cnt++; return cnt; } The number of iterations should be about UINT_MAX - start. For

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-01 Thread Martin Liška
On 7/1/21 3:33 PM, Eli Zaretskii wrote: Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org From: Martin Liška Date: Thu, 1 Jul 2021 14:44:10 +0200 It helps some, but not all of the issues disappear. For example, stuff like this is still hard to read: To select this st

Re: HELP!! How to inhibit optimizations applied to .DEFERRED_INIT argument?

2021-07-01 Thread Michael Matz
Hello, I haven't followed this thread too closely, in particular I haven't seen why the current form of the .DEFERRED_INIT call was chosen or suggested, but it triggered my "well, that's obviously wrong" gut feeling; so sorry for stating something which might be obvious to thread participants h

Re: [PATCH 3/4] remove %K from error() calls in the aarch64/arm back ends (PR 98512)

2021-07-01 Thread Martin Sebor via Gcc-patches
On 7/1/21 2:01 AM, Christophe LYON wrote: On 30/06/2021 21:56, Martin Sebor via Gcc-patches wrote: On 6/11/21 8:46 AM, Martin Sebor wrote: On 6/11/21 3:58 AM, Richard Sandiford wrote: Martin Sebor via Gcc-patches writes: diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 7b37e1b

Re: [PATCH] tree-optimization/101280 - revise interchange fix for PR101173

2021-07-01 Thread Michael Matz
Hello, On Thu, 1 Jul 2021, Richard Biener wrote: > diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc > index 43045c5455e..43ef112a2d0 100644 > --- a/gcc/gimple-loop-interchange.cc > +++ b/gcc/gimple-loop-interchange.cc > @@ -1043,8 +1043,11 @@ tree_loop_interchange::val

[PATCH] Return true/false instead of 1/0 from generic predicates.

2021-07-01 Thread Uros Bizjak via Gcc-patches
No functional changes. 2021-07-01 Uroš Bizjak gcc/ * recog.c (general_operand): Return true/false instead of 1/0. (register_operand): Ditto. (immediate_operand): Ditto. (const_int_operand): Ditto. (const_scalar_int_operand): Ditto. (const_double_operand): Ditto. (pu
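As an illustration of the kind of change described above, here is a minimal sketch of a predicate written to return `true`/`false` rather than `1`/`0`. The function `const_in_range_p` and its parameters are hypothetical and are not taken from recog.c; the sketch uses C with `<stdbool.h>` only to show the style.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical range predicate (not GCC code).  The return type is
   bool and every return statement uses true/false, where the older
   style would have used an int return type with 1/0. */
bool const_in_range_p(long value, long lo, long hi)
{
    if (value < lo || value > hi)
        return false;   /* older style: return 0; */
    return true;        /* older style: return 1; */
}

/* Small self-check showing that callers are unaffected by the style. */
void check_const_in_range_p(void)
{
    assert(const_in_range_p(5, 0, 10));
    assert(!const_in_range_p(11, 0, 10));
}
```

Callers that test the result in a condition behave the same either way, which is consistent with the "No functional changes" note in the message.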

[PATCH] i386: Return true/false instead of 1/0 from predicates.

2021-07-01 Thread Uros Bizjak via Gcc-patches
No functional changes. 2021-07-01 Uroš Bizjak gcc/ * config/i386/predicates.md (ix86_endbr_immediate_operand): Return true/false instead of 1/0. (movq_parallel): Ditto. Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}. Pushed to master. Uros. diff --git a/gcc/confi

Re: [PATCH] Return true/false instead of 1/0 from generic predicates.

2021-07-01 Thread Jeff Law via Gcc-patches
On 7/1/2021 8:55 AM, Uros Bizjak via Gcc-patches wrote: No functional changes. 2021-07-01 Uroš Bizjak gcc/ * recog.c (general_operand): Return true/false instead of 1/0. (register_operand): Ditto. (immediate_operand): Ditto. (const_int_operand): Ditto. (const_scal

Re: [PATCH] Port GCC documentation to Sphinx

2021-07-01 Thread Michael Matz
Hello, On Thu, 1 Jul 2021, Martin Liška wrote: > On 7/1/21 3:33 PM, Eli Zaretskii wrote: > > > Cc: jos...@codesourcery.com, g...@gcc.gnu.org, gcc-patches@gcc.gnu.org > > > From: Martin Liška > > > Date: Thu, 1 Jul 2021 14:44:10 +0200 > > > > > > > It helps some, but not all of the issues disapp

Re: [PATCH] PING implement pre-c++20 contracts

2021-07-01 Thread Jason Merrill via Gcc-patches
On 6/26/21 10:23 AM, Andrew Sutton wrote: Hi Jason, I ended up taking over this work from Jeff (CC'd on his existing email address). I scraped all the contracts changes into one big patch against master. See attached. The ChangeLog.contracts files list the sum of changes for the patch, not the f

[PATCH] [DWARF] Fix hierarchy of debug information for offload kernels.

2021-07-01 Thread Hafiz Abid Qadeer
Currently, if we look at the debug information for offload kernel regions, it looks something like this: void foo (void) { #pragma acc kernels { } } DW_TAG_compile_unit DW_AT_name("") DW_TAG_subprogram // notional parent function (foo) with no code range DW_TAG_subprogram // of

Re: [PATCH] c++: unqualified member template in constraint [PR101247]

2021-07-01 Thread Jason Merrill via Gcc-patches
On 6/30/21 5:27 PM, Patrick Palka wrote: Here any_template_parm_r is failing to mark the template parameters that're implicitly used by the unqualified use of 'd' inside the constraint, because the code to do so assumes each level of a template parameter list points to the corresponding primary t

Re: [PATCH] Add gnu::diagnose_as attribute

2021-07-01 Thread Jason Merrill via Gcc-patches
On 7/1/21 5:28 AM, Matthias Kretz wrote: On Tuesday, 22 June 2021 22:12:42 CEST Jason Merrill wrote: On 6/22/21 4:01 PM, Matthias Kretz wrote: On Tuesday, 22 June 2021 21:52:16 CEST Jason Merrill wrote: For alias templates, you probably want the attribute only on the templated class, not on th

[PATCH v5 03/11] x86: Avoid stack realignment when copying data

2021-07-01 Thread H.J. Lu via Gcc-patches
To avoid stack realignment, use SCRATCH_SSE_REG to copy data from one memory location to another. gcc/ * config/i386/i386-expand.c (ix86_expand_vector_move): Call ix86_gen_scratch_sse_rtx to get a scratch SSE register to copy data from one memory location to another. gcc/

[PATCH v5 01/11] Rewrite memset with TARGET_GEN_MEMSET_SCRATCH_RTX

2021-07-01 Thread H.J. Lu via Gcc-patches
1. Rewrite builtin_memset_read_str/builtin_memset_gen_str to use vector broadcast to duplicate QI value to TI/OI/XI value for memset. 2. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard scratch register to avoid stack realignment when expanding memset. PR middle-end/90
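The idea in item 1, duplicating a single byte (QI) value across a wider value and then storing the wide value, can be sketched in portable C using a 64-bit word in place of the TI/OI/XI vector modes. This is only an analogy under that assumption and is not how GCC implements the expansion; `broadcast_byte` and `memset_by_words` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Duplicate one byte into all eight bytes of a 64-bit word.
   Multiplying by 0x0101010101010101 places a copy of c in each byte. */
uint64_t broadcast_byte(uint8_t c)
{
    return (uint64_t)c * UINT64_C(0x0101010101010101);
}

/* Fill n bytes at dst with c, storing one broadcast word at a time
   and finishing the remaining tail byte by byte. */
void memset_by_words(unsigned char *dst, uint8_t c, size_t n)
{
    uint64_t word = broadcast_byte(c);

    while (n >= sizeof word) {
        memcpy(dst, &word, sizeof word);  /* alignment-safe wide store */
        dst += sizeof word;
        n -= sizeof word;
    }
    while (n > 0) {
        *dst++ = c;
        n--;
    }
}

/* Self-check: the broadcast word repeats the byte in every position. */
void check_broadcast(void)
{
    assert(broadcast_byte(0xAB) == UINT64_C(0xABABABABABABABAB));
}
```

A vector broadcast does the same duplication in a single instruction over 16, 32, or 64 bytes, which is why wider modes allow fewer stores per memset.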

[PATCH v5 02/11] x86: Add TARGET_GEN_MEMSET_SCRATCH_RTX

2021-07-01 Thread H.J. Lu via Gcc-patches
Define TARGET_GEN_MEMSET_SCRATCH_RTX to ix86_gen_scratch_sse_rtx to return a scratch SSE register for memset. gcc/ PR middle-end/90773 * config/i386/i386.c (TARGET_GEN_MEMSET_SCRATCH_RTX): New. gcc/testsuite/ PR middle-end/90773 * gcc.target/i386/pr90773-15.c: Ne

[PATCH v5 08/11] x86: Also pass -mno-avx to cold-attribute-1.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to cold-attribute-1.c to avoid copying data with YMM or ZMM registers. * gcc.target/i386/cold-attribute-1.c: Also pass -mno-avx. --- gcc/testsuite/gcc.target/i386/cold-attribute-1.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386

[PATCH v5 00/11] Allow TImode/OImode/XImode in op_by_pieces operations

2021-07-01 Thread H.J. Lu via Gcc-patches
Changes in the v5 patches: 1. Add TARGET_GEN_MEMSET_SCRATCH_RTX to allow the backend to use a hard scratch register to avoid stack realignment when expanding memset. 2. Use vec_duplicate, instead of adding TARGET_READ_MEMSET_VALUE and TARGET_GEN_MEMSET_VALUE, to expand memset if available. Change

[PATCH v5 05/11] x86: Add AVX2 tests for PR middle-end/90773

2021-07-01 Thread H.J. Lu via Gcc-patches
PR middle-end/90773 * gcc.target/i386/pr90773-20.c: New test. * gcc.target/i386/pr90773-21.c: Likewise. * gcc.target/i386/pr90773-22.c: Likewise. * gcc.target/i386/pr90773-23.c: Likewise. * gcc.target/i386/pr90773-26.c: Likewise. --- gcc/testsuite/gc

[PATCH v5 07/11] x86: Also pass -mno-avx to pr72839.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to pr72839.c to avoid copying data with YMM or ZMM registers. * gcc.target/i386/pr72839.c: Also pass -mno-avx. --- gcc/testsuite/gcc.target/i386/pr72839.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/pr72839.c b/gcc/

[PATCH v5 04/11] x86: Update piecewise move and store

2021-07-01 Thread H.J. Lu via Gcc-patches
We can use TImode/OImode/XImode integers for piecewise move and store. 1. Define MAX_MOVE_MAX to 64, which is the constant maximum number of bytes that a single instruction can move quickly between memory and registers or between two memory locations. 2. Define MOVE_MAX to MOVE_MAX_PIECES, which i

[PATCH v5 09/11] x86: Also pass -mno-avx to sw-1.c for ia32

2021-07-01 Thread H.J. Lu via Gcc-patches
Also pass -mno-avx to sw-1.c for ia32 since copying data with YMM or ZMM registers disables shrink-wrapping when the second argument is passed on stack. * gcc.target/i386/sw-1.c: Also pass -mno-avx for ia32. --- gcc/testsuite/gcc.target/i386/sw-1.c | 1 + 1 file changed, 1 insertion(+) d

[PATCH v5 10/11] x86: Update gcc.target/i386/incoming-11.c

2021-07-01 Thread H.J. Lu via Gcc-patches
Expect no stack realignment since we no longer realign stack when copying data. * gcc.target/i386/incoming-11.c: Expect no stack realignment. --- gcc/testsuite/gcc.target/i386/incoming-11.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/gcc.target/i386/i
