from:"Hu, Lin1"

[PATCH] Support Intel USER_MSR

2023-10-10 Thread Hu, Lin1

This patch aims to support Intel USER_MSR. gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect USER_MSR. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_USER_MSR_SET): New. (OPTION_MASK_ISA2_USER_MSR_UNSET): Ditto. (ix86_ha

RE: [PATCH] Support Intel USER_MSR

2023-10-10 Thread Hu, Lin1

There are some typos in /gcc/doc/extend.texi and /gcc/doc/invoke.texi. They should be USER_MSR, not UMSR. I have modified them in my branch. -Original Message- From: Hu, Lin1 Sent: Tuesday, October 10, 2023 3:47 PM To: gcc-patches@gcc.gnu.org Cc: Liu, Hongtao ; ubiz...@gmail.com

[PATCH] Fix testcases that are raised by support -mevex512

2023-10-11 Thread Hu, Lin1

Hi, all This patch aims to fix some scan-asm failures in pr89229-{5,6,7}b.c, since we emit scalar vmov{s,d} here when trying to use x/ymm 16+ without avx512vl but with avx512f+evex512. If no one objects to this change in behavior, we intend to resolve these failures by modifying the

[PATCH] Avoid generate vblendps with ymm16+

2023-11-08 Thread Hu, Lin1

This patch aims to avoid generating vblendps with ymm16+. It has been bootstrapped and tested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk? gcc/ChangeLog: PR target/112435 * config/i386/sse.md: Adding constraints to restrict the generation of vblendps. gcc/testsuite/ChangeL

RE: [PATCH] Avoid generate vblendps with ymm16+

2023-11-12 Thread Hu, Lin1

On Saturday, November 11, 2023 4:11 AM, Jakub Jelinek wrote: > On Thu, Nov 09, 2023 at 03:27:11PM +0800, Hongtao Liu wrote: > > On Thu, Nov 9, 2023 at 3:15 PM Hu, Lin1 wrote: > > > > > > This patch aims to avoid generate vblendps with ymm16+, And have > > >

[PATCH] i386: Fix CPUID of USER_MSR.

2023-11-28 Thread Hu, Lin1

Hi, all This patch aims to fix the wrong CPUID of USER_MSR: its correct CPUID is (0x7, 0x1).EDX[15], but I set it as (0x7, 0x0).EDX[15]. The patch also modifies the testcase to give the user a better example. It has been bootstrapped and regtested on x86-64-pc-linux-gnu, OK for trunk? BR, Lin gcc/C
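As an illustration of the leaf/subleaf distinction described in this message, here is a minimal sketch of a user-space check for (0x7, 0x1).EDX[15]. It is not taken from the patch: the helper names `edx_has_user_msr` and `cpu_has_user_msr` are hypothetical, and the sketch assumes GCC's `<cpuid.h>` with `__get_cpuid_count` is available on x86 targets.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header; x86 only */
#endif

/* Bit position quoted in the message: EDX[15] of leaf 0x7, subleaf 0x1. */
#define USER_MSR_EDX_BIT 15u

/* Hypothetical pure helper: test the USER_MSR bit in a raw EDX value. */
static bool edx_has_user_msr(uint32_t edx)
{
    return ((edx >> USER_MSR_EDX_BIT) & 1u) != 0;
}

/* Hypothetical wrapper: query CPUID leaf 0x7 with subleaf 0x1 (not 0x0).
   Returns false on non-x86 targets or when leaf 0x7 is not supported. */
static bool cpu_has_user_msr(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(0x7, 0x1, &eax, &ebx, &ecx, &edx))
        return false;
    return edx_has_user_msr(edx);
#else
    return false;
#endif
}
```

Passing subleaf 0x0 instead of 0x1 to the same call would read a different EDX word, which is the kind of mix-up the message describes.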

[PATCH 01/18] Initial support for -mevex512

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_EVEX512_SET): New. (OPTION_MASK_ISA2_EVEX512_UNSET): Ditto. (ix86_handle_option): Handle EVEX512. * config/i386/i386-c.cc (ix86_target_macros_internal): Ditto.

[PATCH 00/18] Support -mevex512 for AVX512

2023-09-21 Thread Hu, Lin1

Hi all, After previous discussion, instead of supporting option -mavx10.1, we will first introduce option -m[no-]evex512, which will enable/disable 512-bit registers and 64-bit mask registers. It will not change the current option behavior since if AVX512F is enabled with no evex512 option specifie
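A rough sketch of how user code might adapt to the option follows. It assumes that compilers implementing -mevex512 predefine a macro named `__EVEX512__` when 512-bit EVEX is enabled; that macro name and the helper `max_vector_bits` are assumptions for illustration and do not come from the text above.

```c
/* Hypothetical helper: report the widest vector width (in bits) that the
   current compilation flags are assumed to permit.
   ASSUMPTION: __EVEX512__ is predefined when 512-bit EVEX is enabled. */
static int max_vector_bits(void)
{
#if defined(__AVX512F__) && defined(__EVEX512__)
    return 512;   /* zmm registers usable */
#elif defined(__AVX__)
    return 256;   /* ymm registers only, e.g. AVX512F with -mno-evex512 */
#elif defined(__SSE2__)
    return 128;   /* xmm registers only */
#else
    return 0;     /* no x86 SIMD extension detected at compile time */
#endif
}
```

Under this assumption, building with -mavx512f -mno-evex512 would take the 256-bit branch, while plain -mavx512f would take the 512-bit branch.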

[PATCH 08/18] [PATCH 2/5] Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. --- gcc/config/i386/i386-builtin.def | 94 1 file changed, 47 insertions(+), 47 deletions(-) diff --git a/gcc/config/i386/i386-builtin.def b

[PATCH 16/18] Support -mevex512 for AVX512{IFMA, VBMI, VNNI, BF16, VPOPCNTDQ, VBMI2, BITALG, VP2INTERSECT}, VAES, GFNI, VPCLMULQDQ intrins

2023-09-21 Thread Hu, Lin1

. (vpdpwssds_v16si): Ditto. (VI48_AVX512VP2VL): Ditto. (avx512vp2intersect_2intersectv16si): Ditto. (VF_AVX512BF16VL): Ditto. (VF1_AVX512_256): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/pr90096.c: Adjust error message. Co-authored-by: Hu, Lin1

[PATCH 09/18] [PATCH 3/5] Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. --- gcc/config/i386/i386-builtin.def | 226 +++ 1 file changed, 113 insertions(+), 113 deletions(-) diff --git a/gcc/config/i386/i386-builtin.def

[PATCH 14/18] Support -mevex512 for AVX512DQ intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_sse2_mulvxdi3): Add TARGET_EVEX512 for 512 bit usage. * config/i386/i386.cc (standard_sse_constant_opcode): Ditto. * config/i386/sse.md (VF1_VF2_AVX512DQ): Ditto. (VF1_128_256VL):

[PATCH 18/18] Allow -mno-evex512 usage

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386.opt: Allow -mno-evex512. gcc/testsuite/ChangeLog: * gcc.target/i386/noevex512-1.c: New test. * gcc.target/i386/noevex512-2.c: Ditto. * gcc.target/i386/noevex512-3.c: Ditto. --- gcc/config/i386/i386.opt

[PATCH 03/18] [PATCH 2/5] Push evex512 target for 512 bit intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/avx512dqintrin.h: Add evex512 target for 512 bit intrins. --- gcc/config/i386/avx512dqintrin.h | 1840 +++--- 1 file changed, 926 insertions(+), 914 deletions(-) diff --git a/gcc/config/i386/avx512dqintrin

[PATCH 17/18] Support -mevex512 for AVX512FP16 intrins

2023-09-21 Thread Hu, Lin1

. (VEC_PERM_AVX2): Ditto. Co-authored-by: Hu, Lin1 --- gcc/config/i386/sse.md | 44 -- 1 file changed, 21 insertions(+), 23 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index a5a95b9de66..25d53e15dce 100644 --- a/gcc/config/i386/sse.md

[PATCH 07/18] [PATCH 1/5] Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. * config/i386/i386-builtins.cc (ix86_init_mmx_sse_builtins): Ditto. --- gcc/config/i386/i386-builtin.def | 648 +++ gcc/config/i38

[PATCH 05/18] [PATCH 4/5] Push evex512 target for 512 bit intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config.gcc: Add avx512bitalgvlintrin.h. * config/i386/avx5124fmapsintrin.h: Add evex512 target for 512 bit intrins. * config/i386/avx5124vnniwintrin.h: Ditto. * config/i386/avx512bf16intrin.h: Ditto. * config/i3

[PATCH 10/18] [PATCH 4/5] Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. --- gcc/config/i386/i386-builtin.def | 188 +++ 1 file changed, 94 insertions(+), 94 deletions(-) diff --git a/gcc/config/i386/i386-builtin.def b

[PATCH 15/18] Support -mevex512 for AVX512BW intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_expand_vector_init_duplicate): Make sure there is EVEX512 enabled. (ix86_expand_vecop_qihi2): Refuse V32QI->V32HI when no EVEX512. * config/i386/i386.cc (ix86_hard_regno_mode_ok): Disable 64 bit

[PATCH 04/18] [PATCH 3/5] Push evex512 target for 512 bit intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/avx512bwintrin.h: Add evex512 target for 512 bit intrins. --- gcc/config/i386/avx512bwintrin.h | 291 --- 1 file changed, 153 insertions(+), 138 deletions(-) diff --git a/gcc/config/i386/avx512bwintrin

[PATCH 13/18] Support -mevex512 for AVX512F intrins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtins.cc (ix86_vectorize_builtin_gather): Disable 512 bit gather when !TARGET_EVEX512. * config/i386/i386-expand.cc (ix86_valid_mask_cmp_mode): Add TARGET_EVEX512. (ix86_expand_int_sse_cmp):

[PATCH 12/18] Disable zmm register and 512 bit libmvec call when !TARGET_EVEX512

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-expand.cc (ix86_broadcast_from_constant): Disable zmm broadcast for !TARGET_EVEX512. * config/i386/i386-options.cc (ix86_option_override_internal): Do not use PVW_512 when no-evex512. (ix86_simd_clone_a

[PATCH 11/18] [PATCH 5/5] Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins

2023-09-21 Thread Hu, Lin1

From: Haochen Jiang gcc/ChangeLog: * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_EVEX512. --- gcc/config/i386/i386-builtin.def | 156 +++ 1 file changed, 78 insertions(+), 78 deletions(-) diff --git a/gcc/config/i386/i386-builtin.def b

RE: [PATCH 00/18] Support -mevex512 for AVX512

2023-09-27 Thread Hu, Lin1

__; #endif If we understand correctly, we'll consider the request. But since we're about to have a vacation, follow-up replies may be a bit slower. BRs, Lin -Original Message- From: ZiNgA BuRgA Sent: Thursday, September 28, 2023 8:32 AM To: Hu, Lin1 ; gcc-patches@gcc.gnu.org

[PATCH] i386: Refactor vcvttps2qq/vcvtqq2ps patterns.

2024-06-26 Thread Hu, Lin1

Hi, all This patch aims to refactor vcvttps2qq/vcvtqq2ps patterns to remove the redundant round_*_modev8sf_condition. Bootstrapped and regtested on x86-64-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/sse.md (float2): Refactor the pattern. (unspe

[PATCH] vect: Fix ICE caused by missing check for TREE_CODE == SSA_NAME

2024-07-03 Thread Hu, Lin1

Hi, all I forgot to check whether the tree's code is SSA_NAME; this is now fixed. Bootstrapped and regtested on {x86-64, aarch64}-linux-gnu, OK for trunk? BRs, Lin 2024-07-03 Hu, Lin1 Andrew Pinski gcc/ChangeLog: PR tree-optimization/115753 * tree-vect-stm

[PATCH] i386: Refactor ssedoublemode

2024-07-04 Thread Hu, Lin1

Hi, all ssedoublemode's double should mean double type, like SI -> DI. And we need to refactor some patterns with instead of . Bootstrapped and regtested on x86-64-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/sse.md (ssedoublemode): Fix the mode_attr. --- gcc/config

[PATCH v2] i386: Refactor ssedoublemode

2024-07-05 Thread Hu, Lin1

I modified the changelog and comments. ssedoublemode's double should mean double type, like SI -> DI. And we need to refactor some patterns with instead of . BRs, Lin gcc/ChangeLog: * config/i386/sse.md (ssedoublemode): Remove mappings to double of elements and mapping vector

[PATCH] i386: extend trunc{128}2{16,32,64}'s scope.

2024-07-14 Thread Hu, Lin1

Hi, all Based on actual usage, trunc{128}2{16,32,64} use some instructions from sse/sse3, so extend their scope to widen the scope of optimization. Bootstrapped and regtested on x86-64-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/107432 * config/i386/sse.md

RE: [PATCH] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-14 Thread Hu, Lin1

> -Original Message- > From: Richard Biener > Sent: Tuesday, May 14, 2024 8:23 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH] vect: generate suitable convert insn for int -> int, > float -> >

[PATCH 0/3] Optimize __builtin_convertvector for x86-64-v4 and

2024-05-22 Thread Hu, Lin1

64-pc-linux-gnu. BRs, Lin Hu, Lin1 (3): vect: generate suitable convert insn for int -> int, float -> float and int <-> float. vect: Support v4hi -> v4qi. vect: support direct conversion under x86-64-v3. gcc/config/i386/i386-expand.cc | 47 +++- gcc/confi

[PATCH 2/3] vect: Support v4hi -> v4qi.

2024-05-22 Thread Hu, Lin1

gcc/ChangeLog: PR target/107432 * config/i386/mmx.md (truncv4hiv4qi2): New define_insn. gcc/testsuite/ChangeLog: PR target/107432 * gcc.target/i386/pr107432-6.c: Add test. --- gcc/config/i386/mmx.md | 10 ++ gcc/testsuite/gcc.target/i386/pr107432-1.c

[PATCH 3/3] vect: support direct conversion under x86-64-v3.

2024-05-22 Thread Hu, Lin1

gcc/ChangeLog: PR 107432 * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f): New function for generate a series of suitable insn. * config/i386/i386-protos.h (ix86_expand_trunc_with_avx2_noavx512f): Define new function. * config/i38

[PATCH 1/3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-22 Thread Hu, Lin1

gcc/ChangeLog: PR target/107432 * tree-vect-generic.cc (supportable_indirect_narrowing_operation): New function for support indirect narrowing convert. (supportable_indirect_widening_operation): New function for support indirect widening convert.

RE: [PATCH 3/3] vect: support direct conversion under x86-64-v3.

2024-05-23 Thread Hu, Lin1

> -Original Message- > From: Hongtao Liu > Sent: Thursday, May 23, 2024 2:42 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com; rguent...@suse.de > Subject: Re: [PATCH 3/3] vect: support direct conversion under x86-64-v3. > >

[PATCH] i386: Optimize EQ/NE comparison between avx512 kmask and -1.

2024-05-28 Thread Hu, Lin1

Hi all, This patch aims to achieve EQ/NE comparison between avx512 kmask and -1 by using kortest and checking CF. Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,-m64}. Ok for trunk? BRs, Lin gcc/ChangeLog: PR target/113609 * config/i386/sse.md (*kortest_cmp_se
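To make the flag-based comparison concrete, here is a small portable model of the 16-bit KORTEST flag behaviour (ZF is set when the OR of the two masks is zero, CF when it is all ones). It is only a software sketch: the type `kortest_flags` and the functions `kortestw_model` and `mask16_is_all_ones` are hypothetical names, not code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flags produced by the modelled 16-bit kortest. */
typedef struct {
    bool zf;  /* set when (a | b) == 0       */
    bool cf;  /* set when (a | b) == 0xFFFF  */
} kortest_flags;

/* Software model of KORTESTW: OR the two masks and derive ZF/CF. */
static kortest_flags kortestw_model(uint16_t a, uint16_t b)
{
    uint16_t r = (uint16_t)(a | b);
    kortest_flags f = { r == 0, r == 0xFFFFu };
    return f;
}

/* "k == -1" for a 16-bit mask: kortest the mask with itself, read CF. */
static bool mask16_is_all_ones(uint16_t k)
{
    return kortestw_model(k, k).cf;
}
```

In this model, `k == -1` and `k != -1` become a single flag test after one kortest, rather than a move to a general register followed by a compare.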

[PATCH 2/3 v2] vect: Support v4hi -> v4qi.

2024-05-29 Thread Hu, Lin1

Besides adding TARGET_MMX_WITH_SSE, I merged two patterns. BRs, Lin gcc/ChangeLog: PR target/107432 * config/i386/mmx.md (VI2_32_64): New mode iterator. (mmxhalfmode): New mode attr. (mmxhalfmodelower): Ditto. (truncv2hiv2qi2): Extend mode v4hi and change name from trunc

[PATCH 3/3 v2] vect: support direct conversion under x86-64-v3.

2024-05-29 Thread Hu, Lin1

According to hongtao's suggestion, I support some trunc in mmx.md under x86-64-v3, and optimize ix86_expand_trunc_with_avx2_noavx512f. BRs, Lin gcc/ChangeLog: PR 107432 * config/i386/i386-expand.cc (ix86_expand_trunc_with_avx2_noavx512f): New function for generate a serie

[PATCH] i386: Handle target of __builtin_ia32_cmp[p|s][s|d] from avx into sse/sse2/avx

2024-05-29 Thread Hu, Lin1

Hi, all This patch aims to extend __builtin_ia32_cmp[p|s][s|d] from avx to sse/sse2/avx, where its immediate is in range of [0, 7]. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/avxintrin.h: Move cmp[p|s][s|d] to [e|x]mmintrin.h,

RE: [PATCH 1/3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-31 Thread Hu, Lin1

> -Original Message- > From: Richard Biener > Sent: Wednesday, May 29, 2024 5:41 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: Re: [PATCH 1/3] vect: generate suitable convert insn for int -> int, > float &

RE: [PATCH 1/3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-03 Thread Hu, Lin1

> -Original Message- > From: Richard Biener > Sent: Friday, May 31, 2024 8:41 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH 1/3] vect: generate suitable convert insn for int -> int, > float > ->

RE: [PATCH 1/3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-03 Thread Hu, Lin1

> -Original Message- > From: Richard Biener > Sent: Monday, June 3, 2024 5:03 PM > To: Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH 1/3] vect: generate suitable convert insn for int -> int, > float > ->

[PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-10 Thread Hu, Lin1

I wrap a part of code about indirect conversion. The API refers to supportable_narrowing/widening_operations. BRs, Lin gcc/ChangeLog: PR target/107432 * tree-vect-generic.cc (expand_vector_conversion): Support convert for int -> int, float -> float and int <-> fl

[PATCH] i386: Refine all cvtt* instructions with UNSPEC instead of FIX/UNSIGNED_FIX.

2024-06-13 Thread Hu, Lin1

Hi, all This patch aims to refine all cvtt* instructions with UNSPEC instead of FIX/UNSIGNED_FIX, because the intrinsics should behave as documented. Bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/115161 * config/i386/i386-bui

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-16 Thread Hu, Lin1

Ping this thread. BRs, Lin -Original Message- From: Hu, Lin1 Sent: Tuesday, June 11, 2024 2:49 PM To: gcc-patches@gcc.gnu.org Cc: Liu, Hongtao ; ubiz...@gmail.com; rguent...@suse.de Subject: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float a

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-20 Thread Hu, Lin1

> >else if (ret_elt_bits > arg_elt_bits) > > modifier = WIDEN; > > > > + if (supportable_convert_operation (code, ret_type, arg_type, &code1)) > > +{ > > + g = gimple_build_assign (lhs, code1, arg); > > + gsi_replace (gsi, g, false); > > + return; > > +} > > Given

RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-24 Thread Hu, Lin1

> -Original Message- > From: Tamar Christina > Sent: Monday, June 24, 2024 10:12 PM > To: Richard Biener ; Hu, Lin1 > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; > ubiz...@gmail.com > Subject: RE: [PATCH 1/3 v3] vect: generate suitable convert insn for int -> >

[PATCH 1/3 v4] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-24 Thread Hu, Lin1

Hi, This is the current version. I haven't made any major changes to the original code, I think it will have less impact on your code. And I think the current API is sufficient to support the mode selection you mentioned, if you have any concerns you can mention them. I can tweak it further.

[PATCH 1/3 v5] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-06-26 Thread Hu, Lin1

Hi, This is the latest version. I modified some comments and retested the patch on x86-64-linux-gnu. I'll wait another day to see what else Tamar has to say about the API; if there is nothing, I will upstream this patch tomorrow. BRs, Lin gcc/ChangeLog: PR target/107432 * tree-vect-generic.cc

[PATCH] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-07 Thread Hu, Lin1

Hi, all This patch aims to optimize __builtin_convertvector. We want the function to generate more efficient insns in some situations, such as v2si -> v2di. The patch has been bootstrapped and regtested on x86_64-pc-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/107432
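For readers unfamiliar with the builtin, the v2si -> v2di case mentioned above might look like the following at the source level. This is a minimal sketch using the GNU vector extension; the typedef names follow the v2si/v2di shorthand from the message, and `widen_v2si` is a hypothetical function name. Which instructions the compiler actually emits for it is not claimed here.

```c
#include <stdint.h>

/* Two 32-bit ints (8 bytes) and two 64-bit ints (16 bytes). */
typedef int32_t v2si __attribute__((vector_size(8)));
typedef int64_t v2di __attribute__((vector_size(16)));

/* Element-wise sign-extending conversion, int32 -> int64. */
static v2di widen_v2si(v2si x)
{
    return __builtin_convertvector(x, v2di);
}
```

Each element is converted as if by a scalar cast, so negative inputs stay negative after widening.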

[PATCH] i386: Fix some intrinsics without alignment requirements.

2024-05-07 Thread Hu, Lin1

Hi all, This patch aims to fix some intrinsics that have no alignment requirement but raised runtime errors. Bootstrapped and tested on x86_64-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/84508 * config/i386/emmintrin.h (_mm_load_sd): Remove alignment

RE: [committed] testsuite: Fix up pr84508* tests [PR84508]

2024-05-09 Thread Hu, Lin1

> -Original Message- > From: Jakub Jelinek > Sent: Friday, May 10, 2024 3:04 AM > To: Hongtao Liu > Cc: Hu, Lin1 ; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; ubiz...@gmail.com > Subject: [committed] testsuite: Fix up pr84508* tests [PR84508] > > On Thu, May 0

RE: [PATCH] vect: generate suitable convert insn for int -> int, float -> float and int <-> float.

2024-05-13 Thread Hu, Lin1

Do you have any advice? BRs, Lin -Original Message- From: Hu, Lin1 Sent: Wednesday, May 8, 2024 9:38 AM To: gcc-patches@gcc.gnu.org Cc: Liu, Hongtao ; ubiz...@gmail.com Subject: [PATCH] vect: generate suitable convert insn for int -> int, float -> float and int <-> flo

RE: [PATCH 2/8] i386: Optimize ordered and nonequal

2024-09-02 Thread Hu, Lin1

> -Original Message- > From: Jakub Jelinek > Sent: Tuesday, September 3, 2024 2:56 AM > To: Andrew Pinski > Cc: Jiang, Haochen ; Richard Biener > ; gcc-patches@gcc.gnu.org; Liu, Hongtao > ; ubiz...@gmail.com; Hu, Lin1 > Subject: Re: [PATCH 2/8] i386: Optim

RE: [PATCH 2/8] i386: Optimize ordered and nonequal

2024-09-03 Thread Hu, Lin1

> -Original Message- > From: Hu, Lin1 > Sent: Tuesday, September 3, 2024 2:05 PM > To: Jakub Jelinek ; Andrew Pinski ; > Liu, Hongtao > Cc: Jiang, Haochen ; Richard Biener > ; gcc-patches@gcc.gnu.org; ubiz...@gmail.com > Subject: RE: [PATCH 2/8] i386: Optim

[PATCH] Match: Fix ordered and nonequal

2024-09-03 Thread Hu, Lin1

Hi, all This patch is a fix. We need to add :c for bit_and, because bit_and is commutative. Also, (ltgt @0 @1) is simpler than (bit_not (uneq @0 @1)). Bootstrapped/regtested on x86-64-pc-linux-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * match.pd: Fix match for (bit_and (ordered @
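The equivalence behind this simplification can be checked at the C level: "ordered and not equal" is the same predicate as "less or greater", and both are the negation of "unordered or equal". The sketch below uses the standard `<math.h>` comparison macros; the function names are hypothetical and the code illustrates the identity only, not the match.pd pattern itself.

```c
#include <math.h>
#include <stdbool.h>

/* ordered(a, b) && a != b, written out directly. */
static bool ordered_and_ne(double a, double b)
{
    return !isunordered(a, b) && a != b;
}

/* The ltgt form: true when a < b or a > b, false if either is NaN. */
static bool ltgt(double a, double b)
{
    return islessgreater(a, b) != 0;
}

/* The uneq form that ltgt negates: unordered or equal. */
static bool uneq(double a, double b)
{
    return isunordered(a, b) || a == b;
}
```

For any pair of inputs, `ordered_and_ne(a, b)`, `ltgt(a, b)` and `!uneq(a, b)` agree, including when one operand is NaN.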

RE: [PATCH] Match: Fix ordered and nonequal

2024-09-04 Thread Hu, Lin1

I typed Hongtao's e-mail address wrong. > -Original Message- > From: Hu, Lin1 > Sent: Wednesday, September 4, 2024 1:44 PM > To: gcc-patches@gcc.gnu.org > Cc: hontao@intel.com; ubiz...@gmail.com; rguent...@suse.de; > ja...@redhat.com; pins...@gmail.com > S

[PATCH] testsuite: Fix xorsign.c, vect-double-2.c fails with -march=x86-64-v2

2024-09-05 Thread Hu, Lin1

Hi, all These testcases fail with -march=x86-64-v2, so add -mno-sse4 to avoid these unexpected failures. Bootstrap and regtest running on x86-64-linux-gnu, pushed as obvious. BRs, Lin gcc/testsuite/ChangeLog: PR testsuite/116608 * gcc.target/i386/vect-double-2.c: Add extra

[PATCH] i386: Fix some patterns's mem attribute.

2024-10-09 Thread Hu, Lin1

Hi, all This is another patch to modify some patterns' type attr from ssemov to ssemov2. Some ssemov patterns' mem attr should be load when their operand 2 is a memory operand. Bootstrapped and regtested on x86-64-linux-pc, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/sse.md

RE: [PATCH v2] i386: Handling exception input of __builtin_ia32_prefetch. [PR117416]

2024-11-05 Thread Hu, Lin1

> -Original Message- > From: Hu, Lin1 > Sent: Tuesday, November 5, 2024 1:34 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com > Subject: [PATCH v2] i386: Handling exception input of > __builtin_ia32_prefetch. [PR117416] > > Add handler

[PATCH] i386: Handling exception input of __builtin_ia32_prefetch. [PR117416]

2024-11-04 Thread Hu, Lin1

Hi, all __builtin_ia32_prefetch's op1 should be between 0 and 2. So add an error handler. Bootstrapped and regtested on x86_64-pc-linux-gnu; there is an unrelated FAIL whose root cause has yet to be found, so I am just sending the patch for review. BRs, Lin gcc/ChangeLog: PR target/117416 * co

[PATCH v2] i386: Handling exception input of __builtin_ia32_prefetch. [PR117416]

2024-11-04 Thread Hu, Lin1

Add handler for op3, and the previously stated fail is a random fail not related to this change, OK for trunk? op1 should be between 0 and 2. Add an error handler, and op3 should be 0 or 1, raise a warning, when op3 is an invalid value. gcc/ChangeLog: PR target/117416 * config/i3

[PATCH] i386: Add OPTION_MASK_ISA2_EVEX512 for some AVX512 instructions.

2024-11-05 Thread Hu, Lin1

Hi, all This patch aims to add OPTION_MASK_ISA2_EVEX512 for all avx512 512-bit builtin functions, and raise an error when these builtin functions are used with -mno-evex512. Bootstrapped and Regtested on x86-64-pc-linux-gnu, OK for trunk and backport to GCC14? BRs, Lin gcc/ChangeLog: PR targ

[PATCH v3] i386: Zero extend 32-bit address to 64-bit with option -mx32 -maddress-mode=long. [PR 117418]

2024-11-10 Thread Hu, Lin1

OK, added check for target. Bootstrapped and Regtested on x86-64-linux-pc-gnu, OK for trunk? BRs, Lin -maddress-mode=long lets Pmode = DI_mode, so zero extend the 32-bit address to 64-bit and use a 64-bit register as a pointer to avoid raising an ICE. gcc/ChangeLog: PR target/117418

[PATCH] i386: Add ssemov2, sseicvt2 for some load instructions that use memory on operand2

2024-09-18 Thread Hu, Lin1

Hi, all The memory attr of some instructions should be 'load', but it is currently 'none'. This patch adds two new types, ssemov2 and sseicvt2, for some load instructions that use memory on operands. So their memory attr will be 'load'. Bootstrapped and Regtested on x86-64-pc-linux-gnu, OK for trun

[PATCH] i386: Add -mavx512vl for pr117304-1.c

2024-11-06 Thread Hu, Lin1

Hi, all Testing pr117304-1.c on a machine with only avx2 generates some different hints, so add -mavx512vl to its option list. Bootstrapped and regtested on x86-64-pc-linux-gnu. I think it is an obvious commit, but I will still wait a while in case someone has other suggestions. BRs, Lin gcc/

RE: [PATCH] i386: Add -mavx512vl for pr117304-1.c

2024-11-06 Thread Hu, Lin1

> -Original Message- > From: Liu, Hongtao > Sent: Thursday, November 7, 2024 11:41 AM > To: Hu, Lin1 ; gcc-patches@gcc.gnu.org > Cc: ubiz...@gmail.com > Subject: RE: [PATCH] i386: Add -mavx512vl for pr117304-1.c > > > > > -Original Message-

[PATCH] i386: Modify regexp of pr117304-1.c

2024-11-06 Thread Hu, Lin1

OK, so just modify the regexp. Since the test doesn't care if the hint is correct, modify the regexp of the hint part to avoid future changes to the hint that would cause the test to fail. BRs, Lin gcc/testsuite/ChangeLog: * gcc.target/i386/pr117304-1.c: Modify regexp. --- gcc/testsuit

[PATCH] i386: Disallow long address mode in the x32 mode. [PR 117418]

2024-11-07 Thread Hu, Lin1

Hi, all -maddress-mode=long will let Pmode = DI_mode, but -mx32 requests the x32 ABI. So raise an error to avoid an ICE. Bootstrapped and regtested, OK for trunk? BRs, Lin gcc/ChangeLog: PR target/117418 * config/i386/i386-options.cc (ix86_option_override_internal): raise an er

[PATCH v2] i386: Zero extend 32-bit address to 64-bit with option -mx32 -maddress-mode=long. [PR 117418]

2024-11-07 Thread Hu, Lin1

Thanks for your suggestions and answer. This is the current version. There is no problem in my test environment, but further testing is still under way; sent for review. BRs, Lin -maddress-mode=long lets Pmode = DI_mode, so zero extend the 32-bit address to 64-bit and use a 64-bit register as a pointer for

[PATCH] i386: Fix AVX10.2 sat cvt intrinsic.

2025-03-24 Thread Hu, Lin1

Hi, all This patch applies the fix that was missed for the vcvttph2iubs testcase. Bootstrapped and tested on x86_64-linux-gnu{-m32,-m64}. Committed as an obvious change, like the previously approved fix patch. BRs, Lin gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-vcvttph2iubs-2.c: Mod

[PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Hu, Lin1

Modify ChangeLog. This patch aims to add "s_" after 'cvt' to represent saturation. gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h (_mm512_mask_cvtx2ps_ph): Formatting fixes (_mm512_mask_cvtx_round2ps_ph): Ditto (_mm512_maskz_cvtx_round2ps_ph): Ditto (_mm512

[PATCH] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Hu, Lin1

Hi, all This patch aims to add "s_" after 'cvt' to represent saturation. Bootstrapped and regtested on x86_64-linux-gnu{-m32,-m64}, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/avx10_2-512convertintrin.h: Modify intrin name. * config/i386/avx10_2convertintrin.h: Ditto. gc

RE: [PATCH 0/4] Fix AVX10.2 SAT CVT.

2025-03-19 Thread Hu, Lin1

Thanks, forgot to mention, these patches are bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,-m64}. BRs, Lin > -Original Message- > From: Liu, Hongtao > Sent: Thursday, March 20, 2025 9:43 AM > To: Hu, Lin1 ; gcc-patches@gcc.gnu.org > Cc: ubiz...@gmail.com > S

[PATCH] i386: Fix AVX10.2 SAT CVT testcases.

2025-03-20 Thread Hu, Lin1

Hi, res_ref will be modified after MASK_ZERO, init res_ref2 for rounding control intrinsics. Bootstrapped and regtested on x86-64-pc-linux-gnu{-m32,-m64}, OK for trunk? BRs, Lin gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Fix testcase. * gcc.target/i3

[PATCH 2/4] i386: Add AVX10.2 SAT CVT Intrinsics without Rounding Control

2025-03-19 Thread Hu, Lin1

gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Add new intrinsics. * config/i386/avx10_2satcvtintrin.h: Ditto. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE (V32HI, V32HF, V32HI, USI), (V16SI, V16SF, V16SI, UHI), (V8DI, V8SF, V8DI, UQI

[PATCH] i386: Set attr "addr" as "gpr16" for constraint "jm". [PR 119425]

2025-03-25 Thread Hu, Lin1

Hi, all This patch aims to ensure that each alternative with constraint "jm" sets addr "gpr16"; otherwise an ICE may be raised in the reload pass. Bootstrapped and Regtested for x86_64-pc-linux-gnu{-m32,-m64}, ok for trunk? BRs, Lin gcc/ChangeLog: PR target/119425 * config/i386/sse.md:

RE: [PATCH] i386: Add attr_isa for vaes patterns to sync with attr gpr16. [pr119473]

2025-03-27 Thread Hu, Lin1

Bootstrapped and Regtested on x86_64-linux-gnu{-m32,-m64}, OK for trunk? BRs, Lin > -Original Message- > From: Hu, Lin1 > Sent: Friday, March 28, 2025 1:55 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com; Wang, Hongyu > > Subject: [PATCH] i

[PATCH] i386: Add attr_isa for vaes patterns to sync with attr gpr16. [pr119473]

2025-03-27 Thread Hu, Lin1

For vaes patterns with jm constraint and gpr16 attr, it requires "isa" attr to distinguish avx/avx512 alternatives in ix86_memory_address_reg_class. Also adds missing type and mode attributes for those vaes patterns. gcc/ChangeLog: PR target/119473 * config/i386/sse.md (vaesd

[PATCH 3/4] i386: Fix AVX10.2 SAT CVT testcases.

2025-03-19 Thread Hu, Lin1

Add missing testcases. gcc/testsuite/ChangeLog: * gcc.target/i386/avx10_2-512-satcvt-1.c: Add testcase. * gcc.target/i386/avx10_2-512-vcvtbf162ibs-2.c: Ditto * gcc.target/i386/avx10_2-512-vcvtbf162iubs-2.c: Ditto * gcc.target/i386/avx10_2-512-vcvtph2ibs-2.c: Ditto

[PATCH 0/4] Fix AVX10.2 SAT CVT.

2025-03-19 Thread Hu, Lin1

Hi, all This series of patches fixes three issues in AVX10.2 SAT CVT: 1. Adds ep[i|u]8 suffix to *[i|u]bs intrinsic names. 2. Introduces SAT CVT intrinsics without rounding control. 3. Marks saturation by adding 's_' before core name. BRs, Lin Hu, Lin1 (4): i386: Update Suffix f

[PATCH 1/4] i386: Update Suffix for AVX10.2 SAT CVT Intrinsics

2025-03-19 Thread Hu, Lin1

The intrinsic names for *[i|u]bs instructions in AVX10.2 are missing the required _ep[i|u]8 suffix. This patch aims to fix the issue. gcc/ChangeLog: * config/i386/avx10_2-512satcvtintrin.h: Change *i[u]bs's type suffix of intrin name. * config/i386/avx10_2satcvtintrin.h:

RE: [PATCH v2] i386: Add "s_" as Saturation for AVX10.2 Converting Intrinsics.

2025-03-25 Thread Hu, Lin1

More details: Alignment with llvm (https://github.com/llvm/llvm-project/pull/131592) BRs, Lin > -Original Message- > From: Hu, Lin1 > Sent: Tuesday, March 25, 2025 4:10 PM > To: gcc-patches@gcc.gnu.org > Cc: Liu, Hongtao ; ubiz...@gmail.com > Subject: [PATCH v2]

[PATCH] i386: Add more forms peephole2 for adc/sbb

2025-05-26 Thread Hu, Lin1

Hi, all Enabling -mapxf changes some patterns for adc/sbb. Hence gcc emits an extra mov, like: movq 8(%rdi), %rax; adcq %rax, 8(%rsi), %rax; movq %rax, 8(%rdi) rather than: movq 8(%rsi), %rax; adcq %rax, 8(%rdi) The patch add more ki
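For context, a carry-propagating add on in-memory limbs is the kind of source that can lead to an adc with a memory operand. The sketch below is an assumed illustration, not the testcase from the patch: `add_2limb` is a hypothetical name, and whether a given compiler and option set produces exactly the sequences quoted above is not claimed.

```c
#include <stdint.h>

/* Add the 128-bit value src (two little-endian 64-bit limbs) into dst.
   The carry out of the low limb feeds the high-limb addition, which is
   where an add-with-carry instruction would typically be used. */
static void add_2limb(uint64_t dst[2], const uint64_t src[2])
{
    uint64_t lo = dst[0] + src[0];
    uint64_t carry = lo < src[0];      /* 1 if the low add wrapped */
    dst[0] = lo;
    dst[1] = dst[1] + src[1] + carry;  /* high limb plus carry-in */
}
```

The high-limb update reads and writes the same memory location, matching the read-modify-write shape of the shorter sequence in the message.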

[PATCH] i386: Add more peephole2 for APX NDD

2025-05-29 Thread Hu, Lin1

Hi, The patch aims to optimize: movb (%rdi), %al; movq %rdi, %rbx; xorl %esi, %eax, %edx; movb %dl, (%rdi); cmpb %sil, %al; jne to: xorb %sil, (%rdi); movq %rdi, %rbx; jne Reduce 2 mov and 1 cmp instructi

[PATCH] i386: Fix vmovvdup's mem attribute

2025-06-04 Thread Hu, Lin1

Hi, Some vmovvdup pattern's type attribute is sselog1 and then mem attribute is both. Modify type attribute according to other patterns about vmovvdup. Bootstrapped and regtested on x86_64-linux-pc-gnu, OK for trunk? BRs, Lin gcc/ChangeLog: * config/i386/sse.md (avx512f_movddup

[PATCH] i386: Add a new peeophole2 for PR91384 under APX_F

2025-06-04 Thread Hu, Lin1

gcc/ChangeLog: PR target/91384 * config/i386/i386.md: Add new peephole2 to optimize *negsi_1 followed by *cmpsi_ccno_1 with APX_F. gcc/testsuite/ChangeLog: PR target/91384 * gcc.target/i386/pr91384-1.c: New test. --- gcc/config/i386/i386.md

[PATCH] i386: Set SRF, GRR, CWF, GNR, DMR, ARL and PTL issue rate

2025-06-11 Thread Hu, Lin1

Hi, This patch aims to set SRF issue rate to 4, GNR issue rate to 6. According to SPEC2017 tests, the patch has little effect on performance. For GRR, CWF, DMR, ARL and PTL, the patch sets their issue rate to 6. We are waiting for more information to update them. Bootstrapped and regtested on x86_64-li

[PATCH] Add myself for write after approval

2023-07-30 Thread Hu, Lin1 via Gcc-patches

ChangeLog: * MAINTAINERS (Write After Approval): Add myself. --- MAINTAINERS | 1 + 1 file changed, 1 insertion(+) diff --git a/MAINTAINERS b/MAINTAINERS index 49aa6bae73b..90e2c81f0c2 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -460,6 +460,7 @@ Matthew Hiller

[PATCH] i386: refactor macros.

2023-06-28 Thread Hu, Lin1 via Gcc-patches

Hi, all This patch aims to refactor macros in case some other thing is added to AMX_TILE_SET in future. OK for trunk? BRs, Lin gcc/ChangeLog: * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_AMX_INT8_SET): Change OPTION_MASK_ISA2_AMX_TILE to OPTION_MASK_ISA2_AMX_TILE_SET.

[PATCH] i386:Add missing OPTION_MASK_ISA_AVX512VL in i386-builtin.def for VAES builtins

2023-03-13 Thread Hu, Lin1 via Gcc-patches

The implementation of these builtins requires support for both AVX512VL and VAES. However, the builtins didn't request AVX512VL. As a result, compiling pr109117-1.c with the options -mvaes -mno-avx512vl caused an ICE. This patch aims to fix the bug. gcc/ChangeLog: PR target/109117

RE: [PATCH] i386:Add missing OPTION_MASK_ISA_AVX512VL in i386-builtin.def for VAES builtins

2023-03-14 Thread Hu, Lin1 via Gcc-patches

It has been regtested on x86_64-pc-linux-gnu. OK for trunk? Thanks. Lin -Original Message- From: Uros Bizjak Sent: Tuesday, March 14, 2023 3:05 PM To: Hu, Lin1 Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao Subject: Re: [PATCH] i386:Add missing OPTION_MASK_ISA_AVX512VL in i386-builtin.def for

[PATCH] i386: Fix incorrect intrinsic signature for AVX512 s{lli|rai|rli}

2023-05-24 Thread Hu, Lin1 via Gcc-patches

Hi all, This patch aims to fix incorrect intrinsic signature for _mm{512|256|}_s{lli|rai|rli}_epi*. And it has been tested on x86_64-pc-linux-gnu. OK for trunk? BRs, Lin gcc/ChangeLog: PR target/109173 PR target/109174 * config/i386/avx512bwintrin.h (_mm512_srli_epi16)

RE: [PATCH] i386: Fix incorrect intrinsic signature for AVX512 s{lli|rai|rli}

2023-05-25 Thread Hu, Lin1 via Gcc-patches

OK, I have updated the ChangeLog and fixed part of the formatting. The attached file is the new version. -Original Message- From: Hongtao Liu Sent: Thursday, May 25, 2023 11:40 AM To: Hu, Lin1 Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao ; ubiz...@gmail.com Subject: Re: [PATCH] i386: Fix

[PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hu, Lin1 via Gcc-patches

Hi all, This patch aims to optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1)), reducing the number of instructions required to produce the result. Regtested on x86_64-pc-linux-gnu. Ok for trunk? BRs, Lin gcc/ChangeLog: PR target/94962 * config/i386/co

RE: [PATCH] i386: Optimize code generation of __mm256_zextsi128_si256(__mm_set1_epi8(-1))

2022-09-22 Thread Hu, Lin1 via Gcc-patches

Hi, Hongtao I have modified this patch and regtested it on x86_64-pc-linux-gnu. BRs. Lin -Original Message- From: Hongtao Liu Sent: Friday, September 23, 2022 9:48 AM To: Hu, Lin1 Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao Subject: Re: [PATCH] i386: Optimize code generation of

[PATCH] testsuite: Fix up avx256-unaligned-store-3.c test.

2022-09-25 Thread Hu, Lin1 via Gcc-patches

Hi all, This patch aims to fix two unexpected failures that the avx256-unaligned-store-3.c test reports under "-march=cascadelake". Regtested on x86_64-pc-linux-gnu. Ok for trunk? BRs, Lin gcc/testsuite/ChangeLog: PR target/94962 * gcc.target/i386/avx256-unaligned-store-3.c

[PATCH 1/4] i386: Remove Meteorlake's family_model

2023-01-03 Thread Hu, Lin1 via Gcc-patches

Hi all, This patch aims to modify Meteorlake's family_model. Regtested on x86_64-pc-linux-gnu. Ok for trunk? BRs, Lin gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Remove case 0xb5 for meteorlake. --- gcc/common/config/i386/cpuinfo.h | 1 - 1 file changed, 1

[PATCH 2/4] Initial Emeraldrapids Support

2023-01-03 Thread Hu, Lin1 via Gcc-patches

gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_intel_cpu): Handle Emeraldrapids. * common/config/i386/i386-common.cc: Add Emeraldrapids. --- gcc/common/config/i386/cpuinfo.h | 2 ++ gcc/common/config/i386/i386-common.cc | 2 ++ 2 files changed, 4 insertions(+) diff --git

RE: [PATCH 2/4] Initial Emeraldrapids Support

2023-01-03 Thread Hu, Lin1 via Gcc-patches

"PATCH 2 Initial Emeraldrapids Support" aims to support Emeraldrapids for GCC. I omitted its description by mistake. -Original Message- From: Liu, Hongtao Sent: Tuesday, January 3, 2023 4:48 PM To: Hu, Lin1 ; gcc-patches@gcc.gnu.org Cc: ub


1 - 100 of 105 matches
