[gcc r15-167] Update libbid according to the latest Intel Decimal Floating-Point Math Library.

2024-05-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:affd77d3fe7bfb525b3fb23316d164e847ed02d1 commit r15-167-gaffd77d3fe7bfb525b3fb23316d164e847ed02d1 Author: liuhongt Date: Wed Mar 27 08:20:13 2024 +0800 Update libbid according to the latest Intel Decimal Floating-Point Math Library. The Intel Decimal Floatin

[gcc r15-235] Support dot_prod optabs for 64-bit vector.

2024-05-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fa911365490a7ca308878517a4af6189ffba7ed6 commit r15-235-gfa911365490a7ca308878517a4af6189ffba7ed6 Author: liuhongt Date: Wed Dec 20 11:43:25 2023 +0800 Support dot_prod optabs for 64-bit vector. gcc/ChangeLog: PR target/113079 * c

[gcc r15-236] Extend usdot_prodv*qi with vpmaddwd when AVXVNNI/AVX512VNNI is not available.

2024-05-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b974f54393ab2d2d16a0051a68c155455a92aad commit r15-236-g8b974f54393ab2d2d16a0051a68c155455a92aad Author: liuhongt Date: Mon Jan 8 15:13:41 2024 +0800 Extend usdot_prodv*qi with vpmaddwd when AVXVNNI/AVX512VNNI is not available. gcc/ChangeLog:

[gcc r15-234] Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf.

2024-05-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a9f642783853b60bb0a59562b8ab3ed10ec01641 commit r15-234-ga9f642783853b60bb0a59562b8ab3ed10ec01641 Author: liuhongt Date: Wed Dec 20 11:54:43 2023 +0800 Optimize 64-bit vector permutation with punpcklqdq + 128-bit vector pshuf. gcc/ChangeLog:

[gcc r15-499] x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563]

2024-05-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a71f90c5a7ae2942083921033cb23dcd63e70525 commit r15-499-ga71f90c5a7ae2942083921033cb23dcd63e70525 Author: Levy Hsu Date: Thu May 9 16:50:56 2024 +0800 x86: Add 3-instruction subroutine vector shift for V16QI in ix86_expand_vec_perm_const_1 [PR107563] Hi All

[gcc r15-529] Optimize ashift >> 7 to vpcmpgtb for vector int8.

2024-05-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0cc0956b3bb8bcbc9196075b9073a227d799e042 commit r15-529-g0cc0956b3bb8bcbc9196075b9073a227d799e042 Author: liuhongt Date: Tue May 14 18:39:54 2024 +0800 Optimize ashift >> 7 to vpcmpgtb for vector int8. Since there is no corresponding instruction, the shift op

[gcc r15-530] Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial.

2024-05-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:090714e6cf8029f4ff8883dce687200024adbaeb commit r15-530-g090714e6cf8029f4ff8883dce687200024adbaeb Author: liuhongt Date: Wed May 15 10:56:24 2024 +0800 Set d.one_operand_p to true when TARGET_SSSE3 in ix86_expand_vecop_qihi_partial. pshufb is available under

[gcc r15-717] Use pblendw instead of pand to clear upper 16 bits.

2024-05-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0ebaffccb294d90184ad78367de66b6307de3ac0 commit r15-717-g0ebaffccb294d90184ad78367de66b6307de3ac0 Author: liuhongt Date: Fri Mar 22 14:40:00 2024 +0800 Use pblendw instead of pand to clear upper 16 bits. For vec_pack_truncv8si/v4si w/o AVX512, (const_vect

[gcc r15-3058] Align predicates for operands[1] between mov and *mov_internal.

2024-08-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bb42c551905024ea23095a0eb7b58fdbcfbcaef6 commit r15-3058-gbb42c551905024ea23095a0eb7b58fdbcfbcaef6 Author: liuhongt Date: Tue Aug 20 14:41:00 2024 +0800 Align predicates for operands[1] between mov and *mov_internal. > It's not obvious to me why movv16qi req

[gcc r15-3078] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 commit r15-3078-g6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128_

[gcc r14-10608] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:27dc1533b6dfc49f3912c524db51d6c372a5ac3d commit r14-10608-g27dc1533b6dfc49f3912c524db51d6c372a5ac3d Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128

[gcc r13-8987] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:aea374238cec1a1e53fb79575d2f998e16926999 commit r13-8987-gaea374238cec1a1e53fb79575d2f998e16926999 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128_

[gcc r12-10682] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b4bc34db3f2948e37ad55a09870635e88c54c7d3 commit r12-10682-gb4bc34db3f2948e37ad55a09870635e88c54c7d3 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128

[gcc r13-8988] Fix testcase failure.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ea9c508927ec032c6d67a24df59ffa429e4d3d95 commit r13-8988-gea9c508927ec032c6d67a24df59ffa429e4d3d95 Author: liuhongt Date: Thu Aug 22 14:31:40 2024 +0800 Fix testcase failure. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Use

[gcc r12-10683] Fix testcase failure.

2024-08-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:141d8aa375ea32c05f0d437828e6a76f1a3ea4af commit r12-10683-g141d8aa375ea32c05f0d437828e6a76f1a3ea4af Author: liuhongt Date: Thu Aug 22 14:31:40 2024 +0800 Fix testcase failure. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Use

[gcc r15-3314] Check avx upper register for parallel.

2024-08-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ab214ef734bfc3dcffcf79ff9e1dd651c2b40566 commit r15-3314-gab214ef734bfc3dcffcf79ff9e1dd651c2b40566 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r14-10625] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ba9a3f105ea552a22d08f2d54dfdbef16af7c99e commit r14-10625-gba9a3f105ea552a22d08f2d54dfdbef16af7c99e Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r13-8999] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5e049ada87842947adaca5c607516396889f64d6 commit r13-8999-g5e049ada87842947adaca5c607516396889f64d6 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r12-10694] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:6585b06303d8fd9da907f443fc0da9faed303712 commit r12-10694-g6585b06303d8fd9da907f443fc0da9faed303712 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r15-3498] Handle const0_operand for *avx2_pcmp3_1.

2024-09-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a51f2fc0d80869ab079a93cc3858f24a1fd28237 commit r15-3498-ga51f2fc0d80869ab079a93cc3858f24a1fd28237 Author: liuhongt Date: Wed Sep 4 15:39:17 2024 +0800 Handle const0_operand for *avx2_pcmp3_1. *_eq3_1 supports nonimm_or_0_operand for op1 and op2, pass_com

[gcc r15-3558] Don't force_reg operands[3] when it's not const0_rtx.

2024-09-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c726a6643125a59e2ba6f992924a2d0098104578 commit r15-3558-gc726a6643125a59e2ba6f992924a2d0098104578 Author: liuhongt Date: Fri Sep 6 15:03:16 2024 +0800 Don't force_reg operands[3] when it's not const0_rtx. It fix the regression by a51f2fc0d80869ab079

[gcc r15-3579] Enable tune fuse_move_and_alu for GNR.

2024-09-10 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f80e4ba94e41410219bdcdb1a0f204ea3f148666 commit r15-3579-gf80e4ba94e41410219bdcdb1a0f204ea3f148666 Author: liuhongt Date: Tue Sep 10 15:04:58 2024 +0800 Enable tune fuse_move_and_alu for GNR. According to Intel Software Optimization Manual[1], the Redwood cov

[gcc r15-1638] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:aac00d09859cc5934bd0f7493d537b8430337773 commit r15-1638-gaac00d09859cc5934bd0f7493d537b8430337773 Author: liuhongt Date: Thu Jun 20 12:41:13 2024 +0800 Optimize a < 0 ? -1 : 0 to (signed)a >> 31. Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x
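
The identity named in the subject line above can be checked in scalar C. This is a minimal sketch, not the code from the commit: the function names are hypothetical, the element type is assumed to be a 32-bit signed integer, and the shift form relies on `>>` of a negative signed value being an arithmetic shift (implementation-defined in ISO C, documented as sign-extending by GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Select form: all-ones when x is negative, zero otherwise. */
static int32_t sign_mask_select(int32_t x)
{
    return x < 0 ? -1 : 0;
}

/* Shift form: assumes >> on a negative signed value sign-extends
   (implementation-defined in ISO C; GCC behaves this way). */
static int32_t sign_mask_shift(int32_t x)
{
    return x >> 31;
}
```

Both forms return -1 for any negative input and 0 otherwise, which is why the select can be replaced by a single arithmetic shift.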

[gcc r15-1673] Fix wrong cost of MEM when addr is a lea.

2024-06-26 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b8153b5417bed02f47354a14ad36100785dfdc47 commit r15-1673-gb8153b5417bed02f47354a14ad36100785dfdc47 Author: liuhongt Date: Mon Jun 24 17:53:22 2024 +0800 Fix wrong cost of MEM when addr is a lea. 416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8

[gcc r15-1733] Define mask as extern instead of uninitialized local variables.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5e1a9f4ccff390ae79a9b9d0d39b325f2b4ea925 commit r15-1733-g5e1a9f4ccff390ae79a9b9d0d39b325f2b4ea925 Author: liuhongt Date: Wed Jun 26 11:17:46 2024 +0800 Define mask as extern instead of uninitialized local variables. The testcases are supposed to scan for vpo

[gcc r15-1734] Extend lshifrtsi3_1_zext to ?k alternative.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8e1fa107a63b2e160b6bf69de4fe163dd3cebd80 commit r15-1734-g8e1fa107a63b2e160b6bf69de4fe163dd3cebd80 Author: liuhongt Date: Wed Jun 26 13:07:31 2024 +0800 Extend lshifrtsi3_1_zext to ?k alternative. late_combine will combine lshift + zero into *lshifrtsi3_1_zex

[gcc r15-1735] Enable flate-combine.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e62ea4fb8ffcab06ddd02f26db91b29b7270743f commit r15-1735-ge62ea4fb8ffcab06ddd02f26db91b29b7270743f Author: liuhongt Date: Wed Jun 26 13:52:24 2024 +0800 Enable flate-combine. Move pass_stv2 and pass_rpad after pre_reload pass_late_combine, also define tar

[gcc r15-1736] Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2e2dfa0095c3326a0a5fc2ff175918b42eeb044f commit r15-1736-g2e2dfa0095c3326a0a5fc2ff175918b42eeb044f Author: liuhongt Date: Mon Jun 17 17:16:46 2024 +0800 Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV) These define_insn_a

[gcc r15-1737] Lower AVX512 kmask comparison back to AVX2 comparison when op_{true, false} is vector -1/0.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b06a108f0fbffe12493b527224f6e4131a72beac commit r15-1737-gb06a108f0fbffe12493b527224f6e4131a72beac Author: liuhongt Date: Tue Jun 18 14:03:42 2024 +0800 Lower AVX512 kmask comparison back to AVX2 comparison when op_{true,false} is vector -1/0. gcc/ChangeLog

[gcc r15-1739] Add more splitter for mskmov with avx512 comparison.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:3cb204046c0db899750aee9480af4f1953a40ac3 commit r15-1739-g3cb204046c0db899750aee9480af4f1953a40ac3 Author: liuhongt Date: Wed Jun 19 13:12:00 2024 +0800 Add more splitter for mskmov with avx512 comparison. gcc/ChangeLog: PR target/115517

[gcc r15-1740] Adjust testcase for the regressed testcases after obsolete of vcond{, u, eq}.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e94e6ee495d95f29355bbc017214228a5e367638 commit r15-1740-ge94e6ee495d95f29355bbc017214228a5e367638 Author: liuhongt Date: Wed Jun 19 16:05:58 2024 +0800 Adjust testcase for the regressed testcases after obsolete of vcond{,u,eq}. > Richard suggests that we imp

[gcc r15-1738] Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:09737d9605521df9232d9990006c44955064f44e commit r15-1738-g09737d9605521df9232d9990006c44955064f44e Author: liuhongt Date: Tue Jun 18 15:52:02 2024 +0800 Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}. These versions of the min/max patterns implement exactly th

[gcc r15-1741] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb commit r15-1741-g2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb Author: liuhongt Date: Thu Jun 20 12:41:13 2024 +0800 Optimize a < 0 ? -1 : 0 to (signed)a >> 31. Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x

[gcc r15-1742] Remove vcond{, u, eq} expanders since they will be obsolete.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:55f80c690c5fa59836646565a9dee2a3f68374a0 commit r15-1742-g55f80c690c5fa59836646565a9dee2a3f68374a0 Author: liuhongt Date: Mon Jun 24 09:19:01 2024 +0800 Remove vcond{,u,eq} expanders since they will be obsolete. gcc/ChangeLog: PR target/11551

[gcc r15-1806] Move runtime check into a separate function and guard it with target ("no-avx")

2024-07-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:239ad907b1fc08874042f8bea5f61eaf3ba2877d commit r15-1806-g239ad907b1fc08874042f8bea5f61eaf3ba2877d Author: liuhongt Date: Wed Jul 3 14:47:33 2024 +0800 Move runtime check into a separate function and guard it with target ("no-avx") The patch can avoid SIGILL

[gcc r15-1836] Use __builtin_cpu_support instead of __get_cpuid_count.

2024-07-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:699087a16591adfdf21228876b6c48dbcd353faa commit r15-1836-g699087a16591adfdf21228876b6c48dbcd353faa Author: liuhongt Date: Thu Jul 4 13:57:32 2024 +0800 Use __builtin_cpu_support instead of __get_cpuid_count. gcc/testsuite/ChangeLog: PR target

[gcc r15-1888] x86: Update branch hint for Redwood Cove.

2024-07-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a910c30c7c27cd0f6d2d2694544a09fb11d611b9 commit r15-1888-ga910c30c7c27cd0f6d2d2694544a09fb11d611b9 Author: H.J. Lu Date: Tue Apr 26 11:08:55 2022 -0700 x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Refer

[gcc r15-1905] Rename __{float, double}_u to __x86_{float, double}_u to avoid pulluting the namespace.

2024-07-08 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:23ab7f632f4f5bae67fb53cf7b18fea7ba7242c4 commit r15-1905-g23ab7f632f4f5bae67fb53cf7b18fea7ba7242c4 Author: liuhongt Date: Mon Jul 8 10:35:35 2024 +0800 Rename __{float,double}_u to __x86_{float,double}_u to avoid pulluting the namespace. I have a build failu

[gcc r12-10617] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e1427b39d28f382d21e7a0ea1714b3250e0a6e5d commit r12-10617-ge1427b39d28f382d21e7a0ea1714b3250e0a6e5d Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pend

[gcc r13-8913] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9a1cdaa5e8441394d613f5f3401e7aab21efe8f0 commit r13-8913-g9a1cdaa5e8441394d613f5f3401e7aab21efe8f0 Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pendi

[gcc r14-10422] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:13bfc385b0baebd22aeabb0d90915f2e9b18febe commit r14-10422-g13bfc385b0baebd22aeabb0d90915f2e9b18febe Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pend

[gcc r15-2038] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f27bf48e0204524ead795fe618cd8b1224f72fd4 commit r15-2038-gf27bf48e0204524ead795fe618cd8b1224f72fd4 Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pendi

[gcc r14-10425] x86: Update branch hint for Redwood Cove.

2024-07-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1fff665a51e221a578a92631fc8ea62dd79fa3b6 commit r14-10425-g1fff665a51e221a578a92631fc8ea62dd79fa3b6 Author: H.J. Lu Date: Tue Apr 26 11:08:55 2022 -0700 x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Refe

[gcc r15-2127] Optimize maskstore when mask is 0 or -1 in UNSPEC_MASKMOV

2024-07-17 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:228972b2b7bf50f4776f8ccae0d7c2950827d0f1 commit r15-2127-g228972b2b7bf50f4776f8ccae0d7c2950827d0f1 Author: liuhongt Date: Tue Jul 16 15:29:01 2024 +0800 Optimize maskstore when mask is 0 or -1 in UNSPEC_MASKMOV gcc/ChangeLog: PR target/115843

[gcc r15-2217] Relax ix86_hardreg_mov_ok after split1.

2024-07-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a3f03891065cb9691f6e9cebce4d4542deb92a35 commit r15-2217-ga3f03891065cb9691f6e9cebce4d4542deb92a35 Author: liuhongt Date: Mon Jul 22 11:36:59 2024 +0800 Relax ix86_hardreg_mov_ok after split1. ix86_hardreg_mov_ok is added by r11-5066-gbe39636d9f68c4

[gcc r14-9459] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:618e34d56cc38e9c3ae95a413228068e53ed76bb commit r14-9459-g618e34d56cc38e9c3ae95a413228068e53ed76bb Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])

[gcc r13-8438] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bdbcfbfcf591381f0faf95c881e3772b56d0a404 commit r13-8438-gbdbcfbfcf591381f0faf95c881e3772b56d0a404 Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])

[gcc r12-10214] i386[stv]: Handle REG_EH_REGION note

2024-03-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a861f940efffae2782c559cd04df2d2740cd28bd commit r12-10214-ga861f940efffae2782c559cd04df2d2740cd28bd Author: liuhongt Date: Wed Mar 13 10:40:01 2024 +0800 i386[stv]: Handle REG_EH_REGION note When we split (insn 37 36 38 10 (set (reg:DI 104 [ _18 ])

[gcc r14-9512] Add missing hf/bf patterns.

2024-03-17 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:942d470a5a4fb1baeff943127a81b441dffaa543 commit r14-9512-g942d470a5a4fb1baeff943127a81b441dffaa543 Author: liuhongt Date: Fri Mar 15 10:59:10 2024 +0800 Add missing hf/bf patterns. It will be used by copysignm3/xorsignm3/lroundmn2 expanders. gcc/Chan

[gcc r14-9588] Document -fexcess-precision=16.

2024-03-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:415091f09096a0ebba1fdcd4af8c2fda24cfd411 commit r14-9588-g415091f09096a0ebba1fdcd4af8c2fda24cfd411 Author: liuhongt Date: Mon Mar 18 18:53:59 2024 +0800 Document -fexcess-precision=16. gcc/ChangeLog: PR middle-end/114347 * doc/inv

[gcc r14-9591] Fix runtime error for nonlinear iv vectorization(step_mult).

2024-03-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ac2f8c2a367151fc0410f904339c475a953cffc8 commit r14-9591-gac2f8c2a367151fc0410f904339c475a953cffc8 Author: liuhongt Date: Thu Mar 21 13:15:23 2024 +0800 Fix runtime error for nonlinear iv vectorization(step_mult). wi::from_mpz doesn't take a sign argument, we

[gcc r13-8475] Fix runtime error for nonlinear iv vectorization(step_mult).

2024-03-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:199b021a38f30b681e0dbecd2d0296beabd50b13 commit r13-8475-g199b021a38f30b681e0dbecd2d0296beabd50b13 Author: liuhongt Date: Thu Mar 21 13:15:23 2024 +0800 Fix runtime error for nonlinear iv vectorization(step_mult). wi::from_mpz doesn't take a sign argument, we

[gcc r14-9603] Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.

2024-03-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2 commit r14-9603-g9a6c7aa1b011b77fcd9b19f7b8d7ff0fc823cdb2 Author: liuhongt Date: Fri Mar 22 10:09:43 2024 +0800 Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute. Also fixed a typo in the testcase.

[gcc r13-8488] Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute.

2024-03-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e6a3d1f5bcfd954b614155d96c97bde8ac230e2e commit r13-8488-ge6a3d1f5bcfd954b614155d96c97bde8ac230e2e Author: liuhongt Date: Fri Mar 22 10:09:43 2024 +0800 Move pr114396.c from gcc.target/i386 to gcc.c-torture/execute. Also fixed a typo in the testcase.

[gcc r15-22] Adjust alternative *k to ?k for avx512 mask in zero_extend patterns

2024-04-28 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c19a674d03847b900919b97d0957c8ae5164f8f1 commit r15-22-gc19a674d03847b900919b97d0957c8ae5164f8f1 Author: liuhongt Date: Tue Apr 16 08:37:22 2024 +0800 Adjust alternative *k to ?k for avx512 mask in zero_extend patterns So when both source operand and dest ope

[gcc r15-2395] Refine constraint "Bk" to define_special_memory_constraint.

2024-07-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bc1fda00d5f20e2f3e77a50b2822562b6e0040b2 commit r15-2395-gbc1fda00d5f20e2f3e77a50b2822562b6e0040b2 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r15-2539] Fix mismatch between constraint and predicate for ashl3_doubleword.

2024-08-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:64ca25aec4939aea79bd812b089fbb666ca6f2fd commit r15-2539-g64ca25aec4939aea79bd812b089fbb666ca6f2fd Author: liuhongt Date: Fri Jul 26 09:56:03 2024 +0800 Fix mismatch between constraint and predicate for ashl3_doubleword. (insn 98 94 387 2 (parallel [

[gcc r14-10551] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-02 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a295076bee293aa3112c615f9af7a27231816a36 commit r14-10551-ga295076bee293aa3112c615f9af7a27231816a36 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r12-10668] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c94738e2462ff46f3013f6270f6a955b749d82b2 commit r12-10668-gc94738e2462ff46f3013f6270f6a955b749d82b2 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r13-8971] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:617562e4e422c7bd282960b14abfffd994445009 commit r13-8971-g617562e4e422c7bd282960b14abfffd994445009 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r15-2906] Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

2024-08-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c3c83d22d212a35cb1bfb8727477819463f0dcd8 commit r15-2906-gc3c83d22d212a35cb1bfb8727477819463f0dcd8 Author: liuhongt Date: Mon Aug 12 14:35:31 2024 +0800 Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area. gcc/C

[gcc r15-2930] Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

2024-08-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f7e672da8fc3d416a6d07eb01f3be4400ef94fac commit r15-2930-gf7e672da8fc3d416a6d07eb01f3be4400ef94fac Author: liuhongt Date: Mon Aug 12 18:24:34 2024 +0800 Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload. It results in 2 fail

[gcc r14-10588] Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

2024-08-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4e7735a8d87559bbddfe3a985786996e22241f8d commit r14-10588-g4e7735a8d87559bbddfe3a985786996e22241f8d Author: liuhongt Date: Mon Aug 12 14:35:31 2024 +0800 Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area. gcc/

[gcc r15-814] Fix typo in the testcase.

2024-05-24 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8 commit r15-814-g51f4b47c4f4f61fe31a7bd1fa80e08c2438d76a8 Author: liuhongt Date: Fri May 24 09:49:08 2024 +0800 Fix typo in the testcase. gcc/testsuite/ChangeLog: PR target/114148 * gcc.targ

[gcc r15-857] Fix predicate mismatch between vfcmaddcph's define_insn and define_expand.

2024-05-27 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c65002347e595cda8b15e59e734d209283faf2b6 commit r15-857-gc65002347e595cda8b15e59e734d209283faf2b6 Author: liuhongt Date: Tue May 28 10:32:12 2024 +0800 Fix predicate mismatch between vfcmaddcph's define_insn and define_expand. When I applied Roger's patch [1]

[gcc r15-882] Reduce cost of MEM (A + imm).

2024-05-28 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1d6199e5f8c1c08083eeb0279f71333234fe14ad commit r15-882-g1d6199e5f8c1c08083eeb0279f71333234fe14ad Author: liuhongt Date: Mon Feb 19 13:57:24 2024 +0800 Reduce cost of MEM (A + imm). For MEM, rtx_cost iterates each subrtx, and adds up the costs, so for MEM

[gcc r15-919] Don't reduce estimated unrolled size for innermost loop.

2024-05-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ef27b91b62c3aa8841c02665dffa8914c742fd37 commit r15-919-gef27b91b62c3aa8841c02665dffa8914c742fd37 Author: liuhongt Date: Tue Feb 27 15:34:57 2024 +0800 Don't reduce estimated unrolled size for innermost loop. For the innermost loop, after completely loop unro

[gcc r15-920] Support vcond_mask_qiqi and friends.

2024-05-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b6c6d5abf0d31c936f50f8f9073c5e335b9e24b7 commit r15-920-gb6c6d5abf0d31c936f50f8f9073c5e335b9e24b7 Author: liuhongt Date: Wed Feb 28 11:17:10 2024 +0800 Support vcond_mask_qiqi and friends. gcc/ChangeLog: * config/i386/sse.md (vcond_mask_): Ne

[gcc r15-932] Rename double_u with __double_u to avoid pulluting the namespace.

2024-05-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:3a873c0a7bc8183de95a6103b507101a25eed413 commit r15-932-g3a873c0a7bc8183de95a6103b507101a25eed413 Author: liuhongt Date: Thu May 30 14:15:48 2024 +0800 Rename double_u with __double_u to avoid pulluting the namespace. gcc/ChangeLog: * config/

[gcc r15-984] Add some preference for floating point rtl ifcvt when sse4.1 is not available

2024-06-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ac306de7d5100d3682eae2270995a9abbe19db38 commit r15-984-gac306de7d5100d3682eae2270995a9abbe19db38 Author: liuhongt Date: Fri May 31 14:38:07 2024 +0800 Add some preference for floating point rtl ifcvt when sse4.1 is not available W/o TARGET_SSE4_1, it takes

[gcc r15-1003] Adjust testcase for -march=cascadelake

2024-06-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4d207044195b97ecb27c72a7dc987eb8b86644a0 commit r15-1003-g4d207044195b97ecb27c72a7dc987eb8b86644a0 Author: liuhongt Date: Tue Jun 4 10:13:09 2024 +0800 Adjust testcase for -march=cascadelake gcc/testsuite/ChangeLog: PR target/115299

[gcc r15-1022] Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX.

2024-06-04 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b05288d1f1e4b632eddf8830b4369d4659f6c2ff commit r15-1022-gb05288d1f1e4b632eddf8830b4369d4659f6c2ff Author: liuhongt Date: Tue May 21 16:57:17 2024 +0800 Don't simplify NAN/INF or out-of-range constant for FIX/UNSIGNED_FIX. According to IEEE standard, for conv

[gcc r15-1047] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 commit r15-1047-g7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 Author: liuhongt Date: Fri Apr 19 10:29:34 2024 +0800 Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode. When mask is (1 << (prec - imm)
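
The simplification in the subject line above can be illustrated per element in scalar C. This is only a sketch under stated assumptions: the snippet is truncated, so the mask is assumed here to be the one that keeps the low `prec - imm` bits; the element width is assumed to be 8 bits; all function names are hypothetical; and `>>` on a negative signed value is assumed to be an arithmetic shift, as it is with GCC.

```c
#include <assert.h>
#include <stdint.h>

/* (AND (ASHIFTRT a imm) mask) for one 8-bit element.
   Assumed mask: the low (8 - imm) bits set. Valid for imm in 0..7. */
static uint8_t ashr_then_mask(int8_t a, unsigned imm)
{
    uint8_t mask = (uint8_t)((1u << (8 - imm)) - 1u);
    return (uint8_t)((a >> imm) & mask);
}

/* (LSHIFTRT a imm) for one 8-bit element. */
static uint8_t lshr(int8_t a, unsigned imm)
{
    return (uint8_t)((uint8_t)a >> imm);
}

/* Exhaustively compare both forms over every element value and shift count. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int v = -128; v <= 127; v++)
        for (unsigned imm = 0; imm < 8; imm++)
            if (ashr_then_mask((int8_t)v, imm) != lshr((int8_t)v, imm))
                mismatches++;
    return mismatches;
}
```

The mask clears exactly the bits that the arithmetic shift filled with copies of the sign bit, so the masked result matches a logical shift.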

[gcc r15-1048] Adjust rtx_cost for MEM to enable more simplication

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:961dd0d635217c703a38c48903981e0d60962546 commit r15-1048-g961dd0d635217c703a38c48903981e0d60962546 Author: liuhongt Date: Fri Apr 19 10:39:53 2024 +0800 Adjust rtx_cost for MEM to enable more simplication For CONST_VECTOR_DUPLICATE_P in constant_pool, it is j

[gcc r15-1050] Refine testcase for power10.

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fcfce55c85f842ed843cbc4aabe744c6a004dead commit r15-1050-gfcfce55c85f842ed843cbc4aabe744c6a004dead Author: liuhongt Date: Thu Jun 6 11:27:53 2024 +0800 Refine testcase for power10. For power10, there're extra 3 REG_EQUIV notes with (fix:SI. to avoid the f

[gcc r15-1088] Add additional option --param max-completely-peeled-insns=200 for power64*-*-*

2024-06-06 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 commit r15-1088-gb24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 Author: liuhongt Date: Fri Jun 7 09:29:24 2024 +0800 Add additional option --param max-completely-peeled-insns=200 for power64*-*-* gcc/testsuite/ChangeLog:

[gcc r13-8825] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e4f85ea6271a10e13c6874709a05e04ab0508fbf commit r13-8825-ge4f85ea6271a10e13c6874709a05e04ab0508fbf Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication lo

[gcc r12-10497] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5d52558a531130675329d72ca5c4713abf5bf885 commit r12-10497-g5d52558a531130675329d72ca5c4713abf5bf885 Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication l

[gcc r15-1191] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1d496d2cd1d5d8751a1637abca89339d6f9ddd3b commit r15-1191-g1d496d2cd1d5d8751a1637abca89339d6f9ddd3b Author: liuhongt Date: Tue Jun 11 10:23:27 2024 +0800 Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P The patch add extra check to make s

[gcc r15-1234] Fix ICE due to REGNO of a SUBREG.

2024-06-12 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f8bf80a4e1682b2238baad8c44939682f96b1fe0 commit r15-1234-gf8bf80a4e1682b2238baad8c44939682f96b1fe0 Author: liuhongt Date: Thu Jun 13 09:53:58 2024 +0800 Fix ICE due to REGNO of a SUBREG. Use reg_or_subregno instead. gcc/ChangeLog: PR

[gcc r15-1307] Remove one_if_conv for latest Intel processors.

2024-06-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b69efd9819f86b973d7a550e987ce455fce6d62 commit r15-1307-g8b69efd9819f86b973d7a550e987ce455fce6d62 Author: liuhongt Date: Mon Jun 3 10:38:19 2024 +0800 Remove one_if_conv for latest Intel processors. The tune is added by PR79390 for SciMark2 on Broadwell.

[gcc r15-1308] Adjust ix86_rtx_costs for pternlog_operand_p.

2024-06-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d3fae2bea034edb001cd45d1d86c5ceef146899b commit r15-1308-gd3fae2bea034edb001cd45d1d86c5ceef146899b Author: liuhongt Date: Tue Jun 11 21:22:42 2024 +0800 Adjust ix86_rtx_costs for pternlog_operand_p. r15-1100-gec985bc97a0157 improves handling of ternlog instru

[gcc r15-1563] AVX-512: Pacify -Wshift-overflow=2. [PR115409]

2024-06-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4c957d7ba84d8bbce6e778048f38e92ef71806c8 commit r15-1563-g4c957d7ba84d8bbce6e778048f38e92ef71806c8 Author: Collin Funk Date: Mon Jun 10 06:36:47 2024 + AVX-512: Pacify -Wshift-overflow=2. [PR115409] A shift of 31 on a signed int is undefined behavior. Si
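
The snippet above notes that a shift of 31 on a signed int is undefined behavior. A minimal sketch of the usual workaround follows, assuming a 32-bit `int`; the helper name `bit_mask` is hypothetical and is not taken from the GCC patch, whose actual change is not shown in this listing.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not from the GCC patch): returns a 32-bit mask
   with only bit `n` set. */
static uint32_t bit_mask(unsigned n)
{
    /* Shifting by the full width of the type or more is undefined too. */
    assert(n < 32);
    /* The operand is unsigned, so setting bit 31 is well defined.
       Writing (1 << 31) with a signed int is what -Wshift-overflow=2
       warns about. */
    return (uint32_t)1u << n;
}
```

Calling `bit_mask(31)` yields `0x80000000` without relying on signed overflow.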

[gcc r14-10782] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fe0692f689a18c432d6f59f404d4cd020cbebef2 commit r14-10782-gfe0692f689a18c432d6f59f404d4cd020cbebef2 Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come fro

[gcc r14-10783] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9b7d5ecbecfbd193899648e411f1a9b2a77471e2 commit r14-10783-g9b7d5ecbecfbd193899648e411f1a9b2a77471e2 Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Int

[gcc r13-9117] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e9eadc29c1c57cd7be9ec8de231d8fb9e8ac0c7c commit r13-9117-ge9eadc29c1c57cd7be9ec8de231d8fb9e8ac0c7c Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come from

[gcc r13-9118] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:eecd5f8ce1729a214bf0a1edfdd3ee1cf79be881 commit r13-9118-geecd5f8ce1729a214bf0a1edfdd3ee1cf79be881 Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Inte

[gcc r14-10807] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:79e7e02b7cc578d03eab2b50c029f44409ef8e26 commit r14-10807-g79e7e02b7cc578d03eab2b50c029f44409ef8e26 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines

[gcc r15-4510] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5259d3927c1c8e3a15b4b844adef59b48c241233 commit r15-4510-g5259d3927c1c8e3a15b4b844adef59b48c241233 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines v

[gcc r13-9139] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fca35b417c236e3448bc3666820fd1ba423fe6e9 commit r13-9139-gfca35b417c236e3448bc3666820fd1ba423fe6e9 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines v

[gcc r12-10778] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:91800a70a2af1349eefc5f3380be2b254b1db395 commit r12-10778-g91800a70a2af1349eefc5f3380be2b254b1db395 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines

[gcc r13-9142] [GCC13/GCC12] Fix testcase.

2024-10-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b43518a01cbbbafe042b85a48fa09a32948380a commit r13-9142-g8b43518a01cbbbafe042b85a48fa09a32948380a Author: liuhongt Date: Tue Oct 22 11:24:23 2024 +0800 [GCC13/GCC12] Fix testcase. The optimization relies on other patterns which are only available at GCC1

[gcc r12-10781] [GCC13/GCC12] Fix testcase.

2024-10-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:45bde60836d04cce4637b74ecadbb0aff90b832f commit r12-10781-g45bde60836d04cce4637b74ecadbb0aff90b832f Author: liuhongt Date: Tue Oct 22 11:24:23 2024 +0800 [GCC13/GCC12] Fix testcase. The optimization relies on other patterns which are only available at GCC

[gcc r15-4225] Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:70c3db511ba14ff5fa68cb41d0714a9fb957ea5d commit r15-4225-g70c3db511ba14ff5fa68cb41d0714a9fb957ea5d Author: liuhongt Date: Mon Mar 25 21:28:14 2024 -0700 Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization. gcc

[gcc r15-4226] Adjust testcase after relax O2 vectorization.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d5d1189c12199db79f6feb5cfcc7e6475c3a4d91 commit r15-4226-gd5d1189c12199db79f6feb5cfcc7e6475c3a4d91 Author: liuhongt Date: Thu Sep 19 13:38:34 2024 +0800 Adjust testcase after relax O2 vectorization. gcc/testsuite/ChangeLog: * gcc.dg/fstack-pr

[gcc r15-4234] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9eaecce3d8c1d9349adbf8c2cdaf8d87672ed29c commit r15-4234-g9eaecce3d8c1d9349adbf8c2cdaf8d87672ed29c Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Inte

[gcc r15-4233] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9c8cea8feb6cd54ef73113a0b74f1df7b60d09dc commit r15-4233-g9c8cea8feb6cd54ef73113a0b74f1df7b60d09dc Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come from

[gcc r15-4560] i386: Optimize EQ/NE comparison between avx512 kmask and -1.

2024-10-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ee7e77e9c121f5a6f27c92b6b24b2abf9cd66a4d commit r15-4560-gee7e77e9c121f5a6f27c92b6b24b2abf9cd66a4d Author: liuhongt Date: Mon Oct 21 02:22:08 2024 -0700 i386: Optimize EQ/NE comparison between avx512 kmask and -1. r15-974-gbf7745f887c765e06f2e75508f263debb60a

[gcc r14-10831] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b718f6ec1674c0db30f26c65b7a9215e9388dd6c commit r14-10831-gb718f6ec1674c0db30f26c65b7a9215e9388dd6c Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240

[gcc r13-9145] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2452387468423882c0732e0fad3a83e887574ccc commit r13-9145-g2452387468423882c0732e0fad3a83e887574ccc Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240
