[Bug tree-optimization/99504] New: Missing memmove detection

2021-03-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99504 Bug ID: 99504 Summary: Missing memmove detection Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization

[Bug target/99704] volatile is needed on asm statements in

2021-03-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99704 --- Comment #2 from Hongtao.liu --- How should we handle -march=native on hybrid core?

[Bug target/99704] volatile is needed on asm statements in

2021-03-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99704 --- Comment #3 from Hongtao.liu --- (In reply to Hongtao.liu from comment #2) > How should we handle -march=native on hybrid core? Never mind, I assume you mean the parts below are different on hybrid core 02H EAX Cache and TLB Information

[Bug target/96858] Many i386 testcases failed with different configured gcc on different hosts.

2021-03-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96858 Hongtao.liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/96244] Redudant mask load generated

2021-03-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96244 Hongtao.liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/99744] __attribute__ ((target("general-regs-only"))) doesn't work with GPR intrinsics

2021-03-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99744 --- Comment #2 from Hongtao.liu --- in ix86_can_inline_p static bool ix86_can_inline_p (tree caller, tree callee) { tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); ...

[Bug target/99754] [sse2] new _mm_loadu_si16 and _mm_loadu_si32 implemented incorrectly

2021-03-25 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99754 --- Comment #1 from Hongtao.liu --- Yes, _mm_set_epi32 will reverse the order of its parameters. Could you send out a patch for review?
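
The remark about parameter order can be illustrated with a small sketch. It assumes an x86 target with SSE2 and uses only the documented `_mm_set_epi32`, `_mm_setr_epi32` and `_mm_storeu_si128` intrinsics; the helper names `set_epi32_lane` and `setr_epi32_lane` are hypothetical and are not taken from the bug report or from any patch.

```c
#include <emmintrin.h>

/* Hypothetical helper: build a vector with _mm_set_epi32 (3, 2, 1, 0)
   and return lane i, where lane 0 is the lowest 32 bits.  */
int
set_epi32_lane (int i)
{
  int out[4];
  /* Arguments are given highest lane first, so the last argument
     lands in lane 0.  */
  __m128i v = _mm_set_epi32 (3, 2, 1, 0);
  _mm_storeu_si128 ((__m128i *) out, v);
  return out[i];
}

/* Hypothetical helper: same arguments, but with _mm_setr_epi32,
   which takes them lowest lane first.  */
int
setr_epi32_lane (int i)
{
  int out[4];
  __m128i v = _mm_setr_epi32 (3, 2, 1, 0);
  _mm_storeu_si128 ((__m128i *) out, v);
  return out[i];
}
```

With these definitions, `set_epi32_lane (0)` yields the last argument and `setr_epi32_lane (0)` yields the first, which is the kind of ordering mix-up an intrinsic implemented on top of `_mm_set_epi32` has to account for.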

[Bug target/99881] New: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX

2021-04-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881 Bug ID: 99881 Summary: Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Pri

[Bug target/99908] SIMD: negating logical + if_else has a suboptimal codegen.

2021-04-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99908 --- Comment #2 from Hongtao.liu --- I'm testing @@ -17759,6 +17759,35 @@ (define_insn "_pblendvb" (set_attr "btver2_decode" "vector,vector,vector") (set_attr "mode" "")]) +(define_split + [(set (match_operand:VI1_AVX2 0 "register_opera

[Bug target/99881] Regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX

2021-04-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99881 --- Comment #4 from Hongtao.liu --- (In reply to Richard Biener from comment #3) > But 2 element construction _should_ be cheap. What is missing is the move > cost from GPR to XMM regs (but we do not have a good idea whether the sources > are me

[Bug target/99908] SIMD: negating logical + if_else has a suboptimal codegen.

2021-04-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99908 --- Comment #3 from Hongtao.liu --- Created attachment 50517 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50517&action=edit tested patch waiting for GCC12.

[Bug target/99941] m_ALDERLAKE is missing from m_CORE_AVX2

2021-04-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99941 --- Comment #1 from Hongtao.liu --- If we were more concerned about the performance of the big core, the answer would be yes.

[Bug target/99941] m_ALDERLAKE is missing from m_CORE_AVX2

2021-04-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99941 --- Comment #2 from Hongtao.liu --- (In reply to H.J. Lu from comment #0) > i386-options.c has > > #define m_ALDERLAKE (HOST_WIDE_INT_1U< #define m_CORE_AVX512 (m_SKYLAKE_AVX512 | m_CANNONLAKE \ >| m_ICELAKE_CLIENT | m_IC

[Bug rtl-optimization/99930] Failure to optimize floating point -abs(x) in nontrivial code at -O2/3

2021-04-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930 --- Comment #7 from Hongtao.liu --- I'm testing 1 file changed, 30 insertions(+) gcc/combine.c | 30 ++ modified gcc/combine.c @@ -1811,6 +1811,33 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set, void *dat

[Bug rtl-optimization/99930] Failure to optimize floating point -abs(x) in nontrivial code at -O2/3

2021-04-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99930 --- Comment #9 from Hongtao.liu --- (In reply to Segher Boessenkool from comment #8) > That patch is no good. The combination is not allowed because it is not > known what the "use"s are *for*. Checking if something is from the constant > pools

[Bug target/100009] [9 Regression] -march=native doesn't work on tigerlake

2021-04-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100009 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #3

[Bug target/100009] [9 Regression] -march=native doesn't work on tigerlake

2021-04-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100009 --- Comment #4 from Hongtao.liu --- > Oops, > will backport r10-2664-ga9fcfec30f70c30883f53d4b1bd533fbea0e9fb2 (tigerlake > part) to gcc9. PTA_AVX512VP2INTERSECT is enabled in GCC10, don't plan to backport to gcc9, so in GCC9 -march=native wou

[Bug tree-optimization/100076] New: eembc/automotive/basefp01 has 30.3% regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX

2021-04-13 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076 Bug ID: 100076 Summary: eembc/automotive/basefp01 has 30.3% regression compare -O2 -ftree-vectorize with -O2 on SKX/CLX Product: gcc Version: 11.0 Status: UNCONFIRMED

[Bug tree-optimization/100076] eembc/automotive/basefp01 has 30.3% regression compare -O2 -ftree-vectorize with -O2 on CLX/Znver3

2021-04-13 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076 --- Comment #2 from Hongtao.liu --- (In reply to H.J. Lu from comment #1) > Is -O3 slower than -O3 -fno-tree-vectorize? If not, why? For this case O3 is OK, because O3 will enable pass_cunroll to completely unroll loop1/loop2/loop3, and later

[Bug tree-optimization/100076] eembc/automotive/basefp01 has 30.3% regression compare -O2 -ftree-vectorize with -O2 on CLX/Znver3

2021-04-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076 --- Comment #4 from Hongtao.liu --- Created attachment 50590 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50590&action=edit eembc_automotive_basefp01.cpp

[Bug tree-optimization/100089] New: [11 Performance regression ] 30% for denbench/mp2decoddata2 with -O3

2021-04-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100089 Bug ID: 100089 Summary: [11 Performance regression ] 30% for denbench/mp2decoddata2 with -O3 Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug target/100088] ymm store split into two xmm stores

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #2

[Bug target/100088] ymm store split into two xmm stores

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088 --- Comment #3 from Hongtao.liu --- (In reply to Hongtao.liu from comment #2) > > > > This issue does not exist for sse or avx512f. Setting `-march=haswell` or > > `-mtune=haswell` on the command line also seems to fix this but neither of > > t

[Bug target/100088] ymm store split into two xmm stores

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100088 --- Comment #4 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3) > (In reply to Hongtao.liu from comment #2) > > > > > > This issue does not exist for sse or avx512f. Setting `-march=haswell` or > > > `-mtune=haswell` on the comman

[Bug target/100093] New: different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093 Bug ID: 100093 Summary: different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”) Product: gcc Version: 11.0 Status: UNCONFIRMED Severit

[Bug tree-optimization/100076] eembc/automotive/basefp01 has 30.3% regression compare -O2 -ftree-vectorize with -O2 on CLX/Znver3

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100076 --- Comment #6 from Hongtao.liu --- (In reply to Richard Biener from comment #5) > Note even when avoiding the STLF hit the vectorized version is slower. > You can use -mtune-ctl=^sse_unaligned_load_optimal to force loading > the lower/upper hal

[Bug target/100093] different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093 --- Comment #1 from Hongtao.liu --- When ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD/STORE_OPTIMAL] is false, GCC goes to set up the bit MASK_AVX256_SPLIT_UNALIGNED_LOAD/STORE, but when ix86_tune_features[X86_TUNE_AVX256_UNALIGNED_LOAD/STO

[Bug target/100009] [9 Regression] -march=native doesn't work on tigerlake

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100009 --- Comment #5 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3) > > Response from Jim Wilson: > > Looks like a bug in gcc-9. tigerlake was added to > > gcc/config/i386/driver-i386.c but not to the arch_names_table in i386.c. I >

[Bug target/100093] different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)

2021-04-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093 --- Comment #2 from Hongtao.liu --- Created attachment 50611 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=50611&action=edit tested patch waiting for GCC12. [i386] MASK_AVX256_SPLIT_UNALIGNED_STORE/LOAD should be cleared in opts->x_targ

[Bug target/98348] [10 Regression] GCC 10.2 AVX512 Mask regression from GCC 9

2021-04-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98348 --- Comment #21 from Hongtao.liu --- (In reply to Dávid Bolvanský from comment #20) > Some small regression (missed opportunity to use vptestnmd): > > Current trunk > > compare(unsigned int __vector(16)): > vpxor xmm1, xmm1, xmm1 > vpcmpd k

[Bug tree-optimization/100173] New: telecom/viterb00data_1 has 16.92% regression compared O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2 on CLX/ICX, 9% regression on znver3

2021-04-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100173 Bug ID: 100173 Summary: telecom/viterb00data_1 has 16.92% regression compared O2 -ftree-vectorize -fvect-cost-model=very-cheap to O2 on CLX/ICX, 9% regression on znver3 Pr

[Bug target/94680] Missed optimization with __builtin_shuffle and zero vector

2021-04-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94680 --- Comment #4 from Hongtao.liu --- Let me do this.

[Bug target/100093] different behavior between -mtune=cpu_type and target_attribute (“arch=cputype”)

2021-04-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100093 Hongtao.liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/98911] Add folding and remove expanders for x86 *pcmp{et,gt}* builtins

2021-04-24 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98911 --- Comment #2 from Hongtao.liu --- Fixed in GCC12.

[Bug rtl-optimization/100253] [10/11/12 Regression] wrong code with -O2 -fno-tree-bit-ccp -ftree-slp-vectorize (unaligned movdqa)

2021-04-25 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100253 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1

[Bug target/82735] _mm256_zeroupper does not invalidate previously computed registers

2021-04-25 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82735 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #7 f

[Bug rtl-optimization/97249] Missing vec_select and subreg optimization

2020-10-11 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249 --- Comment #4 from Hongtao.liu --- (In reply to Richard Biener from comment #3) > Guess you want to figure what built the (vec_select:V8QI (V16QI)) and if > it was appropriately simplified (and simplify_rtx would handle this case). > In any case

[Bug target/97286] GCC sometimes uses an extra xmm register for the destination of _mm_blend_ps

2020-10-11 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97286 --- Comment #2 from Hongtao.liu --- Seems similar issue as PR97366?

[Bug target/96849] [11 Regression] ICE: in extract_insn, at recog.c:2294 (error: unrecognizable insn) since r11-2623

2020-10-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96849 --- Comment #4 from Hongtao.liu --- Fixed in GCC11 by https://gcc.gnu.org/g:1aa71af09350b9ff4d2fad88a440b682545682ec commit r11-2947-g1aa71af09350b9ff4d2fad88a440b682545682ec Author: liuhongt Date: Tue Aug 11 11:05:40 2020 +0800 Refine

[Bug rtl-optimization/97249] Missing vec_select and subreg optimization

2020-10-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249 --- Comment #6 from Hongtao.liu --- We already have the code below in simplify-rtx.c; it seems we can also handle this situation here. --- case VEC_SELECT: if (!VECTOR_MODE_P (mode)) { gcc_assert (VECTOR

[Bug rtl-optimization/97387] we are near 2021, add carry intrinsic still does the wrong thing and generates silly code.

2020-10-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97387 --- Comment #2 from Hongtao.liu --- Same issue as PR93990?

[Bug rtl-optimization/97249] Missing vec_select and subreg optimization

2020-10-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249 --- Comment #7 from Hongtao.liu --- I'm testing --- diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c index 869f0d11b2e..9c397157f28 100644 --- a/gcc/simplify-rtx.c +++ b/gcc/simplify-rtx.c @@ -4170,6 +4170,33 @@ simplify_binary_operation_1 (e

[Bug target/97194] optimize vector element set/extract at variable position

2020-10-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #15 from Hongtao.liu --- I'm working on adding the expander and have encountered a problem: for V32HI vec_set with a constant index, the expander exists under TARGET_AVX512F, but for a variable index, the expander should exist under TARGET_AV

[Bug target/97194] optimize vector element set/extract at variable position

2020-10-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #16 from Hongtao.liu --- (In reply to Hongtao.liu from comment #15) > I'm working on add the expander, i encounter a problem. > > for V32HI vec_set with constant index, the expander existed under > TARGET_AVX512F, but for variable in

[Bug target/97194] optimize vector element set/extract at variable position

2020-10-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #19 from Hongtao.liu --- (In reply to Richard Biener from comment #17) > (In reply to Hongtao.liu from comment #15) > > I'm working on add the expander, i encounter a problem. > > > > for V32HI vec_set with constant index, the expand

[Bug target/97194] optimize vector element set/extract at variable position

2020-10-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #20 from Hongtao.liu --- (In reply to Richard Biener from comment #18) > (In reply to Hongtao.liu from comment #16) > > (In reply to Hongtao.liu from comment #15) > > > I'm working on add the expander, i encounter a problem. > > > >

[Bug target/97366] [8/9/10/11 Regression] Redundant load with SSE/AVX vector intrinsics

2020-10-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97366 --- Comment #6 from Hongtao.liu --- (In reply to Alexander Monakov from comment #5) > afaict LRA is just following IRA decisions, and IRA allocates that pseudo to > memory due to costs. > > Not sure where strange cost is coming from, but it depe

[Bug target/97506] [11 Regression] ICE: in extract_insn, at recog.c:2294 (unrecognizable insn) with -mavx512vbmi -mavx512vl

2020-10-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #2 f

[Bug target/97506] [11 Regression] ICE: in extract_insn, at recog.c:2294 (unrecognizable insn) with -mavx512vbmi -mavx512vl

2020-10-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506 --- Comment #5 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #4) > Yeah. On the other side, they don't need to try hard to optimize it because > normally it should be simplified already. So, e.g. the above patch is fine > if it w

[Bug target/97506] [11 Regression] ICE: in extract_insn, at recog.c:2294 (unrecognizable insn) with -mavx512vbmi -mavx512vl

2020-10-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97506 --- Comment #7 from Hongtao.liu --- Should I backport to GCC10? Although it's only exposed in GCC11, it's still a potential bug in GCC10.

[Bug rtl-optimization/97249] Missing vec_select and subreg optimization

2020-10-21 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97249 Hongtao.liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #6 f

[Bug target/97521] [11 Regression] wrong code with -mno-sse2 since r11-3394

2020-10-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97521 --- Comment #10 from Hongtao.liu --- Speaking of how to represent the V*BImode constants, I think we need to extend attribute vector_size to handle something like --- typedef bool v8bi __attribute__ ((vector_size (1))); --- currently there wo
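
For context, a minimal sketch of how the `vector_size` attribute is used today with an ordinary integer element type, assuming GCC (or a compatible compiler) with the generic vector extensions. The names `v4si`, `add4`, `lane_sum` and `demo_sum` are illustrative only; this does not show the `bool` vector extension the comment proposes.

```c
/* Four 32-bit ints packed into one 16-byte vector (GCC vector extension).  */
typedef int v4si __attribute__ ((vector_size (16)));

/* Element-wise addition; the compiler picks suitable instructions.  */
v4si
add4 (v4si a, v4si b)
{
  return a + b;
}

/* Sum the four lanes using subscripting on the vector type.  */
int
lane_sum (v4si v)
{
  return v[0] + v[1] + v[2] + v[3];
}

/* Small driver: (1+10) + (2+20) + (3+30) + (4+40) = 110.  */
int
demo_sum (void)
{
  v4si a = { 1, 2, 3, 4 };
  v4si b = { 10, 20, 30, 40 };
  return lane_sum (add4 (a, b));
}
```

Here `vector_size` is given in bytes, which is why a one-byte vector of eight single-bit elements, as in the quoted typedef, does not fit the existing scheme.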

[Bug target/97532] [11 Regression] Error: insn does not satisfy its constraints, internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-10-22 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #6 f

[Bug target/97532] [11 Regression] Error: insn does not satisfy its constraints, internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-10-23 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 --- Comment #8 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #7) > memory_operand calls general_operand which for MEM does: > /* Use the mem's mode, since it will be reloaded thus. LRA can > generate move insn with

[Bug rtl-optimization/97540] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-4202-g4de7b010038933dd

2020-10-23 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540 --- Comment #2 from Hongtao.liu --- /* For special_memory_operand, there could be a memory operand inside, and it would cause a mismatch for constraint_satisfied_p. */ if (UNARY_P (op) && op == ext

[Bug target/97532] [11 Regression] Error: insn does not satisfy its constraints, internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-10-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 --- Comment #10 from Hongtao.liu --- Created attachment 49444 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49444&action=edit Fix invalid address for special memory constraint I'm testing this patch.

[Bug rtl-optimization/97540] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-4202-g4de7b010038933dd

2020-10-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540 --- Comment #4 from Hongtao.liu --- Created attachment 49445 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49445&action=edit Don't extract memory from operand for normal memory constraint. I'm testing this patch.

[Bug target/97606] internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-10-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97606 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1 f

[Bug tree-optimization/97603] Failure to optimize out compare into reuse of subtraction result

2020-10-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603 --- Comment #1 from Hongtao.liu --- Shouldn't it be marked as target issue for x86?

[Bug tree-optimization/97603] Failure to optimize out compare into reuse of subtraction result

2020-10-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97603 --- Comment #2 from Hongtao.liu --- (In reply to Hongtao.liu from comment #1) > Shouldn't it be marked as target issue for x86? Or do you mean that the middle-end should transform the code to int g(); int f(int a, int b) { int c = a - b; if (c)

[Bug rtl-optimization/97540] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-4202-g4de7b010038933dd

2020-10-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540 --- Comment #5 from Hongtao.liu --- The patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557143.html

[Bug target/97532] [11 Regression] Error: insn does not satisfy its constraints, internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-10-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 --- Comment #13 from Hongtao.liu --- (In reply to Tom de Vries from comment #12) > (In reply to Hongtao.liu from comment #10) > > Created attachment 49444 [details] > > Fix invalid address for special memory constraint > > > > I'm testing this p

[Bug inline-asm/97667] [11 Regression] a bug in asm_operand_ok() recog.c:1801

2020-11-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97667 --- Comment #3 from Hongtao.liu --- (In reply to Martin Liška from comment #2) > Likely dup of PR97540. Yes, it should be.

[Bug rtl-optimization/97540] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-4202-g4de7b010038933dd

2020-11-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540 --- Comment #7 from Hongtao.liu --- (In reply to Hongtao.liu from comment #5) > The patch is posted at > https://gcc.gnu.org/pipermail/gcc-patches/2020-October/557143.html With the above patch and https://gcc.gnu.org/pipermail/gcc-patches/2020-Octob

[Bug target/97685] New: -march=tremont should enable MOVDIRI/MOVDIR64B/CLDEMOTE/SGX/WAITPKG.

2020-11-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685 Bug ID: 97685 Summary: -march=tremont should enable MOVDIRI/MOVDIR64B/CLDEMOTE/SGX/WAITPKG. Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug target/97642] Incorrect replacement of vmovdqu32 with vpblendd can cause fault

2020-11-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97642 --- Comment #3 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #1) > The problem is that in the RTL representation there is nothing that would > tell cse, forward propagation or combiner etc. not to optimize the > (insn 7 6 8 2 (set

[Bug rtl-optimization/97540] [11 Regression] ICE in lra_set_insn_recog_data, at lra.c:1004 since r11-4202-g4de7b010038933dd

2020-11-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97540 --- Comment #9 from Hongtao.liu --- Fixed in GCC11.

[Bug target/97532] [11 Regression] Error: insn does not satisfy its constraints, internal compiler error: in extract_constrain_insn, at recog.c:2196

2020-11-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97532 --- Comment #15 from Hongtao.liu --- Fixed in GCC11.

[Bug target/97685] -march=tremont should enable MOVDIRI/MOVDIR64B/CLDEMOTE/SGX/WAITPKG.

2020-11-04 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685 --- Comment #1 from Hongtao.liu --- HRESET wouldn't be supported on SPR

[Bug libstdc++/97759] Could std::has_single_bit be faster?

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759 --- Comment #3 from Hongtao.liu --- for testcase: --- #include bool is_power2_popcnt (int a) { return __builtin_popcount (a) == 1; } bool is_power2_arithmetic (int a) { return !(a & (a - 1)) && a; } --- gcc -O2 -mavx2 -S got --- .
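
The two forms compared in the testcase can be checked against each other with a short sketch. It assumes GCC or Clang for `__builtin_popcount`, and it uses `unsigned` rather than `int` so that `a - 1` is well defined for every input; the function names are hypothetical variants of the ones in the comment.

```c
#include <stdbool.h>

/* Power-of-two test via population count: exactly one bit set.  */
bool
is_pow2_popcount (unsigned a)
{
  return __builtin_popcount (a) == 1;
}

/* Power-of-two test via arithmetic: clearing the lowest set bit
   leaves zero, and the value itself is non-zero.  */
bool
is_pow2_arith (unsigned a)
{
  return !(a & (a - 1)) && a;
}

/* Count inputs in [0, limit) on which the two tests disagree.  */
unsigned
count_disagreements (unsigned limit)
{
  unsigned bad = 0;
  for (unsigned a = 0; a < limit; a++)
    if (is_pow2_popcount (a) != is_pow2_arith (a))
      bad++;
  return bad;
}
```

Both functions give the same answer for every input, so which one is faster depends on the code the target generates for each, which is the question the bug report raises.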

[Bug target/97685] -march=tremont should enable MOVDIRI/MOVDIR64B/CLDEMOTE/SGX/WAITPKG.

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97685 Hongtao.liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/97770] New: Missing vectorization for vpopcnt

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 Bug ID: 97770 Summary: Missing vectorization for vpopcnt Product: gcc Version: 11.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal Priority:

[Bug target/97770] Missing vectorization for vpopcnt

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #1 from Hongtao.liu --- On the target side, we need to add an expander for popcountm2 where m is a vector mode.

[Bug libstdc++/97759] Could std::has_single_bit be faster?

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97759 --- Comment #11 from Hongtao.liu --- (In reply to gcc-bugs from comment #10) > And maybe a related question: > > I know that an arithmetic implementation might auto-vectorize, but would a > popcount implementation do that too? > > Since AVX512_

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #2 from Hongtao.liu --- After adding the expander, the loop is successfully vectorized. --- diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index b153a87fb98..e8159997c40 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #3 from Hongtao.liu --- > But for vector byte/word/quadword, vectorizer still use vpopcntd, but not > vpopcnt{b,w,q}, missing corresponding ifn? We don't have __builtin_popcount{w,b}, but we have __builtin_popcountl. for testcase --

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #5 from Hongtao.liu --- (In reply to Richard Biener from comment #4) > What's missing is middle-end folding support to narrow popcount to the > appropriate internal function call with byte/half-word width when target > support > is av

[Bug target/97779] Newest releases/gcc-10 cannot build because lack of PTA_CLDEMOTE

2020-11-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1 f

[Bug target/97779] Newest releases/gcc-10 cannot build because lack of PTA_CLDEMOTE

2020-11-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779 --- Comment #2 from Hongtao.liu --- patch posted at https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558578.html

[Bug target/97779] [9/10 Regression] Newest releases/gcc-10 cannot build because lack of PTA_CLDEMOTE

2020-11-10 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97779 --- Comment #4 from Hongtao.liu --- Fixed in GCC10,GCC9.

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-11 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #9 from Hongtao.liu --- > I guess that the vectorized popcount IFN is defined to be VnDI -> VnDI > but we want to have VnSImode results. This means the instruction is > wrongly modeled in vectorized form? > Yes, because we have __

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-11-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #11 from Hongtao.liu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558777.html

[Bug middle-end/92492] AVX512: Missed vectorization opportunity

2020-11-12 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92492 --- Comment #7 from Hongtao.liu --- I notice that TARGET_VECTORIZE_RELATED_MODE has been added and can be used to handle the conversion; I'm working on this.

[Bug target/97194] optimize vector element set/extract at variable position

2020-11-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97194 --- Comment #23 from Hongtao.liu --- Fixed in GCC11; it may need a small adjustment for the modeless operand (the variable index), as discussed in https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559213.html

[Bug target/97873] Failure to optimize abs optimally (at least one completely useless instruction on x86)

2020-11-17 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97873 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #3 f

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-18 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #3 from Hongtao.liu --- This problem is very similar to the one pass_rpad deals with.

[Bug target/97891] [x86] Consider using registers on large initializations

2020-11-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97891 --- Comment #4 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3) > This problem is very similar to the one pass_rpad deals with. We already have mov_xor for mov $0 to reg, so we only need to handle mov $0 to mem. and size for: xor

[Bug target/96906] Failure to optimize __builtin_ia32_psubusw128 compared to 0 to __builtin_ia32_pminuw128 compared to operand

2020-11-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906 --- Comment #7 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #6) > Implemented now for non-AVX512*. Hongtao, do you think you could have a > look at the avx512{bw,vl}/avx512bw splitter(s)? Yes, I'll do it. Thanks for the patch.

[Bug target/97770] [ICELAKE]Missing vectorization for vpopcnt

2020-12-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97770 --- Comment #13 from Hongtao.liu --- (In reply to Richard Biener from comment #10) > Hmm, but > > DEF_INTERNAL_INT_FN (POPCOUNT, ECF_CONST | ECF_NOTHROW, popcount, unary) > > so there's clearly a mismatch between either the vectorizers interpre

[Bug target/97642] Incorrect replacement of vmovdqu32 with vpblendd can cause fault

2020-12-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97642 --- Comment #6 from Hongtao.liu --- Fixed in GCC11, GCC10 is fine, no need to backport.

[Bug target/96906] Failure to optimize __builtin_ia32_psubusw128 compared to 0 to __builtin_ia32_pminuw128 compared to operand

2020-12-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96906 --- Comment #9 from Hongtao.liu --- Fixed in GCC11.

[Bug target/98114] New: [11 regression] FAIL: gcc.target/i386/avx512vl-vandnpd-2.c execution test caused by r11-5391

2020-12-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114 Bug ID: 98114 Summary: [11 regression] FAIL: gcc.target/i386/avx512vl-vandnpd-2.c execution test caused by r11-5391 Product: gcc Version: 11.0 Status:

[Bug target/98114] [11 regression] FAIL: gcc.target/i386/avx512vl-vandnpd-2.c execution test caused by r11-5391

2020-12-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114 --- Comment #1 from Hongtao.liu --- Looking at the testcase, there are pointer type conversions: void CALC (double *s1, double *s2, double *r) { int i; long long tmp; for (i = 0; i < SIZE; i++) { tmp = (~(*(long long *) &s1[i])) & (*

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2020-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #3 from Hongtao.liu --- ;; _3 = __builtin_ia32_shufps (b_2(D), b_2(D), 0); (insn 7 6 8 (set (reg:V4SF 88) (reg/v:V4SF 86 [ b ])) "./gcc/include/xmmintrin.h":746:19 -1 (nil)) (insn 8 7 9 (set (reg:V4SF 89) (reg/v

[Bug testsuite/98114] [11 regression] FAIL: gcc.target/i386/avx512vl-vandnpd-2.c execution test caused by r11-5391

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98114 Hongtao.liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug rtl-optimization/98212] X86 unoptimal code for float equallity comparison followed by jump

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98212 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1 f

[Bug target/98218] New: [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander for 64bit vector

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98218 Bug ID: 98218 Summary: [TARGET_MMX_WITH_SSE] Miss vec_cmpmn/vcondmn expander for 64bit vector Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal

[Bug target/98219] New: User-interrupt return pop corrupt RIP

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98219 Bug ID: 98219 Summary: User-interrupt return pop corrupt RIP Product: gcc Version: 11.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target

[Bug target/98219] User-interrupt return pop corrupt RIP

2020-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98219 --- Comment #2 from Hongtao.liu --- (In reply to H.J. Lu from comment #1) > Created attachment 49723 [details] > A patch > > Hongtao, can you take it over? I'll validate it.
