[Bug middle-end/117838] New: IRA issues: The higher cost variable a is spilled for the lower cost variable conflict_a in improve_allocation()

2024-11-28 Thread lili.cui at intel dot com via Gcc-bugs
: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end Assignee: unassigned at gcc dot gnu.org Reporter: lili.cui at intel dot com Target Milestone: --- Created attachment 59740 --> https://gcc.gnu.org/bugzi

[Bug target/117192] [15 Regression] wrong code at -O3 with "-fno-unswitch-loops" on x86_64-linux-gnu since r15-4397-g70f59d2a1c51bd

2024-10-17 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117192 --- Comment #14 from cuilili --- (In reply to Uroš Bizjak from comment #12) > Created attachment 59373 [details] > Proposed patch > > Patch in testing. Sorry, I made a mistake here, thanks!

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-09-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #7 from cuilili --- (In reply to Martin Jambor from comment #6) > I believe this has been fixed? Yes.

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 --- Comment #3 from cuilili --- I reproduced the S1244 regression on znver3. Source code:
    for (int i = 0; i < LEN_1D-1; i++) {
        a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
        d[i] = a[i] + a[i+1];
    }
---

[Bug middle-end/110148] [14 Regression] TSVC s242 regression between g:c0df96b3cda5738afbba3a65bb054183c5cd5530 and g:e4c986fde56a6248f8fbe6cf0704e1da34b055d8

2023-06-09 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148 cuilili changed:
    What |Removed |Added
    CC   |        |lili.cui at intel dot com
--- Comment #2

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #14 from cuilili --- This regression has been fixed with the commit below and we can close this ticket. https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-06-06 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #5 from cuilili --- (In reply to Martin Jambor from comment #4) > So is this now fixed? Yes, the attachment case has been fixed.

[Bug tree-optimization/110038] [14 Regression] ICE: in rewrite_expr_tree_parallel, at tree-ssa-reassoc.cc:5522 with --param=tree-reassoc-width=2147483647

2023-05-30 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Probably best to limit the values to reassoc-width by adding the > appropriate IntegerRange attribute in params.opt > > IntegerRange(0, 256) > > maybe? "rewrite_ex

[Bug target/104271] [12/13 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-11-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #12 from cuilili --- This regression is caused by a store forwarding issue. By inlining, we eliminate the two redundant pairs of loads and stores that have the store forwarding issue. This regression has been fixed by https://gcc.gnu.

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 105493, which changed state. Bug 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718 https://gcc.gnu.org/bugzilla/show_bug.cgi?

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-07-26 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 cuilili changed:
    What       |Removed |Added
    Resolution |---     |FIXED
    Status     |NEW

[Bug target/105493] [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493 --- Comment #2 from cuilili --- (In reply to Richard Biener from comment #1) > Martin is currently re-benchmarking GCC 12 on AMD, so let's see if there's > anything left on those. AMD may not have this issue, Richard fixed AMD regression with t

[Bug target/105493] New: [12/13 Regression] x86_64 538.imagick_r 6% regressions and 2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718

2022-05-05 Thread lili.cui at intel dot com via Gcc-bugs
Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: lili.cui at intel dot com Target Milestone: --- Similar issue with https://gcc.gnu.org/bugzilla

[Bug target/104723] [12 regression] Redundant usage of stack

2022-04-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #11 from cuilili --- (In reply to Jakub Jelinek from comment #10) > And for the backend, the question is how big the penalty for the overlapping > store is compared to doing multiple non-overlapping stores. Say for those > 49 bytes

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-04-15 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #9 from cuilili --- I really appreciate your reply. I debugged the SRA pass with the small testcase and found that SRA does not handle this situation. SRA cannot split callee's first parameter for "Do not decompose non-BLKmode paramet

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-28 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #7 from cuilili --- Created attachment 52706 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit Add a heuristic to eliminate redundant loads and stores in the inline pass. Hi Richard, Could you help take a look? This

[Bug target/104271] [12 Regression] 538.imagick_r run-time at -Ofast -march=native regressed by 26% on Intel Cascade Lake server CPU

2022-03-24 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271 --- Comment #6 from cuilili --- I created a patch to fix this regression. The patch is under performance testing. I will send it out later.

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-02 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #9 from cuilili --- (In reply to cuilili from comment #3) > (In reply to Hongtao.liu from comment #1) > > STF issue here? > correct comment #3 I used perf to collect the "ld_blocks.store_forward" event for those two test cases, stl

[Bug target/104723] [12 regression] Redundant usage of stack

2022-03-01 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723 --- Comment #3 from cuilili --- (In reply to Hongtao.liu from comment #1) > STF issue here? Yes. Since "YMMWORD PTR [rsp-72]" crosses a cache line, there is an STLF issue here. vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from [rsp-7
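The cache-line reasoning in the comment above can be illustrated with a small sketch. This is not code from the bug report: the helper name crosses_cache_line and the 64-byte line size are assumptions made for illustration, and the real value of rsp in the test case is not known from the snippet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines, as on common x86-64 parts.  */
#define CACHE_LINE_SIZE 64u

/* Hypothetical helper: returns true if the access [addr, addr + size)
   touches more than one cache line.  */
bool crosses_cache_line(uintptr_t addr, size_t size)
{
    if (size == 0)
        return false;
    uintptr_t first_line = addr / CACHE_LINE_SIZE;
    uintptr_t last_line = (addr + size - 1) / CACHE_LINE_SIZE;
    return first_line != last_line;
}
```

For example, if rsp happened to be 64-byte aligned (an assumption), rsp-72 would sit at offset 56 within its cache line, so a 32-byte store starting there would spill into the next line.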

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-27 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #28 from cuilili --- (In reply to H.J. Lu from comment #25) > Can this be mitigated by removing redundant load and store? Yes, inlining say_sphere can remove the redundant loads and stores. O3 does the inlining, but O2 is more sensitive to c

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 --- Comment #24 from cuilili --- (In reply to cuilili from comment #23) > (In reply to Richard Biener from comment #17) > > I do wonder though how CLX is fine with such access pattern ;) (did you > > test > > with just -O2?) > Sorry, correct

[Bug target/101908] [12 regression] cray regression with -O2 -ftree-slp-vectorize compared to -O2

2022-02-25 Thread lili.cui at intel dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908 cuilili changed:
    What |Removed |Added
    CC   |        |lili.cui at intel dot com
--- Comment #23

[Bug target/95621] New: Add CET(PTA_SHSTK) to march=tigerlake

2020-06-09 Thread lili.cui at intel dot com
Assignee: unassigned at gcc dot gnu.org Reporter: lili.cui at intel dot com Target Milestone: --- Intel Tiger Lake needs CET support, so add PTA_SHSTK to march=tigerlake.

[Bug target/95525] Bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG

2020-06-04 Thread lili.cui at intel dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95525 cuilili changed:
    What       |Removed |Added
    Resolution |---     |FIXED
    Status     |NEW

[Bug target/95525] New: Bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG

2020-06-03 Thread lili.cui at intel dot com
Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: lili.cui at intel dot com Target Milestone: --- In GCC trunk, there is a bitmask conflict between PTA_AVX512VP2INTERSECT and PTA_WAITPKG in gcc/config/i386/i386.h const wide_int_bitmask
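To show what a bitmask conflict of this kind means in general, here is a minimal sketch. It does not reproduce GCC's wide_int_bitmask or the actual PTA_* definitions: the helpers make_flag, flags_conflict and has_flag are hypothetical, and a plain 64-bit integer stands in for the real type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a one-bit feature flag.  GCC itself uses
   wide_int_bitmask, which is not modelled here.  */
uint64_t make_flag(unsigned bit)
{
    assert(bit < 64);
    return (uint64_t)1 << bit;
}

/* Two flags conflict when they share at least one bit.  */
bool flags_conflict(uint64_t a, uint64_t b)
{
    return (a & b) != 0;
}

/* True if `set` contains every bit of `flag`.  */
bool has_flag(uint64_t set, uint64_t flag)
{
    return (set & flag) == flag;
}
```

If two distinct features were both defined as make_flag(n) for the same n, a CPU description that enables only one of them would also pass has_flag for the other, which is the kind of accidental enablement a conflict like this can cause.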