Version: 15.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: lili.cui at intel dot com
Target Milestone: ---
Created attachment 59740
--> https://gcc.gnu.org/bugzi
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117192
--- Comment #14 from cuilili ---
(In reply to Uroš Bizjak from comment #12)
> Created attachment 59373 [details]
> Proposed patch
>
> Patch in testing.
Sorry, I made a mistake here, thanks!
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
--- Comment #7 from cuilili ---
(In reply to Martin Jambor from comment #6)
> I believe this has been fixed?
Yes.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
--- Comment #3 from cuilili ---
I reproduced the S1244 regression on znver3.
Source code:
for (int i = 0; i < LEN_1D-1; i++)
{
    a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
    d[i] = a[i] + a[i+1];
}
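
For readers who want to compile the loop in isolation, a minimal self-contained sketch follows. The element type (float), the small LEN_1D value, and the helper names s1244_init and s1244_kernel are assumptions made for illustration; they are not taken from the bug report. Note that each iteration reads a[i+1] before the next iteration overwrites it.

```c
#include <assert.h>

#define LEN_1D 8 /* assumed small size, for illustration only */

/* Assumed element type: float. */
static float a[LEN_1D], b[LEN_1D], c[LEN_1D], d[LEN_1D];

/* Hypothetical helper: fill b and c with constants, clear a and d. */
static void s1244_init(float bv, float cv)
{
    for (int i = 0; i < LEN_1D; i++) {
        a[i] = 0.0f;
        b[i] = bv;
        c[i] = cv;
        d[i] = 0.0f;
    }
}

/* The loop from the comment, wrapped in a function. */
static void s1244_kernel(void)
{
    assert(LEN_1D >= 2); /* the loop needs at least one a[i+1] */
    for (int i = 0; i < LEN_1D - 1; i++) {
        a[i] = b[i] + c[i] * c[i] + b[i] * b[i] + c[i];
        /* a[i+1] still holds its old value at this point. */
        d[i] = a[i] + a[i + 1];
    }
}
```

Calling s1244_init(1.0f, 2.0f) and then s1244_kernel() gives a small, deterministic input for comparing the code generated at different optimization levels or -march settings.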
---
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110148
cuilili changed:
What|Removed |Added
CC||lili.cui at intel dot com
--- Comment #2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #14 from cuilili ---
This regression has been fixed with the commit below and we can close this
ticket.
https://gcc.gnu.org/g:1b9a5cc9ec08e9f239dd2096edcc447b7a72f64a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038
--- Comment #5 from cuilili ---
(In reply to Martin Jambor from comment #4)
> So is this now fixed?
Yes, the case in the attachment has been fixed.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110038
--- Comment #2 from cuilili ---
(In reply to Richard Biener from comment #1)
> Probably best to limit the values to reassoc-width by adding the
> appropriate IntegerRange attribute in params.opt
>
> IntegerRange(0, 256)
>
> maybe?
"rewrite_ex
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #12 from cuilili ---
This regression is caused by a store forwarding issue. Inlining eliminates the
two redundant pairs of loads and stores that trigger it.
This regression has been fixed by
https://gcc.gnu.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163
Bug 26163 depends on bug 105493, which changed state.
Bug 105493 Summary: [12/13 Regression] x86_64 538.imagick_r 6% regressions and
2% 525.x264_r regressions on Alder Lake after r12-7319-g90d693bdc9d718
https://gcc.gnu.org/bugzilla/show_bug.cgi?
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493
cuilili changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105493
--- Comment #2 from cuilili ---
(In reply to Richard Biener from comment #1)
> Martin is currently re-benchmarking GCC 12 on AMD, so let's see if there's
> anything left on those.
AMD may not have this issue, Richard fixed AMD regression with t
Version: 13.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: lili.cui at intel dot com
Target Milestone: ---
Similar issue with https://gcc.gnu.org/bugzilla
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #11 from cuilili ---
(In reply to Jakub Jelinek from comment #10)
> And for the backend, the question is how big the penalty for the overlapping
> store is compared to doing multiple non-overlapping stores. Say for those
> 49 bytes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #9 from cuilili ---
I really appreciate your reply. I debugged the SRA pass with the small testcase
and found that SRA does not handle this situation.
SRA cannot split callee's first parameter for "Do not decompose non-BLKmode
paramet
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #7 from cuilili ---
Created attachment 52706
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52706&action=edit
Add a heuristic to eliminate redundant loads and stores in the inline pass.
Hi Richard,
Could you help take a look? This
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104271
--- Comment #6 from cuilili ---
I created a patch to fix this regression. The patch is under performance
testing. I will send it out later.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #9 from cuilili ---
(In reply to cuilili from comment #3)
> (In reply to Hongtao.liu from comment #1)
> > STF issue here?
>
Correction to comment #3:
I used perf to collect the "ld_blocks.store_forward" event for those two test
cases, stl
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #3 from cuilili ---
(In reply to Hongtao.liu from comment #1)
> STF issue here?
Yes. Since "YMMWORD PTR [rsp-72]" crosses a cache line, there is an STLF issue
here.
vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from [rsp-7
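
The address arithmetic behind the cache-line claim can be sketched as below. This is only an illustration: the 64-byte line size, the helper name crosses_cache_line, and any concrete rsp value are assumptions, and the sketch says nothing about whether store forwarding actually fails on a given core (comment #9 later corrects this comment).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CACHE_LINE 64u /* assumed cache line size in bytes */

/* Hypothetical helper: true if [addr, addr + size) spans two or more lines. */
static bool crosses_cache_line(uintptr_t addr, unsigned size)
{
    assert(size > 0);
    return (addr / CACHE_LINE) != ((addr + size - 1) / CACHE_LINE);
}
```

With a hypothetical 64-byte-aligned rsp of 0x1000, a 32-byte access at rsp-72 starts at offset 56 within its line and therefore spills into the next one, while a 32-byte access at rsp-64 stays within a single line.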
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #28 from cuilili ---
(In reply to H.J. Lu from comment #25)
> Can this be mitigated by removing redundant load and store?
Yes, inlining say_sphere can remove redundant loads and stores, O3 does
inlining, but O2 is more sensitive to c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
--- Comment #24 from cuilili ---
(In reply to cuilili from comment #23)
> (In reply to Richard Biener from comment #17)
> > I do wonder though how CLX is fine with such access pattern ;) (did you
> > test
> > with just -O2?)
>
Sorry, correct
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101908
cuilili changed:
What|Removed |Added
CC||lili.cui at intel dot com
--- Comment #23
Assignee: unassigned at gcc dot gnu.org
Reporter: lili.cui at intel dot com
Target Milestone: ---
Intel Tiger Lake needs to support CET, so add PTA_SHSTK to -march=tigerlake.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95525
cuilili changed:
What|Removed |Added
Resolution|--- |FIXED
Status|NEW
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: lili.cui at intel dot com
Target Milestone: ---
In GCC trunk, there is a bitmask conflict between PTA_AVX512VP2INTERSECT and
PTA_WAITPKG in gcc/config/i386/i386.h:
const wide_int_bitmask
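
To show why two flags sharing one bit is a problem, here is a minimal sketch that uses plain 64-bit masks instead of GCC's wide_int_bitmask. The flag names, the bit positions, and the has_feature helper are hypothetical and do not reproduce the actual definitions in i386.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flags: FEATURE_B mistakenly reuses FEATURE_A's bit. */
#define FEATURE_A       (UINT64_C(1) << 9)
#define FEATURE_B       (UINT64_C(1) << 9)  /* conflict with FEATURE_A */
#define FEATURE_B_FIXED (UINT64_C(1) << 10) /* distinct bit */

/* A compile-time check like this rejects an accidental overlap. */
static_assert((FEATURE_A & FEATURE_B_FIXED) == 0, "flags must not overlap");

/* Hypothetical helper: test whether a flag is set in a mask. */
static bool has_feature(uint64_t mask, uint64_t flag)
{
    return (mask & flag) != 0;
}
```

With the conflicting definitions, a mask that enables only FEATURE_A also reports FEATURE_B as enabled; with the fixed definition it does not.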