https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #12 from Jakub Jelinek ---
The overlapping stores happen due to TARGET_OVERLAP_OP_BY_PIECES_P returning
true since PR90773.
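To make the trade-off discussed in this thread concrete, here is a minimal sketch of two ways to copy the 49-byte object from the testcase in 32-byte-or-smaller pieces. It is only an illustration and not GCC's actual by-pieces expansion. The helper names (copy49_pieces, copy49_overlap, copies_agree) are hypothetical, and plain memcpy calls stand in for the vector stores.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { N = 49, CHUNK = 32 };  /* 48 characters plus the terminating NUL */

/* Non-overlapping split: 32 + 16 + 1 bytes, three separate copies. */
static void copy49_pieces(char *dst, const char *src)
{
    memcpy(dst, src, 32);
    memcpy(dst + 32, src + 32, 16);
    memcpy(dst + 48, src + 48, 1);
}

/* Overlapping split: two 32-byte copies. The second one starts at
   offset N - CHUNK = 17, so bytes 17..31 are written twice. */
static void copy49_overlap(char *dst, const char *src)
{
    memcpy(dst, src, CHUNK);
    memcpy(dst + (N - CHUNK), src + (N - CHUNK), CHUNK);
}

/* Returns 1 if both strategies produce the same 49 bytes as the source. */
static int copies_agree(void)
{
    const char src[] = "012345678901234567890123456789012345678901234567";
    char a[N], b[N];

    memset(a, 0x11, sizeof a);
    memset(b, 0x22, sizeof b);
    copy49_pieces(a, src);
    copy49_overlap(b, src);
    return sizeof src == N
        && memcmp(a, src, N) == 0
        && memcmp(b, src, N) == 0;
}
```

The overlapping variant needs one copy fewer, at the cost of writing 15 bytes twice and using a second access that is not aligned to its own size. Whether that is faster than the three-piece version is the backend question raised in comment #10.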
--- Comment #11 from cuilili ---
(In reply to Jakub Jelinek from comment #10)
> And for the backend, the question is how big the penalty for the overlapping
> store is compared to doing multiple non-overlapping stores. Say for those
> 49 bytes
--- Comment #10 from Jakub Jelinek ---
(In reply to H.J. Lu from comment #8)
> > DSE can remove redundant load/store for TI, but not OI/XI.
DSE can remove redundant load/store for OI/XI just fine, just remove the last 7
from the string so that
--- Comment #9 from cuilili ---
(In reply to cuilili from comment #3)
> (In reply to Hongtao.liu from comment #1)
> > STF issue here?
>
Correction to comment #3:
I used perf to collect the "ld_blocks.store_forward" event for those two test
cases, stl
--- Comment #8 from H.J. Lu ---
(In reply to H.J. Lu from comment #7)
> (In reply to Jakub Jelinek from comment #6)
> > Started with r12-2666-g29f0e955c97da002b5adb4e8c9dfd2ea9709e207
>
> DSE can remove redundant load/store for TI, but not OI/X
--- Comment #7 from H.J. Lu ---
(In reply to Jakub Jelinek from comment #6)
> Started with r12-2666-g29f0e955c97da002b5adb4e8c9dfd2ea9709e207
DSE can remove redundant load/store for TI, but not OI/XI.
Jakub Jelinek changed:
What|Removed |Added
Keywords|needs-bisection |
--- Comment #6 from Jakub Jelinek ---
Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org
--- Comment #5
Richard Biener changed:
What|Removed |Added
Ever confirmed|0 |1
Last reconfirmed|
--- Comment #3 from cuilili ---
(In reply to Hongtao.liu from comment #1)
> STF issue here?
Yes. Since the access to "YMMWORD PTR [rsp-72]" crosses a cache line, there is
an STLF (store-to-load forwarding) issue here.
vmovdqu64 YMMWORD PTR [rsp-72], ymm31 --> store 32 bytes from [rsp-7
--- Comment #2 from Hongtao.liu ---
Updated testcase:
void f256(char *a)
{
char t[] = "012345678901234567890123456789012345678901234567";
__builtin_memcpy(a, &t[0], sizeof(t));
}
--- Comment #1 from Hongtao.liu ---
(In reply to Hongtao.liu from comment #0)
> bool f256(char *a)
> {
> char t[] = "012345678901234567890123456789012345678901234567";
> return __builtin_memcpy(a, &t[0], sizeof(t)) == 0;
> }
>
> https://god