https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104723
--- Comment #3 from cuilili <lili.cui at intel dot com> ---
(In reply to Hongtao.liu from comment #1)
> STF issue here?

Yes. Since "YMMWORD PTR [rsp-72]" crosses a cache line, there is an STLF (store-to-load forwarding) issue here.

vmovdqu64 YMMWORD PTR [rsp-72], ymm31   --> stores 32 bytes at [rsp-72], crossing a cache line
vmovdqu64 YMMWORD PTR [rsp-55], ymm31   --> overwrites part of YMMWORD PTR [rsp-72]
vmovdqu64 ymm31, YMMWORD PTR [rsp-72]   --> STLF with the first instruction, and it incurs a penalty
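
To make the byte ranges concrete, here is a rough Python sketch of the arithmetic behind the three instructions. It is only an illustration: the 64-byte cache line size, the 64-byte-aligned value chosen for rsp, and the helper names (byte_range, crosses_cache_line, overlap) are all assumptions, not something taken from the bug report. The real alignment of rsp depends on the caller.

```python
CACHE_LINE = 64  # assumed cache line size in bytes


def byte_range(rsp, offset, size=32):
    """Bytes touched by a 32-byte (YMM) access at [rsp + offset]."""
    start = rsp + offset
    return range(start, start + size)


def crosses_cache_line(r):
    """True if the first and last byte fall in different cache lines."""
    return r[0] // CACHE_LINE != r[-1] // CACHE_LINE


def overlap(a, b):
    """Number of bytes shared by two contiguous byte ranges."""
    return max(0, min(a.stop, b.stop) - max(a.start, b.start))


# Hypothetical stack pointer, assumed to be 64-byte aligned.
rsp = 0x7FFC_0000_1000

store1 = byte_range(rsp, -72)  # vmovdqu64 YMMWORD PTR [rsp-72], ymm31
store2 = byte_range(rsp, -55)  # vmovdqu64 YMMWORD PTR [rsp-55], ymm31
load = byte_range(rsp, -72)    # vmovdqu64 ymm31, YMMWORD PTR [rsp-72]

print("first store crosses a cache line:", crosses_cache_line(store1))
print("bytes of the first store overwritten by the second:", overlap(store1, store2))
print("bytes of the load covered by the second store:", overlap(load, store2))
```

Under these assumptions, the access at [rsp-72] spans rsp-72 through rsp-41 and so straddles the line boundary at rsp-64, while the second store at [rsp-55] rewrites the upper part of that same region before the load reads it back.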