https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84481

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3                          |P2
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2019-04-11
     Ever confirmed|0                           |1

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Martin Liška from comment #3)
> Interesting. Do I understand that correctly that it's due to increasing
> addresses of the 3 load instructions: 0x8(%rdx), 0x18(%rdx), 0x30(%rdx) vs.
> 0x18(%rdx) 0x30(%rdx) 0x8(%rdx) ?

I would guess that the hardware prefetcher might be sensitive to this.  But
note that depending on the frontend any two of the loads might issue in
parallel.

It seems this is some kind of list-walking so HW prefetching possibly
doesn't (and should not) trigger.

Anyway, it's probably a cache subsystem "issue".  Ordering memory
references by address might be an interesting post-reload scheduling
heuristic we could employ here.
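
To make the access pattern concrete, here is a minimal sketch of a list walk
that reads three fields at the offsets quoted above.  Only the offsets 0x8,
0x18 and 0x30 come from this report; the struct layout, field names and
function names are hypothetical, and the layout assumes 8-byte pointers
(e.g. x86-64).  The two functions differ only in the source order of the
loads, and the compiler is free to reorder them, so the emitted order has to
be checked in the disassembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node layout: only the offsets 0x8, 0x18 and 0x30 are taken
   from the report.  Assumes 8-byte pointers. */
struct node {
    struct node *next;  /* 0x00 */
    int64_t a;          /* 0x08 */
    int64_t pad0;       /* 0x10 */
    int64_t b;          /* 0x18 */
    int64_t pad1[2];    /* 0x20, 0x28 */
    int64_t c;          /* 0x30 */
};

static_assert(offsetof(struct node, a) == 0x08, "a must sit at 0x8");
static_assert(offsetof(struct node, b) == 0x18, "b must sit at 0x18");
static_assert(offsetof(struct node, c) == 0x30, "c must sit at 0x30");

/* Loads written in increasing address order: 0x8, 0x18, 0x30. */
int64_t walk_ascending(const struct node *n)
{
    int64_t sum = 0;
    for (; n != NULL; n = n->next) {
        int64_t a = n->a;
        int64_t b = n->b;
        int64_t c = n->c;
        sum += a + b + c;
    }
    return sum;
}

/* Loads written in the rotated order: 0x18, 0x30, 0x8. */
int64_t walk_rotated(const struct node *n)
{
    int64_t sum = 0;
    for (; n != NULL; n = n->next) {
        int64_t b = n->b;
        int64_t c = n->c;
        int64_t a = n->a;
        sum += a + b + c;
    }
    return sum;
}
```

Both functions compute the same result; the sketch is only useful for
comparing the order of the three loads in the generated code, not as a
reproducer for the slowdown.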
