https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102294

--- Comment #29 from H.J. Lu <hjl.tools at gmail dot com> ---
(In reply to Mateusz Guzik from comment #28)
> (In reply to H.J. Lu from comment #27)
> > (In reply to Mateusz Guzik from comment #26)
> > > 4 stores per loop is best
> > 
> > Do you have data to show it?
> 
> I used to, but I'm out of this game.
> 
> However, this is what gcc is already emitting if you explicitly ask it for
> unrolled loops, so I don't think this bit should be controversial.

It is hard to believe that 8 stores are slower than a loop.
