http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57908

--- Comment #10 from Yann Droneaud <yann at droneaud dot fr> ---
(In reply to Andrew Pinski from comment #9)
> (In reply to Yann Droneaud from comment #8)
> > Could someone comment on which optimisation is achieved by aligning such
> > small arrays ?
> 
> The simple answer is so each array is more likely to fit into a cache line:
>    One use of this macro is to increase alignment of medium-size
>    data to make it all fit in fewer cache lines.  */

Thanks for the investigation.

Initially I thought it would be better to "pack" such arrays so that they fill
whole cache lines: fewer cache lines would be used, and most of the arrays
would already be in cache.
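
As a rough illustration of the "fewer cache lines" argument, here is a minimal
sketch that counts how many cache lines an object touches. The 64-byte line
size and the helper name lines_spanned are assumptions made for the example;
nothing here is taken from GCC itself.

```c
#include <stddef.h>

#define CACHE_LINE 64u  /* assumption: 64-byte cache lines */

/* Number of cache lines touched by an object of `size` bytes that
 * starts `offset` bytes past a cache-line boundary (hypothetical helper). */
static size_t lines_spanned(size_t offset, size_t size)
{
    if (size == 0)
        return 0;
    size_t first = offset / CACHE_LINE;              /* line holding the first byte */
    size_t last  = (offset + size - 1) / CACHE_LINE; /* line holding the last byte  */
    return last - first + 1;
}
```

With these assumptions, a 48-byte array placed at offset 40 straddles two
lines, while the same array aligned at offset 64 stays within one.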

But according to http://stackoverflow.com/a/7281770:

"On x86 cache lines are 64 bytes, however, to prevent false sharing, you need
to follow the guidelines of the processor you are targeting (intel has some
special notes on its netburst based processors), generally you need to align to
64 bytes for this (intel states that you should also avoid crossing 16 byte
boundries)."
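
The false-sharing point in that quote could be sketched in C11 as follows.
This is only an illustration: the 64-byte line size, the struct name
padded_counter and the helper counters_on_distinct_lines are assumptions, not
something GCC or the quoted answer prescribes.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

#define CACHE_LINE 64  /* assumption: 64-byte cache lines, as in the quote */

/* Each counter is aligned to its own cache line, so two CPUs updating
 * different counters do not write to the same line. */
struct padded_counter {
    alignas(CACHE_LINE) unsigned long value;
};

/* The alignment also pads the struct out to a full line. */
static_assert(sizeof(struct padded_counter) == CACHE_LINE,
              "one counter per cache line");

static struct padded_counter counters[2];

/* Returns nonzero when the two counters live on different cache lines. */
static int counters_on_distinct_lines(void)
{
    uintptr_t a = (uintptr_t)&counters[0] / CACHE_LINE;
    uintptr_t b = (uintptr_t)&counters[1] / CACHE_LINE;
    return a != b;
}
```

The cost is the one the packing idea tries to avoid: each small object now
occupies a whole line by itself.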

This starts to make sense to me.

I'm inclined to accept the argument for global variables, but local variables
are probably not shared much across CPUs. I have no data to back this up,
though, so I won't pursue that line further.

Thanks a lot for answering my question.
