Hello,
On 19 May 09:58, H.J. Lu wrote:
> On Mon, May 19, 2014 at 9:45 AM, Uros Bizjak <ubiz...@gmail.com> wrote:
> > On Mon, May 19, 2014 at 6:42 PM, H.J. Lu <hjl.to...@gmail.com> wrote:
> >
> >>>> Uros,
> >>>> I am looking into libreoffice size and the data alignment seems to make huge
> >>>> difference. Data section has grown from 5.8MB to 6.3MB in between GCC 4.8 and 4.9,
> >>>> while clang produces 5.2MB.
> >>>>
> >>>> The two patches I posted to not align vtables and RTTI reduces it to 5.7MB, but
> >>>> But perhaps we want to revisit the alignment rules. The optimization manuals
> >>>> usually care only about performance critical loops. Perhaps we can make the
> >>>> rules to align only bigger datastructures, or so at least for -O2.
> >>>
> >>> Based on the above quote, "Misaligned data access can incur
> >>> significant performance penalties." and the fact that this particular
> >>> alignment rule has some compatibility issues with previous versions of
> >>> gcc (these were later fixed by Jakub), I'd rather leave this rule as
> >>> is. However, if the access is from the cold section, we can perhaps
> >>> avoid extra alignment, while avoiding those compatibility issues.
> >>>
> >>
> >> It is excessive to align
> >>
> >> struct foo
> >> {
> >>   int x1;
> >>   int x2;
> >>   char x3;
> >>   int x4;
> >>   int x5;
> >>   char x6;
> >>   int x7;
> >>   int x8;
> >> };
> >>
> >> to 32 bytes and align
> >>
> >> struct foo
> >> {
> >>   int x1;
> >>   int x2;
> >>   char x3;
> >>   int x4;
> >>   int x5;
> >>   char x6;
> >>   int x7[9];
> >>   int x8;
> >> };
> >>
> >> to 64 bytes. What performance gain does it provide?
> >
> > Avoids "significant performance penalties," perhaps?
> >
>
> Kirill, do we have performance data for excessive alignment
> vs ABI alignment?

Nope, we have no actual data showing performance impact on such changes, sorry.
We could try such a change on an HSW machine (on SPEC 2006); would that be useful?

--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -26576,7 +26576,7 @@ ix86_data_alignment (tree type, int align, bool opt)
      used to assume.  */
 
   int max_align_compat
-    = optimize_size ? BITS_PER_WORD : MIN (256, MAX_OFILE_ALIGNMENT);
+    = optimize_size ? BITS_PER_WORD : MIN (128, MAX_OFILE_ALIGNMENT);
 
   /* A data structure, equal or greater than the size of a cache line
      (64 bytes in the Pentium 4 and other recent Intel processors, including

> --
> H.J.

--
Thanks, K
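
To make the size cost under discussion concrete, here is a minimal C sketch of how raising per-object alignment inflates a data section. It is only an illustration under stated assumptions: `round_up` and `layout_size` are hypothetical helpers and not GCC code, the two structs are the ones from the quoted message (renamed `foo_small` and `foo_large` so both can coexist), and the model simply places objects back to back, each at the next multiple of its alignment.

```c
#include <assert.h>
#include <stddef.h>

/* The two structs from the quoted message, renamed so both can coexist. */
struct foo_small {
  int x1; int x2; char x3; int x4; int x5; char x6; int x7; int x8;
};

struct foo_large {
  int x1; int x2; char x3; int x4; int x5; char x6; int x7[9]; int x8;
};

/* Round OFFSET up to the next multiple of ALIGN (ALIGN must be a power of two). */
static size_t
round_up (size_t offset, size_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (offset + align - 1) & ~(align - 1);
}

/* Hypothetical helper: total bytes occupied by N objects placed one after
   another, each starting at the next multiple of its own alignment.
   This is a toy model of a data section, not what the linker really does.  */
static size_t
layout_size (const size_t *sizes, const size_t *aligns, size_t n)
{
  size_t end = 0;
  for (size_t i = 0; i < n; i++)
    end = round_up (end, aligns[i]) + sizes[i];
  return end;
}
```

With a toy sequence of a 1-byte flag, a 32-byte object, another 1-byte flag and a 64-byte object, `layout_size` gives 104 bytes when the two larger objects need only 4-byte alignment and 192 bytes when they are raised to 32 and 64 bytes. On x86-64 with 4-byte `int`, `sizeof (struct foo_small)` and `sizeof (struct foo_large)` should be 32 and 64, so the same comparison applies to the structs above. The sketch says nothing about the performance side of the trade-off, which is the open question in this thread.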