The e500v2 (SPE) hardware is such that if the address of vector (double world load / stores) are not double world aligned the instruction will trap.
So this alignment is not optional. Edmar On Fri, Jun 7, 2013 at 3:43 PM, Richard Henderson <r...@redhat.com> wrote: > On 06/07/2013 12:25 PM, Jakub Jelinek wrote: >> This PR is about DATA_ALIGNMENT macro increasing alignment of some decls >> for optimization purposes beyond ABI mandated levels. It is fine to emit >> the vars aligned as much as we want for optimization purposes, but if we >> can't be sure that references to that decl bind to the definition we >> increased the alignment on (e.g. common variables, or -fpic code without >> hidden visibility, weak vars etc.), we can't assume that alignment. > > When the linker merges common blocks, it chooses both maximum size and maximum > alignment. Thus for any common block for which we can prove the block must > reside in the module (any executable, or hidden common in shared object), we > can go ahead and use the increased alignment. > > It's only in shared objects with non-hidden common blocks that we have a > problem, since in that case we may resolve the common block via copy reloc to > a > memory block in another module. > > So while decl_binds_to_current_def_p is a good starting point, I think we can > do a little better with common blocks. Which ought to take care of those > vectorization regressions you mention. > >> @@ -966,8 +966,12 @@ align_variable (tree decl, bool dont_out >> align = MAX_OFILE_ALIGNMENT; >> } >> >> - /* On some machines, it is good to increase alignment sometimes. */ >> - if (! DECL_USER_ALIGN (decl)) >> + /* On some machines, it is good to increase alignment sometimes. >> + But as DECL_ALIGN is used both for actually emitting the variable >> + and for code accessing the variable as guaranteed alignment, we >> + can only increase the alignment if it is a performance optimization >> + if the references to it must bind to the current definition. */ >> + if (! DECL_USER_ALIGN (decl) && decl_binds_to_current_def_p (decl)) >> { >> #ifdef DATA_ALIGNMENT >> unsigned int data_align = DATA_ALIGNMENT (TREE_TYPE (decl), align); >> @@ -988,12 +992,69 @@ align_variable (tree decl, bool dont_out >> } >> #endif >> } >> +#ifdef DATA_ABI_ALIGNMENT >> + else if (! DECL_USER_ALIGN (decl)) >> + { >> + unsigned int data_align = DATA_ABI_ALIGNMENT (TREE_TYPE (decl), >> align); >> + /* For backwards compatibility, don't assume the ABI alignment for >> + TLS variables. */ >> + if (! DECL_THREAD_LOCAL_P (decl) || data_align <= BITS_PER_WORD) >> + align = data_align; >> + } >> +#endif > > This structure would seem to do the wrong thing if DATA_ABI_ALIGNMENT is > defined, but DATA_ALIGNMENT isn't. And while I realize you documented it, I > don't like the restriction that D_A /must/ return something larger than D_A_A. > All that means is that in complex cases D_A will have to call D_A_A itself. > > I would think that it would be better to rearrange as > > if (!D_U_A) > { > #ifdef D_A_A > align = ... > #endif > #ifdef D_A > if (d_b_t_c_d_p) > align = ... > #endif > } > > Why the special case for TLS? If we really want that special case surely that > test should go into D_A_A itself, and not here in generic code. > >> Bootstrapped/regtested on x86_64-linux and i686-linux. No idea about other >> targets, I've kept them all using DATA_ALIGNMENT, which is considered >> optimization increase only now, if there is some ABI mandated alignment >> increase on other targets, that should be done in DATA_ABI_ALIGNMENT as >> well as DATA_ALIGNMENT. > > I've had a brief look over the instances of D_A within the tree atm. Most of > them carry the cut-n-paste comment "for the same reasons". These I believe > never intended an ABI change, and were really only interested in optimization. > > But these I think require a good hard look to see if they really intended an > ABI alignment: > > c6x comment explicitly mentions abi > cris compiler options for alignment -- systemwide or local? > mmix comment mentions GETA instruction > s390 comment mentions LARL instruction > rs6000 SPE and E500 portion of the alignment non-optional? > > Relevant port maintainers CCed. > > > r~