> Hi. > > I'm suggesting to restrict LOOP_ALIGN to only loop headers. That are the > basic blocks for which it makes the biggest sense. I quite some binary > size reductions on SPEC2006 and SPEC2017. Speed numbers are also slightly > positive. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > > Ready to be installed? The original idea of distinction between jump alignment and loop alignment was that they have two basic meanings: 1) jump alignment is there to avoid jumping just to the end of decode window (if the window is aligned) so CPU will get stuck after reaching the jump and also to possibly reduce code cache polution by populating by code that is not executed 2) loop alignment aims to fit loop in as few cache windows as possible
Now if you have loop laid in a way that header of loop is not first basic block, 2) IMO still apply. I.e. jump loop :loopback loop body :loop if cond jump to loopback So dropping loop alignment for those does not seem to make much sense from high level. We may want to have differnt alignment for loops starting by header and loops starting in the middle, but I still liked more your patch which did bundles for loops. modern x86 chips are not very good testing targets on it. I guess generic changes to alignment needs to be tested on other chips too. Honza > Thanks, > Martin > > gcc/ChangeLog: > > 2019-07-09 Martin Liska <mli...@suse.cz> > > * final.c (compute_alignments): Apply the LOOP_ALIGN only > to basic blocks that all loop headers. > --- > gcc/final.c | 1 + > 1 file changed, 1 insertion(+) > > > diff --git a/gcc/final.c b/gcc/final.c > index fefc4874b24..ce2678da988 100644 > --- a/gcc/final.c > +++ b/gcc/final.c > @@ -739,6 +739,7 @@ compute_alignments (void) > if (has_fallthru > && !(single_succ_p (bb) > && single_succ (bb) == EXIT_BLOCK_PTR_FOR_FN (cfun)) > + && bb->loop_father->header == bb > && optimize_bb_for_speed_p (bb) > && branch_count + fallthru_count > count_threshold > && (branch_count >