On Thu, Sep 24, 2015 at 12:32:59PM +0200, Bernd Schmidt wrote: > On 09/24/2015 12:06 AM, Segher Boessenkool wrote: > >This is the meat of this series: a new algorithm to do basic block > >reordering. It uses the simple greedy approach to maximum weighted > >matching, where the weights are the predicted execution frequency of > >the edges. This always finds a solution that is within a factor two > >of optimal, if you disregard loops (which we cannot allow) and the > >complications of block partitioning. > > Looks really good for the most part. > > The comment at the top of the file should be updated to mention both > algorithms.
Will do. > >+ /* Sort the edges, the most desirable first. */ > >+ > >+ std::stable_sort (edges, edges + n, edge_order); > > Any thoughts on this vs qsort? Do you need a stable sort? We always need stable sorts in GCC; things are not reproducible across targets with qsort (not every qsort is the same). Also, you sometimes have two edges back-to-back, with the same weights; a stable sort ensures we don't put a jump in the middle of that if we can help it. > >+ int j; > >+ for (j = 0; j < n; j++) > > for (int j ... > here and in the other loop that uses j. That is so ugly. Will change though :-) > >+ /* If the entry edge no longer falls through we have to make a new > >+ block so it can do so again. */ > >+ > >+ edge e = EDGE_SUCC (ENTRY_BLOCK_PTR_FOR_FN (cfun), 0); > >+ if (e->dest != ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux) > >+ { > >+ force_nonfallthru (e); > >+ e->src->aux = ENTRY_BLOCK_PTR_FOR_FN (cfun)->aux; > >+ BB_COPY_PARTITION (e->src, e->dest); > >+ } > >+} > > That's a bit odd, can this situation be prevented earlier? We could always add an extra jump at the start, but that's about the same code so not helpful. > Why wouldn't we force the entry edge to fall thru? Because it pessimises code. If the model thinks the first block after entry belongs somewhere in the middle of a fall-through sequence, it usually is right (typical examples are loops that start with the loop test). This algorithm does not peel loops (it does no duplication at all). All the optimisable blocks end with an unconditional jump, and this algo tries to remove as many of those as it can; logically, the start block has such a jump as well. Segher