https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707
--- Comment #24 from rguenther at suse dot de <rguenther at suse dot de> ---
On Tue, 22 Dec 2015, alalaw01 at gcc dot gnu.org wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68707
>
> --- Comment #23 from alalaw01 at gcc dot gnu.org ---
> Yes, difficult. I'm conscious that this is stage 3, and worried about adding
> too much complexity, especially if we're writing code that we'd eventually
> drop in favour of a more complete framework later (i.e. in gcc7).
>
> I'm inclined against
>
> > (I wondered
> > if load-lanes would require more unrolling we should prefer SLP anyway?).
>
> As we've seen cases where load-lanes requires more unrolling but the code is
> still much better. Likewise your argument against
>
> > to query whether _all_ loads need to be permuted with SLP
> ...
> > thus if there is a load node which is not permuted then retain the SLP.
>
> seems convincing. I think the heuristic in comment 16 handles permutation well
> enough, and beyond that, sharing (rather than the permutation) then appears to
> be the critical factor. Unfortunately as you say SLP doesn't really handle
> sharing yet...so
>
> > I fear that to get a better heuristic
> > than what is proposed we need to push this for example to
> > vect_make_slp_decision where all instances are built
>
> Might be reasonable, but I fear it'd be of dubious benefit without:
>
> > and we'd need to gather some sharing data therein.
>
> I guess if that were a useful step towards
>
> > But then there is only a small step to the point where we could actually
> > compare SLP vs. non-SLP costs.
>
> then there is some justification, but the former feels like too much
> complexity at this stage - especially to do it well; how much do we really
> want to gather data on the sharing that exists at present, rather than
> looking at removing that sharing entirely? I'm thinking of e.g. SLP nodes
> that are performing the same computations but with different permutations
> too - shouldn't we be aiming at making permutations into first class
> citizens/operations, and making SLP trees into DAGs? Longer-term goals,
> sure...
>
> So my instinct is to go with the comment 16 patch, and accept that we take the
> hit in that last testcase (i.e. the one with the sharing).

Works for me - can you get the patch "approved" on the ml then?