On Mon, Nov 9, 2015 at 4:55 AM, Richard Biener <rguent...@suse.de> wrote:
>
> Currently BB vectorization computes all dependences inside a BB
> region and fails all vectorization if it cannot handle some of them.
>
> This is obviously not needed - BB vectorization can restrict the
> dependence tests to those that are needed to apply the load/store
> motion effectively performed by the vectorization (sinking all
> participating loads/stores to the place of the last one).
>
> With restructuring it that way it's also easy to not give up completely
> but only for the SLP instance we cannot vectorize (this gives
> a slight bump in my SPEC CPU 2006 testing to 756 vectorized basic
> block regions).
>
> But first and foremost this patch is to reduce the dependence analysis
> cost and somewhat mitigate the compile-time effects of the first patch.
>
> For fixing PR56118 only a cost model issue remains.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
>
> Richard.
>
> 2015-11-09  Richard Biener  <rguent...@suse.de>
>
>         PR tree-optimization/56118
>         * tree-vectorizer.h (vect_find_last_scalar_stmt_in_slp): Declare.
>         * tree-vect-slp.c (vect_find_last_scalar_stmt_in_slp): Export.
>         * tree-vect-data-refs.c (vect_slp_analyze_node_dependences): New
>         function.
>         (vect_slp_analyze_data_ref_dependences): Instead of computing
>         all dependences of the region DRs just analyze the code motions
>         SLP vectorization will perform.  Remove SLP instances that
>         cannot have their store/load motions applied.
>         (vect_analyze_data_refs): Allow DRs without a vectype
>         in BB vectorization.
>
This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68492

H.J.
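
For readers following the thread, the restricted dependence test described in the quoted message can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not the GCC implementation: a basic block is a list of statements that each touch at most one named memory location, an SLP instance is a list of statement indices, and two accesses conflict when they name the same location and at least one is a store. The names Stmt, conflicts, can_sink_group and vectorizable_instances are hypothetical and do not appear in the patch.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Stmt:
    kind: str                  # "load", "store", or "other"
    loc: Optional[str] = None  # memory location touched, if any


def conflicts(a: Stmt, b: Stmt) -> bool:
    """Toy dependence test: same location and not both loads."""
    if a.loc is None or b.loc is None:
        return False
    if a.kind == "load" and b.kind == "load":
        return False
    return a.loc == b.loc


def can_sink_group(block: List[Stmt], group: List[int]) -> bool:
    """Check whether every access in the group can be sunk to the
    position of the last group member. Only the statements that an
    access would move across are tested, not the whole block."""
    last = max(group)
    members = set(group)
    for i in group:
        for j in range(i + 1, last):
            if j in members:
                continue  # group members move together
            if conflicts(block[i], block[j]):
                return False
    return True


def vectorizable_instances(block: List[Stmt],
                           instances: List[List[int]]) -> List[List[int]]:
    """Drop only the instances whose motion is blocked, keep the rest."""
    return [g for g in instances if can_sink_group(block, g)]


block = [
    Stmt("store", "a[0]"),  # 0
    Stmt("load", "a[0]"),   # 1: blocks sinking statement 0
    Stmt("store", "a[1]"),  # 2
    Stmt("store", "b[0]"),  # 3
    Stmt("load", "c[0]"),   # 4: unrelated location
    Stmt("store", "b[1]"),  # 5
]
print(vectorizable_instances(block, [[0, 2], [3, 5]]))  # prints [[3, 5]]
```

In this model the first instance is rejected because the store to a[0] would have to move past a load of a[0], while the second instance survives. That mirrors the per-instance behaviour the message describes, as opposed to giving up on the whole region.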