On Fri, Aug 30, 2013 at 11:34 PM, Xinliang David Li <davi...@google.com> wrote: > On Fri, Aug 30, 2013 at 1:23 AM, Richard Biener > <richard.guent...@gmail.com> wrote: >> On Fri, Aug 30, 2013 at 1:28 AM, Xinliang David Li <davi...@google.com> >> wrote: >>> I was debugging a runtime failure of SPEC06 xalancbmk built with LIPO. >>> Using -fdisable-<pass> option pinpoints the problem in slp vectorize >>> pass on a particular function. dbgcnt support is added to to track >>> down the individual BB, but it fails even when the dbg count is set >>> to 0. >>> >>> It turns out that no BB was actually vectorized for that function, but >>> turning on/off slp-vectorize does make a difference in generated code >>> -- the only difference between the good and bad case is stack layout. >>> The problem is in the alignment analysis phase -- which >>> speculatively changes the base declaration's alignment regardless >>> whether the vectorization transformation will be performed or not >>> later. >>> >>> The attached patch fixes the problem. Testing is ok. Ok for trunk? >> >> Not in this form. I'd rather not put extra fields in the data-refs this way. >> (As for the xalancbmk runtime problem - doesn't this patch just paper >> over the real issue?) > > I believe it is stack-limit related. This program has some recursive > call chains that can generate a call frame ~9k deep. The vectorizer > side effect causes the affected function in the call frame to grow > ~100 byte in stack size. Since this function appears lots of times in > the callstack, the overall stack consumption increase a lot. Combined > with the aggressive cross module inlining, it ends up blowing up the > stack limit. > > >> >> For BB SLP you still adjust DR bases that do not take part in >> vectorization - the DR vector contains all refs in the BB, not just >> those in the SLP instances we are going to vectorize. So the >> commit of the re-aligning decision should be done from where >> we vectorize the DR, in vectorizable_load/store in its transform >> phase. >> >> If we decide to integrate the fields into the data-ref then the >> analysis and commit parts should go into the data-ref machinery >> as well. Otherwise the vectorizer should use data-ref->aux or some >> other way to hang off its private data. >> > > Good point. > >> Other than that, modifying alignment of variables that are not >> vectorized is indeed a bug worth fixing. > > The new version of the patch is attached. Ok for trunk after testing?
+/* A helper function to free data refs. */ + +void +destroy_datarefs (loop_vec_info loop_vinfo, bb_vec_info bb_vinfo) please rename to vect_destroy_datarefs @@ -431,6 +460,7 @@ static unsigned int execute_vect_slp (void) { basic_block bb; + bb_vec_info bb_vinfo; init_stmt_vec_info_vec (); @@ -438,8 +468,11 @@ execute_vect_slp (void) { vect_location = find_bb_location (bb); - if (vect_slp_analyze_bb (bb)) + if ((bb_vinfo = vect_slp_analyze_bb (bb))) { spurious change? bb_vinfo seems to be unused. +typedef struct _dataref_aux { + tree base_decl; + bool base_misaligned; + int misalignment; +} dataref_aux; we no longer need that typedef stuff with C++ ... + gcc_assert (dr->aux); + ((dataref_aux *)dr->aux)->base_decl = base; + ((dataref_aux *)dr->aux)->base_misaligned = true; dereferencing dr->aux will trap, so no need to assert (dr->aux). Please add a comment before this code explaining what this is for. -vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, - slp_tree slp_node) +vectorizable_store_1 (gimple stmt, stmt_vec_info stmt_info, + struct data_reference *dr, + gimple_stmt_iterator *gsi, gimple *vec_stmt, + slp_tree slp_node) ... +/* A wrapper function to vectorizable_store. */ + +static bool +vectorizable_store (gimple stmt, gimple_stmt_iterator *gsi, gimple *vec_stmt, + slp_tree slp_node) +{ it shouldn't be necessary to split the functions, after if (!vec_stmt) /* transformation not required. */ { STMT_VINFO_TYPE (stmt_info) = store_vec_info_type; vect_model_store_cost (stmt_info, ncopies, store_lanes_p, dt, NULL, NULL, NULL); return true; } /** Transform. **/ the function is no longer allowed to fail, so at this point you can call ensure_base_align. Similar for the load case. Ok with those minor cases fixed. Thanks, Richard. > thanks, > > David > >> >> Richard. >> >>> thanks, >>> >>> David