On Wed, 25 Apr 2012, Aldy Hernandez wrote:

> On 04/25/12 06:45, Richard Guenther wrote:
> > On Tue, Apr 24, 2012 at 7:43 PM, Aldy Hernandez<al...@redhat.com> wrote:
> > > On 04/13/12 03:46, Richard Guenther wrote:
> > > >
> > > > On Fri, Apr 13, 2012 at 12:11 AM, Aldy Hernandez<al...@redhat.com>
> > > > wrote:
> >
> > +  /* ?? Perhaps we should cache this somewhere in the BB, or are
> > +     multiple levels of empty BB's common. ?? */
> > +  FOR_EACH_EDGE (e, ei, bb->preds)
> > +    {
> > +      int res = bb_in_transaction_1 (e->src);
> > +      if (res != -1)
> > +        return (bool) res;
> > +    }
> > +  return false;
> >
> > that's weird code - if the predecessors are not all in (or all not in)
> > a transaction you return a random value.  So it seems this is somewhat
> > fragile.
> >
> > If bb status can be derived from looking at a single stmt, why is the
> > transaction-ness not recorded as a basic-block flag then?  Thus,
> > why do we have a bit in gimple stmts?
>
> How about we get rid of the GIMPLE bit altogether and keep it in the BB
> flags like you mention?  See patch.

Yeah, that looks good.
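With the bit gone the check also becomes trivial - something like this
(a sketch only; I'm assuming the patch names the flag BB_IN_TRANSACTION):

  /* Sketch: transaction-ness as a basic-block flag, no predecessor
     walk and no per-stmt bit to keep in sync.  */
  static bool
  bb_in_transaction (basic_block bb)
  {
    return (bb->flags & BB_IN_TRANSACTION) != 0;
  }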
> > +  bool single_exit_p = single_pred_p (ex->dest);
> >
> > that's a strange variable name ;)
>
> How about loop_has_only_one_exit? :)

Fine ;)

> > +/* Helper function for execute_sm.  On every path that sets REF, set
> > +   an appropriate flag indicating the store.  */
> > +
> > +static tree
> > +execute_sm_if_changed_flag_set (struct loop *loop, mem_ref_p ref)
> > +{
> > ...
> > +  FOR_EACH_VEC_ELT (mem_ref_loc_p, locs, i, loc)
> > +    {
> > +      gimple_stmt_iterator gsi;
> > +      gimple stmt;
> > +
> > +      gsi = gsi_start_bb (gimple_bb (loc->stmt));
> > +      for (; !gsi_end_p (gsi); gsi_next (&gsi))
> > +        if (gsi_stmt (gsi) == loc->stmt)
> >
> > does not seem to do it on every path but on every REF setter.  And
> > instead of the innermost loop just use gsi_for_stmt (loc->stmt).
>
> Updated comment.  Fixed.

> > +  /* Emit the load code into the latch, so that we are sure it will
> > +     be processed after all dependencies.  */
> > +  latch_edge = loop_latch_edge (loop);
> > +  load = gimple_build_assign (mem_ssa, unshare_expr (ref->mem));
> >      lim_data = init_lim_data (load);
> >      lim_data->max_loop = loop;
> >      lim_data->tgt_loop = loop;
> > +  gsi_insert_on_edge (latch_edge, load);
> > +  load = gimple_build_assign (tmp_var, mem_ssa);
> > +  lim_data = init_lim_data (load);
> > +  lim_data->max_loop = loop;
> > +  lim_data->tgt_loop = loop;
> > +  gsi_insert_on_edge (latch_edge, load);
> >
> > the 2nd assign is no longer a "load", so I suppose you don't need any
> > lim_data for it?
>
> Re-did all this.  See patch.

> > +  else if (store == STORE_IF_CHANGED_FLAG)
> > +    execute_sm_if_changed (ex, ref->mem, mem_ssa, tmp_var,
> > +                           store_flag);
> >
> > and this can pass NULL instead of mem_ssa?
>
> Got rid of mem_ssa altogether, as well as the if-changed approach, which
> proved inferior to the flag-based approach.

Good.

> > Overall this looks reasonable.  With also handling trapping things I
> > meant that we should be able to apply store-motion to
> >
> > int a[256];
> > void foo (int *p)
> > {
> >   int i;
> >   for (i = 0; i < 256; i++)
> >     if (flag)
> >       *p = a[i];
> > }
> >
> > but we cannot, because in the transform
> >
> >   lsm = *p;
> >   for (i = 0; i < 256; ++i)
> >     if (flag)
> >       lsm = a[i];
> >   *p = lsm;
> >
> > even when the store is properly conditional, the load is not (and you
> > do not change that).  The unconditional load may trap here.  What I
> > meant to say is that, if we use the flag-based approach, we do not need
> > the load at all unless there is also a load involved inside the loop.
> > If there is also a load inside the loop we'd need to perform the load
> > there (or not handle this case).
>
> Ahh!  Yes, this sounds very profitable.  Would you mind me doing this in
> a separate patch?  I am already handling loads in this incarnation, so
> this should be straightforward.

No, it's fine to do it separately.
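For the record, the flag-based result for the example above would look
roughly like this (a sketch with made-up local names, not the exact
output of the patch):

  /* Store motion with a flag: no load of *p at all, since nothing in
     the loop reads *p.  The store on the exit happens only if the
     original store executed, so no new traps are introduced.  */
  lsm_flag = false;
  for (i = 0; i < 256; ++i)
    if (flag)
      {
        lsm = a[i];
        lsm_flag = true;
      }
  if (lsm_flag)
    *p = lsm;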
> Speaking of loads, I am keeping the information as an additional bitmap
> in `memory_accesses', as ->refs_in_loop was set for stores as well, so I
> couldn't depend on it.  Let me know if you have another idea.

Hmm, refs_in_loop & ~all_refs_stored_in_loop, so instead of

+  bitmap reads = VEC_index (bitmap, memory_accesses.reads_in_loop,
+                            loop->num);
+  ref_read_in_loop_p = bitmap_bit_p (reads, ref->id);

  ref_read_in_loop_p
    = bitmap_bit_p (refs, ref->id) && !bitmap_bit_p (stores, ref->id);

?  But maybe that doesn't work if a ref is both read and stored to.

Btw, rather than adding a bitmap to memory_accesses I'd rather add a
mark_ref_loaded corresponding to mark_ref_stored (or rather merge both
into one) and a bitmap to struct mem_ref.

> Mildly tested.  Wadaya think?

Looks good overall, minus the detail of remembering refs loaded.

Richard.

> Thanks again.
> Aldy

--
Richard Guenther <rguent...@suse.de>
SUSE / SUSE Labs
SUSE LINUX Products GmbH - Nuernberg - AG Nuernberg - HRB 16746
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer