On Fri, 1 Apr 2016, Jakub Jelinek wrote: > On Fri, Apr 01, 2016 at 11:08:09AM +0200, Richard Biener wrote: > > > > RTL DSE uses true_dependence to see whether a store may be killed by > > anothe store - that's obviously broken. The following patch makes > > it use output_dependence instead (introducing a canon_ variant of that). > > I think it would be interesting to see some stats on what effect does this > have on the optimization RTL DSE is doing (say gather during > unpatched bootstrap/regtest number of successfully optimized replace_read > calls, and the same with patched bootstrap/regtest). > From quick look at the patch, this wouldn't optimize even the cases that > could be optimized (return *pi;) at the RTL level. If the statistics would > show this affects it significantly, perhaps we could do both > canon_true_dependence and canon_output_dependence, and if the two calls > differ, don't clear the rhs, but mark it somehow and then in replace_read > check what alias set is used for the read or something similar?
Well, I don't believe it is DSEs job to do CSE. And I don't see how we can efficiently do what you suggest - it seems DSE doesn't check all possible aliases when CSEing. But of course I didn't try to understand how it works and thus happily defer to somebody else coming up with a better fix. Maybe the correct fix is to while (i_ptr) { bool remove = false; store_info *store_info = i_ptr->store_rec; /* Skip the clobbers. */ while (!store_info->is_set) store_info = store_info->next; ... /* If this read is just reading back something that we just stored, rewrite the read. */ else { if (store_info->rhs && offset >= store_info->begin && offset + width <= store_info->end && all_positions_needed_p (store_info, offset - store_info->begin, width) && replace_read (store_info, i_ptr, read_info, insn_info, loc, bb_info->regs_live)) return; /* The bases are the same, just see if the offsets overlap. */ if ((offset < store_info->end) && (offset + width > store_info->begin)) remove = true; which lacks sth like a canon_true_dependence check so that if replace_read fails it doesn't try again with other stores it finds (hopefully the store_info->next list is "properly" ordered - you'd want to walk backwards starting from the reads - I don't think this is what the code above does). So it really seems to be that the code I fixed (quoted again below) is supposed to ensure the validity of the transform as the comment suggests. else if (s_info->rhs) /* Need to see if it is possible for this store to overwrite the value of store_info. If it is, set the rhs to NULL to keep it from being used to remove a load. */ { if (canon_output_dependence (s_info->mem, true, mem, GET_MODE (mem), mem_addr)) { s_info->rhs = NULL; s_info->const_rhs = NULL; } As said, I don't believe in a DSE algorithm doing CSE, so ... Richard.