On Fri, 1 Apr 2016, Jakub Jelinek wrote:

> On Fri, Apr 01, 2016 at 11:08:09AM +0200, Richard Biener wrote:
> > 
> > RTL DSE uses true_dependence to see whether a store may be killed by
> > anothe store - that's obviously broken.  The following patch makes
> > it use output_dependence instead (introducing a canon_ variant of that).
> 
> I think it would be interesting to see some stats on what effect does this
> have on the optimization RTL DSE is doing (say gather during
> unpatched bootstrap/regtest number of successfully optimized replace_read
> calls, and the same with patched bootstrap/regtest).
> From quick look at the patch, this wouldn't optimize even the cases that
> could be optimized (return *pi;) at the RTL level.  If the statistics would
> show this affects it significantly, perhaps we could do both
> canon_true_dependence and canon_output_dependence, and if the two calls
> differ, don't clear the rhs, but mark it somehow and then in replace_read
> check what alias set is used for the read or something similar?

Well, I don't believe it is DSEs job to do CSE.  And I don't see how
we can efficiently do what you suggest - it seems DSE doesn't check
all possible aliases when CSEing.

But of course I didn't try to understand how it works and thus happily
defer to somebody else coming up with a better fix.  Maybe the correct
fix is to

      while (i_ptr)
        {
          bool remove = false;
          store_info *store_info = i_ptr->store_rec;

          /* Skip the clobbers.  */
          while (!store_info->is_set)
            store_info = store_info->next;
...

              /* If this read is just reading back something that we just
                 stored, rewrite the read.  */
              else
                {
                  if (store_info->rhs
                      && offset >= store_info->begin
                      && offset + width <= store_info->end
                      && all_positions_needed_p (store_info,
                                                 offset - 
store_info->begin,
                                                 width)
                      && replace_read (store_info, i_ptr, read_info,
                                       insn_info, loc, 
bb_info->regs_live))
                    return;

                  /* The bases are the same, just see if the offsets
                     overlap.  */
                  if ((offset < store_info->end)
                      && (offset + width > store_info->begin))
                    remove = true;

which lacks sth like a canon_true_dependence check so that if
replace_read fails it doesn't try again with other stores it finds
(hopefully the store_info->next list is "properly" ordered - you'd
want to walk backwards starting from the reads - I don't think this
is what the code above does).

So it really seems to be that the code I fixed (quoted again below)
is supposed to ensure the validity of the transform as the comment
suggests.

      else if (s_info->rhs)
        /* Need to see if it is possible for this store to overwrite
           the value of store_info.  If it is, set the rhs to NULL to
           keep it from being used to remove a load.  */
        {
          if (canon_output_dependence (s_info->mem, true,
                                       mem, GET_MODE (mem),
                                       mem_addr))
            {
              s_info->rhs = NULL;
              s_info->const_rhs = NULL;
            }

As said, I don't believe in a DSE algorithm doing CSE, so ...

Richard.

Reply via email to