On 04/01/2016 03:44 AM, Richard Biener wrote:
On Fri, 1 Apr 2016, Jakub Jelinek wrote:

On Fri, Apr 01, 2016 at 11:08:09AM +0200, Richard Biener wrote:

RTL DSE uses true_dependence to see whether a store may be killed by
another store - that's obviously broken.  The following patch makes
it use output_dependence instead (introducing a canon_ variant of that).

I think it would be interesting to see some stats on what effect this
has on the optimization RTL DSE is doing (say, gather during an
unpatched bootstrap/regtest the number of successfully optimized
replace_read calls, and the same with a patched bootstrap/regtest).
From a quick look at the patch, this wouldn't optimize even the cases that
could be optimized at the RTL level (e.g. return *pi;).  If the statistics
show this affects things significantly, perhaps we could do both
canon_true_dependence and canon_output_dependence, and if the two calls
differ, not clear the rhs, but mark it somehow and then in replace_read
check what alias set is used for the read, or something similar?

Well, I don't believe it is DSE's job to do CSE.  And I don't see how
we can efficiently do what you suggest - it seems DSE doesn't check
all possible aliases when CSEing.
IIRC, there's nowhere else we're making this kind of transformation, and it fits fairly naturally into the formulations I've looked at in the past.

It may also be the case that this was carried forward from the old DSE code to prevent regressions. I'd have to do a lot more digging in the archives to know for sure -- what's in dse.c doesn't look anything like the old DSE implementation I was familiar with.

I've often speculated that all this stuff should be rewritten using the scheduler's dependency framework. Essentially, when a store gets scheduled, we ought to be able to identify the potential set of dead stores, redundant loads, and potentially aliasing loads/stores as the store's dependencies are resolved.

I wouldn't suggest doing it as part of the scheduling pass, but instead as a separate pass that utilizes all the scheduler's dependency analysis, queue management, etc.

Any regressions (in terms of stores not eliminated or loads not CSE'd) would likely be due to inaccurate (overly conservative) dataflow built by the scheduler, and fixing those would help both the DSE bits and scheduling in general.


I'd first look to fix the CSE-ish transformation, but if that proves difficult or expensive, then we'd be looking at removal. As part of the removal we should extract some testcases where it correctly fired and open a regression bug with those testcases.

Jeff
