On 4/24/19 4:44 AM, Richard Biener wrote:
> On Tue, Apr 23, 2019 at 4:29 PM Jeff Law <l...@redhat.com> wrote:
>>
>>
>> As discussed in the BZ, this patch addresses the false positive warning
>> by cleaning up the const/copy propagations left in the IL between DOM's
>> jump threading and erroneous path isolation.
>>
>> In the past we'd been handling this stuff with phi only cprop.  To make
>> phi only cprop work in this case we'd have to change it to scan
>> statements within some subset of blocks where it had previously only
>> scanned PHIs.
>>
>> I was concerned about the compile-time cost of the additional scanning
>> plus the extra pass.  So I compared that against using the lattice based
>> const/copy propagation as well as against the RPO VN bits.
>>
>> It turns out that the lattice based const/copy propagation does the
>> least amount of work, closely followed by a limited RPO VN.  The
>> improved "phi only" copy propagator being the worst.
> 
> Interesting.
Definitely.  I didn't dig further than what's mentioned in the BZ.
> 
>> Given that we can use the lattice copy propagator by just adding the
>> pass to passes.def whereas using the RPN VN actually requires a little
>> bit of real code (to set up the entry/exits for the relevant SEME
>> regions), I went with the lattice copy propagator.
>>
>> This change adds around .4% instruction executions to my testbed of .i
>> files.  It has no significant impact on the resulting code -- I see
>> different register allocation decisions in a lot of places which seem to
>> primarily result in reversing arguments to comparisons.
> 
> Was there a need to have two copy-prop passes in the early
> DOM/errorneous-path removal where we previously only had
> a single phi-only-prop pass?  Is the testcase fixed also when
> doing copy-prop only a single time?
The testcase is fixed with a single copyprop (lattice or RPO VN) after
DOM but before erroneous path isolation.   I seriously considered just
dropping the copyprop pass after erroneous path isolation.  I'm pretty
sure it'll regress codegen quality, but it may not be too bad.


> 
> Also the late pass you replace should be right after VRP, not
> after warn_restrict (that was a mistake done when adding the
> warn_restrict pass).
Agreed.  But ISTM we should make that an independent change.

> 
> The main reason I dislike this is that it is an unconditional cleanup
> pass run even when we didn't perform any jump threading.  That's
> of course the same case as with the existing phi-only-prop passes
> but would have been my major argument for the SEME VN
> (where at some cut-off of course doing a single whole-function VN
> is cheaper than doing N SEME VNs, and I can even think of doing
> N SEME regions at once in exchange for doing a whole-function
> RPO order compute).
Yea.  We're certainly doing more work in the cases where we didn't
thread anything.  I've always wanted better ways to indicate what
actions a pass did and using that to bypass subsequent passes if they
weren't going to be profitable.




> 
>> FWIW I also considered delaying the erroneous path isolation pass.  I
>> ultimately decided against that.  My recollection is that its location
>> came from the desire to clean up those paths early enough in the
>> pipeline so that other optimizers could do a better job.
>>
>> We could consider an early/late split here.  Essentially we do the
>> optimization early, leaving enough data lying around somewhere for a
>> late pass to look for and issue the warning.  Something like a
>> __builtin_warning that we leave in the IL, then just before expanding to
>> RTL, we scan the IL once and issue the __builtin_warnings.
>>
>> In this specific case the __builtin_warning would be on a path that we
>> eventually would determine was unreachable at compile time and the path
>> would be removed.  I suspect we could do something similar for other
>> warnings coming out of the middle end.
>>
>> Anyway, this has been bootstrapped and regression tested on x86_64,
>> ppc64, i686, sparc64, & ppc64le.  It's also been bootstrapped on alpha,
>> ppc (32 bit), armeb, m68k, riscv64, mipsisa32r4, arm, sh4, & sh4eb.
>> It's also built and regression tested all the *-elf targets in my tester.
>>
>> OK for the trunk, or do we want to defer to gcc-10?
> 
> I like the pass removal and would say OK if you manage with
> a 1:1 replacement (thus get rid of that extra copy-prop between
> DOM and pass_isolate_erroneous_paths).
That's the one we need to fix the regression.  We might be able to drop
the one after erroneous path isolation which would keep us at the 1:1
replacement.  I'll poke at that.

jeff

Reply via email to