On 9/8/21 3:49 PM, Richard Biener wrote:

On Wed, Sep 8, 2021 at 3:25 PM Aldy Hernandez <al...@redhat.com> wrote:

It would be helpful to have the patch causing the issue to look at the IL.
But as Micha said, there needs to be a perfect loop nest for interchange
to work.

Richard.

Absolutely!  I'm attaching the reduced testcase, as well as the patch.

The problematic thread shows up in the thread2 dump:

Checking profitability of path (backwards):  bb:3 (4 insns) bb:9 (0
insns) bb:5
    Control statement insns: 2
    Overall: 2 insns
    Registering FSM jump thread: (5, 9) incoming edge;  (9, 3)  (3, 8)
nocopy; (3, 8)

Well, so as Micha said, the threading destroys the outer loop by
making it have two entries - we actually _do_ recover from this
but it results in a non-empty latch of the outer loop and thus
the loop is no longer a perfect nest.  The interchange pass has
special-sauce to recover from the loop store motion applied
but not to handle the kind of loop rotation the jump threading
caused.

Understood. The backedge into BB8 and the non-empty latch seem to cause problems.


The forward threader guards against this by simply disallowing
threadings that involve different loops.  As I see

The thread in question (5->9->3) is all within the same outer loop, though. BTW, the backward threader also disallows threading across different loops (see path_crosses_loops variable).

the threading done here should be 7->3 (outer loop entry) to bb 8
rather than one involving the backedge.  Also note the condition
is invariant in the loop and in fact subsumed by the condition
outside of the loop and it should have been simplified by
VRP after pass_ch but I see there's a jump threading pass
inbetween pass_ch and the next VRP which is likely the
problem.

A 7->3->8 thread would cross loops though, because 7 is outside the outer loop.


Why does jump threading not try to simplify a condition before
attempting to thread through it?  After all ranger should be around?

That's a very good question ;-). The current code does not use the ranger to solve unknowns. Anything further back than the first block in a path will return varying. The plan was to add this ranger solving functionality as a follow-up.

I have a whole slew of pending patches adding precisely this functionality. We should be able to solve ranges outside the path with ranger, as well as relationals occurring in a path.

However, even if there are alternate ways of threading this IL, something like 5->9->3 could still happen. We need a way to disallow this. I'm having a hard time determining the hammer for this. I would vote for disabling threading through latches, but it seems the backward threader is aware of this scenario and allows it anyhow (see threaded_through_latch variable). Ughh.

Thanks for looking into this.
Aldy

Reply via email to