On 7/30/21 10:39 AM, Richard Sandiford wrote:
Aldy Hernandez <al...@redhat.com> writes:
On Mon, Jul 26, 2021 at 7:28 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:

Aldy Hernandez <al...@redhat.com> writes:
On Mon, Jul 26, 2021 at 4:18 PM Richard Sandiford
<richard.sandif...@arm.com> wrote:

Aldy Hernandez <al...@redhat.com> writes:
This patch replaces the evrp_range_analyzer in the loop versioning code
with an on-demand ranger.

Everything was pretty straightforward, except that range_of_expr requires
a gimple statement as context to provide context aware ranges.  I didn't see
a convient place where the statement was saved, so I made a vector indexed
by SSA names.  As an alternative, I tried to use the loop's first statement,
but that proved to be insufficient.

The mapping is one-to-many though: there can be multiple statements
for each SSA name.  Maybe that doesn't matter in this context and
any of the statements can act as a representative.

I'm surprised that the loop's first statement didn't work though,
since the SSA name is supposedly known to be loop-invariant.  What went
wrong when you tried that?

I was looking at the first statement of loop_info->block_list and one
of the dg.exp=loop-versioning* tests failed.  Perhaps I should have
used the loop itself, as in the attached patch.  With this patch all
of the loop-versioning tests pass.


I am not familiar with loop versioning, but if the DOM walk was only
necessary for the calls to record_ranges_from_stmt, this too could be
removed as the ranger will work without it.

Yeah, that was the only reason.  If the information is available at
version_for_unity (I guess it is) then we should just avoid recording
the versioning there if so.

How expensive is the check?  If the result is worth caching, perhaps
we should have two bitmaps: the existing one, and one that records
whether we've checked a particular SSA name.

If the check is relatively cheap then that won't be worth it though.

If you're asking about the range_of_expr check, that's all cached, so
it should be pretty cheap.  Besides, we're no longer calculating
ranges for each statement in the IL, as we were doing in lv_dom_walker
with evrp's record_ranges_from_stmt.  Only statements of interest are
queried.

Sounds good.  If the results are already cached then another level
of caching (via the second bitmap I mentioned above) would obviously
be a waste of time.

My callgrind harness for performance testing wasn't able to pick up
enough samples to measure the time spent in
pass_loop_versioning::execute.  I've seen this happen before with
passes that run too fast.  I'm afraid I don't have enough cycles to
continue working on this.

Yeah, any testing of this was above and beyond IMO.  Hearing that the
range query does its own caching was enough for me. :-)

How about this patch, pending tests?

OK, thanks, as a strict improvement over the status quo.  But it'd be
even better without the dom walk :-)

I've removed the DOM walk, and re-tested.

OK to push?

Sorry for asking for another iteration, but…

It looks like this is a bit more involved than I originally envisioned.

I've pushed the original (approved) patch that just removes the use of evrp, which was my main goal.

I'll follow-up with the dom walk removal and your suggested changes next week when I have more cycles.

Aldy

Reply via email to