On Thu, Oct 14, 2021 at 2:46 PM Andrew MacLeod <amacl...@redhat.com> wrote:
>
> On 10/14/21 4:27 AM, Richard Biener wrote:
> > On Wed, Oct 13, 2021 at 10:58 PM Andrew MacLeod <amacl...@redhat.com> wrote:
> >> As work has progressed, we're pretty close to being able to functionally
> >> replace VRP with another EVRP pass.  At least it seems close enough that
> >> we should discuss if thats something we might want to consider for this
> >> release.   Replacing just one of the 2 VRP passes is another option.
> >>
> >> First, lets examine simplifications/folds.
> >>
> >> Running over my set of 380 GCC source files, we see the following
> >> results for number of cases we currently get:
> >>
> >> Number of EVRP cases :       5789
> >> Number of VRP1 cases :       4913
> >> Number of VRP2 cases :       279
> >>                combined VRP1/2:   5192
> >>
> >> The 2 passes of VRP get a total of 5192 cases.
> >>
> >> If we run EVRP instead of each VRP pass, we get the following results:
> >>
> >> Number of EVRP1 cases :       5789
> >> Number of EVRP2 cases :       7521
> >> Number of EVRP3 cases :       2240
> >>            combined EVRP2/3:       9761
> >>
> >> so the EVRP passes find an additional 4569 opportunities,  or 88% more.
> >> This initially surprised me until it occurred to me that this is
> >> probably due to the pruning require for ASSERT_EXPRs, which means it
> >> never had a chance to see some of those cases. Notice how the second
> >> pass appears far more effective now.
> >>
> >> Regarding what we would miss if we took VRP out, if we run  EVRP passes
> >> first then a VRP pass immediately after, we see what VRP finds that EVRP
> >> cannot:
> >>
> >> Number of EVRP2 cases :      7521
> >> Number of VRP1 cases :       11
> >> Number of EVRP3 cases :      2269
> >> Number of VRP2 cases :       54
> >>
> >> I have looked at some of these, and so far they all appear to be cases
> >> which are solved via the iteration model VRP uses.
> > Most should be now handled by using SCEV analysis on PHIs rather
> > than VRP iteration so can you share an example where the iteration helps?
>
> I didn't delve too deeply since I was skimming for whats missing, and
> they all seemed to be in very large programs with cyclic phis feeding
> each other.
>
> I'll pick one shortly and do a deeper dive.
>
>
> >
> >> regardless, missing
> >> 65 cases and getting 4569 new ones would seem to be a win. I will
> >> continue to investigate them.
> >>
> >> == Performance ==
> >>
> >> The threading work has been pulled out of VRP, so we get a better idea
> >> of what VRPs time really is. we're showing about a 58% slowdown in VRP
> >> over the 2 passes.  I've begun investigating because it shouldn't be off
> >> by that much, Im seeing a lot of excess time being wasted with callback
> >> queries from the substitute_and_fold engine when processing PHIs.  It
> >> should be possible to bring this all back in line as that isnt the model
> >> ranger should be using anyway.
> >>
> >> I figured while I'm looking into the performance side of it, maybe we
> >> should start talking about whether we want to replace one or both of the
> >> VRP passes with an EVRP instance.
> >>
> >>
> >> I see 3 primary options:
> >> 1 - replace both VRP passes with EVRP instances.
> >> 2 - replace VRP2 with EVRP2
> >> 3 - Replace neither, leave it as is.
> >>
> >> I figure since the second pass of VRP doesn't get a lot to start with,
> >> it probably doesn't make sense to replace VRP1 and not VRP2.
> >>
> >> Option 1 is what I would expect the strategic move next release to be,
> >> it seems ready now, its just a matter of whether we want to give it more
> >> time.  It would also be trivial to turn VRP back on for one for both
> >> later in the cycle if we determines there was something important missing.
> >>
> >> option 2 is something we ought to really consider if we don't want to do
> >> option 1.  There are a few PRs that are starting to open that have VRP
> >> not getting something due to the whims of more precise mutli-ranges
> >> being converted back to a value_range, and replacing VRP2 would allow us
> >> to catch those..  plus, we pick up a lot more than  VRP2 does.
> >>
> >> I would propose we add a param, similar to what EVRP has which will
> >> allow us to choose which pass is called for VRP1 and VRP2 and set our
> >> defaults appropriately.  I wouldn't work with a hybrid like we did with
> >> EVRP... just choose which pass runs.     And we'll have to adjust some
> >> testcases based one whatever our default is.
> >>
> >> Thoughts?
> >>
> >> Personally I think we give option 1 a go and if something shows up over
> >> the next couple of months, or  we cant get performance in line with
> >> where we want it, then we can switch back to VRP for one or both
> >> passes.  I wouldn't  expect either, but one never knows :-)
> >>
> >> If that isn't palatable for everyone, then I'd suggest option 2
> > How far are you with handling the symbolic range cases VRP gets
> > with the relation work?  The symbolic range handling is important
> > for Ada IIRC.  I would of course have hoped the testsuite would
> > catch important regressions there (did you test Ada?)
>
> I always test ada, objc.. anything that can be built as part of my
> normal build :-)  so yes.
>
> As far as I am aware, we are not missing any symbolic/relational cases,
> that has all been functioning for a couple of months.
>
>
> > I think option 2 would be safest at least so any important 
> > iteration/symbolic
> > work is done early.
> >
> Certainly works as a good first step.

So let's pull the trigger on that one?

Richard.

> Andrew
>

Reply via email to