On Thu, Oct 14, 2021 at 2:46 PM Andrew MacLeod <amacl...@redhat.com> wrote: > > On 10/14/21 4:27 AM, Richard Biener wrote: > > On Wed, Oct 13, 2021 at 10:58 PM Andrew MacLeod <amacl...@redhat.com> wrote: > >> As work has progressed, we're pretty close to being able to functionally > >> replace VRP with another EVRP pass. At least it seems close enough that > >> we should discuss if thats something we might want to consider for this > >> release. Replacing just one of the 2 VRP passes is another option. > >> > >> First, lets examine simplifications/folds. > >> > >> Running over my set of 380 GCC source files, we see the following > >> results for number of cases we currently get: > >> > >> Number of EVRP cases : 5789 > >> Number of VRP1 cases : 4913 > >> Number of VRP2 cases : 279 > >> combined VRP1/2: 5192 > >> > >> The 2 passes of VRP get a total of 5192 cases. > >> > >> If we run EVRP instead of each VRP pass, we get the following results: > >> > >> Number of EVRP1 cases : 5789 > >> Number of EVRP2 cases : 7521 > >> Number of EVRP3 cases : 2240 > >> combined EVRP2/3: 9761 > >> > >> so the EVRP passes find an additional 4569 opportunities, or 88% more. > >> This initially surprised me until it occurred to me that this is > >> probably due to the pruning require for ASSERT_EXPRs, which means it > >> never had a chance to see some of those cases. Notice how the second > >> pass appears far more effective now. > >> > >> Regarding what we would miss if we took VRP out, if we run EVRP passes > >> first then a VRP pass immediately after, we see what VRP finds that EVRP > >> cannot: > >> > >> Number of EVRP2 cases : 7521 > >> Number of VRP1 cases : 11 > >> Number of EVRP3 cases : 2269 > >> Number of VRP2 cases : 54 > >> > >> I have looked at some of these, and so far they all appear to be cases > >> which are solved via the iteration model VRP uses. > > Most should be now handled by using SCEV analysis on PHIs rather > > than VRP iteration so can you share an example where the iteration helps? > > I didn't delve too deeply since I was skimming for whats missing, and > they all seemed to be in very large programs with cyclic phis feeding > each other. > > I'll pick one shortly and do a deeper dive. > > > > > >> regardless, missing > >> 65 cases and getting 4569 new ones would seem to be a win. I will > >> continue to investigate them. > >> > >> == Performance == > >> > >> The threading work has been pulled out of VRP, so we get a better idea > >> of what VRPs time really is. we're showing about a 58% slowdown in VRP > >> over the 2 passes. I've begun investigating because it shouldn't be off > >> by that much, Im seeing a lot of excess time being wasted with callback > >> queries from the substitute_and_fold engine when processing PHIs. It > >> should be possible to bring this all back in line as that isnt the model > >> ranger should be using anyway. > >> > >> I figured while I'm looking into the performance side of it, maybe we > >> should start talking about whether we want to replace one or both of the > >> VRP passes with an EVRP instance. > >> > >> > >> I see 3 primary options: > >> 1 - replace both VRP passes with EVRP instances. > >> 2 - replace VRP2 with EVRP2 > >> 3 - Replace neither, leave it as is. > >> > >> I figure since the second pass of VRP doesn't get a lot to start with, > >> it probably doesn't make sense to replace VRP1 and not VRP2. > >> > >> Option 1 is what I would expect the strategic move next release to be, > >> it seems ready now, its just a matter of whether we want to give it more > >> time. It would also be trivial to turn VRP back on for one for both > >> later in the cycle if we determines there was something important missing. > >> > >> option 2 is something we ought to really consider if we don't want to do > >> option 1. There are a few PRs that are starting to open that have VRP > >> not getting something due to the whims of more precise mutli-ranges > >> being converted back to a value_range, and replacing VRP2 would allow us > >> to catch those.. plus, we pick up a lot more than VRP2 does. > >> > >> I would propose we add a param, similar to what EVRP has which will > >> allow us to choose which pass is called for VRP1 and VRP2 and set our > >> defaults appropriately. I wouldn't work with a hybrid like we did with > >> EVRP... just choose which pass runs. And we'll have to adjust some > >> testcases based one whatever our default is. > >> > >> Thoughts? > >> > >> Personally I think we give option 1 a go and if something shows up over > >> the next couple of months, or we cant get performance in line with > >> where we want it, then we can switch back to VRP for one or both > >> passes. I wouldn't expect either, but one never knows :-) > >> > >> If that isn't palatable for everyone, then I'd suggest option 2 > > How far are you with handling the symbolic range cases VRP gets > > with the relation work? The symbolic range handling is important > > for Ada IIRC. I would of course have hoped the testsuite would > > catch important regressions there (did you test Ada?) > > I always test ada, objc.. anything that can be built as part of my > normal build :-) so yes. > > As far as I am aware, we are not missing any symbolic/relational cases, > that has all been functioning for a couple of months. > > > > I think option 2 would be safest at least so any important > > iteration/symbolic > > work is done early. > > > Certainly works as a good first step.
So let's pull the trigger on that one? Richard. > Andrew >