Hi Thomas, Thanks for submitting a proposal! Some comments/answers below.
On Tue, Apr 7, 2015 at 3:34 PM, Thomas Helland <thomashellan...@gmail.com> wrote: > Hi, > > For those that don't know I've submitted a proposal for this years GSoC. > I've proposed to implement value range propagation and loop unrolling in > NIR. > Since I'm no expert on compilers I've read up on some litterature: > > I started with "Constant propagation with conditional branches" (thanks > Connor). > This paper describes an algorithm, "sparse conditional constant > propagation", > that seems to be the defacto standard in compilers today. > > I also found the paper; > "Accurate static branch prediction by value range propagation " (VRP). > This describes a value range propagation implementation based on SCCP. > (This also allows one to set heuristics to calculate educated guesses for > the > probability of a certain branch, but that's probably more than we're > interested in.) Thanks for mentioning that... I had forgotten the name of that paper. You're right in that the branch probability stuff isn't too useful for us. Also, it raises an important issue about back-edges from phi nodes; they present a more sophisticated method to handle it, but I think that for now we can just force back edges to have an infinite range unless they're constant. > > There is also a GCC paper (with whatever licensing issues that may apply); > "A propagation engine for GCC". > They have a shared engine for doing all propagation passes. > It handles the worklists, and the logic to traverse these. > The implementing passes then supply callbacks to define the lattice rules. > They reply back if the instruction was interesting or not, > and the propagation engine basically handles the rest. > > Maybe that's an interesting solution? Or it might not be worth the hassle? > We already have copy propagation, and with value range propagation > we probably don't want separate constant propagation? > (I'm hoping to write the pass so that it handles both constants and value > ranges.) Yes, copy propagation probably won't be so useful once we have value range propagation; the former is a special case of the latter. Note that we have a nifty way of actually doing the constant folding (nir_constant_expressions.py and nir_constant_expressions.h), which you should still use if all the inputs are constant. > The GCC guys have used this engine to get copy propagation that propagates > copies accross conditionals, maybe this makes such a solution more > interesting? I'm not so sure how useful such a general framework will be. Constant propagation that handles back-edges seems interesting, but I'm not sure it's worth the time to implement something this general as a first pass. > > Connor: I just remembered you saying something about your freedesktop > git repo, so I poked around some and found that you have already done > some work on VRP based on SCCP? How far did you get? I started on it, but then I realized that the approach I was using was too cumbersome/complicated so I don't think what I have is too useful. Feel free to work on it yourself, although Jason and I have discussed it so we have some ideas of how to do it. I've written a few notes on this below that you may find useful. - I have a branch I created while working on VRP that you'll probably find useful: http://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-worklist . The first two commits are already in master, but the last two should be useful for implementing SCCP/VRP (although they'll need to be rebased, obviously). - There's a comment in the SCCP paper (5.3, Nodes versus Edges) that says: "An alternative way of implementing this would be to add nodes to the graph and then associate an ExecutableFlag with each node. An additional node must be inserted between any node that has more than one immediate successor and any successor node that has more than one immediate predecessor." I think this procedure is what's usually called "splitting critical edges"; in NIR, thanks to the structured control flow, there are never any critical edges except for one edge case you don't really have to care about too much (namely, an infinite loop with one basic block) and therefore you can just use the basic block worklist that I added in the branch mentioned above, rather than a worklist of basic block edges as the paper describes. - The reason my pass was becoming so cumbersome was because I was trying to solve two problems at once. First, there's actually propagating the ranges. Then, there's taking into account restrictions on range due to branch predicates. For example, if I have something like: if (x > 0) { y = max(x, 0); } then since the use of x is dominated by the then-branch of the if, x has to be greater than 0 and we can optimize it. This is a little contrived, but we have seen things like: if (foo) break; /* ... */ if (foo) break; in the wild, where we could get rid of the redundant break using this analysis by recognizing that since the second condition is dominated by the else-branch of the first, foo must be false there. I was trying to handle this by storing multiple lattice entries for the same SSA value, but it was becoming too messy. Instead, we can solve the first problem normally, and then to solve the second problem we can create new SSA values, using the standard SSA construction algorithm, any time where after a certain point the range of a value is restricted (namely, the condition of a branch or the index of an array dereference). In the first example, we would create a new value x2: if (x > 0) { x2 = x; y = max(x2, 0); } and the VRP pass will keep track of "special" copies like the one from x to x2 that add restrictions on the range. After everything is finished, copy propagation and DCE will clean up the extra copies created. There's a paper on this somewhere, but I don't quite remember the name of it. I'm not sure if you'll be able to get to this over the summer, but I thought I'd explain it in case you were interested. > > If we just want to make an SCCP inspired VRP pass then Connor has work in > progress. > Finishing that, and loop unrolling, may not be enough work for GSoC? I'm not sure... on the one hand, there's enough here that it may take the entire summer for you to do. On the other hand, it's possible that you'll be able to finish it with enough time left over, and there are plenty of other things you'd be able to do. It depends on several factors, and no one has a crystal ball here. I'm not experienced enough with GSoC to be able to give you a recommendation, so it would be nice for someone more experienced to give their opinion. > Or maybe Connor wants to finish it of himself, and I should spend my time > implementing some other pass instead, alongside loop unrolling? > > Realising Connor has partially started on this I thought it was a good > idea to get some feedback and ideas from others (if I need to change my > proposal) > All suggestions, ideas and opinions are more than welcome. > Fire at will, I'm all ears =) > > Regards, > Thomas Thanks, Connor _______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev