Thanks for the lengthy response :) 8. apr. 2015 01.52 skrev "Connor Abbott" <cwabbo...@gmail.com>: > > Hi Thomas, > > Thanks for submitting a proposal! Some comments/answers below. > > On Tue, Apr 7, 2015 at 3:34 PM, Thomas Helland > <thomashellan...@gmail.com> wrote: > > Hi, > > > > For those that don't know I've submitted a proposal for this years GSoC. > > I've proposed to implement value range propagation and loop unrolling in > > NIR. > > Since I'm no expert on compilers I've read up on some litterature: > > > > I started with "Constant propagation with conditional branches" (thanks > > Connor). > > This paper describes an algorithm, "sparse conditional constant > > propagation", > > that seems to be the defacto standard in compilers today. > > > > I also found the paper; > > "Accurate static branch prediction by value range propagation " (VRP). > > This describes a value range propagation implementation based on SCCP. > > (This also allows one to set heuristics to calculate educated guesses for > > the > > probability of a certain branch, but that's probably more than we're > > interested in.) > > Thanks for mentioning that... I had forgotten the name of that paper. > You're right in that the branch probability stuff isn't too useful for > us. Also, it raises an important issue about back-edges from phi > nodes; they present a more sophisticated method to handle it, but I > think that for now we can just force back edges to have an infinite > range unless they're constant. > > > > > There is also a GCC paper (with whatever licensing issues that may apply); > > "A propagation engine for GCC". > > They have a shared engine for doing all propagation passes. > > It handles the worklists, and the logic to traverse these. > > The implementing passes then supply callbacks to define the lattice rules. > > They reply back if the instruction was interesting or not, > > and the propagation engine basically handles the rest. > > > > Maybe that's an interesting solution? Or it might not be worth the hassle? > > We already have copy propagation, and with value range propagation > > we probably don't want separate constant propagation? > > (I'm hoping to write the pass so that it handles both constants and value > > ranges.) > > Yes, copy propagation probably won't be so useful once we have value > range propagation; the former is a special case of the latter. Note > that we have a nifty way of actually doing the constant folding > (nir_constant_expressions.py and nir_constant_expressions.h), which > you should still use if all the inputs are constant. >
I'll have a look at nir_constant_expressions to see what it's about. I was thinking of making a separate struct with some flags and the range of the assignment. I thought of having an is_constant flag in there to keep control of if the value is constant. Or do we have a metadata system for sticking useful information in that gets shared between all passes? > > The GCC guys have used this engine to get copy propagation that propagates > > copies accross conditionals, maybe this makes such a solution more > > interesting? > > I'm not so sure how useful such a general framework will be. Constant > propagation that handles back-edges seems interesting, but I'm not > sure it's worth the time to implement something this general as a > first pass. > I agree with you on this one. If there was a potential for a lot of code sharing it would be more appealing. But if we drop having a separate constant propagation pass then I see only VRP and copy prop. as potential users. That's not a large user base for a generalized framework. > > > > Connor: I just remembered you saying something about your freedesktop > > git repo, so I poked around some and found that you have already done > > some work on VRP based on SCCP? How far did you get? > > I started on it, but then I realized that the approach I was using was > too cumbersome/complicated so I don't think what I have is too useful. > Feel free to work on it yourself, although Jason and I have discussed > it so we have some ideas of how to do it. I've written a few notes on > this below that you may find useful. > > - I have a branch I created while working on VRP that you'll probably > find useful: http://cgit.freedesktop.org/~cwabbott0/mesa/log/?h=nir-worklist > . The first two commits are already in master, but the last two should > be useful for implementing SCCP/VRP (although they'll need to be > rebased, obviously). > > - There's a comment in the SCCP paper (5.3, Nodes versus Edges) that > says: "An alternative way of implementing this would be to add nodes > to the > graph and then associate an ExecutableFlag with each node. An > additional node must be inserted between any node that has more than > one immediate successor and any successor node that has more than one > immediate predecessor." This is the method mentioned in the VRP (If I remember correctly). > I think this procedure is what's usually > called "splitting critical edges"; in NIR, thanks to the structured > control flow, there are never any critical edges except for one edge > case you don't really have to care about too much (namely, an infinite > loop with one basic block) and therefore you can just use the basic > block worklist that I added in the branch mentioned above, rather than > a worklist of basic block edges as the paper describes. > > - The reason my pass was becoming so cumbersome was because I was > trying to solve two problems at once. First, there's actually > propagating the ranges. Then, there's taking into account restrictions > on range due to branch predicates. For example, if I have something > like: > > if (x > 0) { > y = max(x, 0); > } > > then since the use of x is dominated by the then-branch of the if, x > has to be greater than 0 and we can optimize it. This is a little > contrived, but we have seen things like: > > if (foo) > break; > > /* ... */ > > if (foo) > break; > > in the wild, where we could get rid of the redundant break using this > analysis by recognizing that since the second condition is dominated > by the else-branch of the first, foo must be false there. I was trying > to handle this by storing multiple lattice entries for the same SSA > value, but it was becoming too messy. Instead, we can solve the first > problem normally, and then to solve the second problem we can create > new SSA values, using the standard SSA construction algorithm, any > time where after a certain point the range of a value is restricted > (namely, the condition of a branch or the index of an array > dereference). In the first example, we would create a new value x2: > I thnk I've seen this solution mentioned in some paper. Maybe the ISRP pass mentioned below? I don't remember. > if (x > 0) { > x2 = x; > y = max(x2, 0); > } > > and the VRP pass will keep track of "special" copies like the one from > x to x2 that add restrictions on the range. After everything is > finished, copy propagation and DCE will clean up the extra copies > created. There's a paper on this somewhere, but I don't quite remember > the name of it. I'm not sure if you'll be able to get to this over the > summer, but I thought I'd explain it in case you were interested. > May this paper be "Interprocedural symbolic range propagation"? Or maybe just " symbolic range propagation"? I haven't looked much at them, but looks like they might be relevant. > > > > If we just want to make an SCCP inspired VRP pass then Connor has work in > > progress. > > Finishing that, and loop unrolling, may not be enough work for GSoC? > > I'm not sure... on the one hand, there's enough here that it may take > the entire summer for you to do. On the other hand, it's possible that > you'll be able to finish it with enough time left over, and there are > plenty of other things you'd be able to do. It depends on several > factors, and no one has a crystal ball here. I'm not experienced > enough with GSoC to be able to give you a recommendation, so it would > be nice for someone more experienced to give their opinion. > That's what I struggle with. While I've done some programming projects before they have been quite small, with a large amount of hardware that needed to be made, so that was the time consuming part and not the programming. Setting a realistic plan is .... hard. > > Or maybe Connor wants to finish it of himself, and I should spend my time > > implementing some other pass instead, alongside loop unrolling? > > > > Realising Connor has partially started on this I thought it was a good > > idea to get some feedback and ideas from others (if I need to change my > > proposal) > > All suggestions, ideas and opinions are more than welcome. > > Fire at will, I'm all ears =) > > > > Regards, > > Thomas > > Thanks, > Connor
_______________________________________________ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev