On Fri, Apr 25, 2014 at 3:36 PM, Matt Turner <matts...@gmail.com> wrote:
> On Fri, Apr 25, 2014 at 10:41 AM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> This is enough to catch up to core mesa, with the exception of
>> uaddCarry/usubBorrow -- those will require some thought. I don't like the way
>> they were done in core mesa, so I may redo it differently. (Will start a
>> discussion on that topic after I've given it more thought.)
>
> I'm not sure you have all of the context. GLSL IR is pure, meaning
> expressions have no side effects. uaddCarry and usubBorrow have side
> effects.
>
> Short of implementing an intrinsic system or something, which seems
> like massive overkill for these built-ins, you have to split them.
>
> Everyone I talked to about it at the office thought it was a really
> elegant solution, especially given that some other hardware implements
> instructions for the split pieces and that a peephole could recombine
> them for hardware that has a combined instruction. I didn't just
> implement the first thing that came to mind.

Sorry, I didn't mean for that to come off as implying that the current
design was in any way bad or stupid. What you've implemented is
perfectly logical and reasonable. However, it makes my life a little
more difficult, and I believe there's a similarly clean and elegant
solution that also has the advantage of not making my life more
difficult.

Basically, instead of the GLSL IR for uaddCarry being emitted as

  a = carry(x, y)
  b = uadd(x, y)

perhaps it can be emitted as

  b = uadd(x, y)
  a = (x > b)

And then you have a peephole pass that looks for this and converts it
into a single instruction. Additionally, this has the advantage of
working on code where people manually implemented uaddCarry (although
there are other ways to implement it, and this would only detect one
of them).

The problem with the current way is that (a) I'd have to add yet more
TGSI instructions to pipe this through (not my favourite activity),
and (b) I would need to do the lowering and hope CSE takes care of the
duplicate ADD instruction (nvc0 doesn't really have an ADDC... there's
a flag you can set when doing the ADD, but I guess I'll need to test
out when it gets set).

With the alternative above, there's no need for new instructions, and
everything Just Works (tm). And those ISAs that actually have an ADDC
would need a peephole (or similar) pass to make use of it anyway, so
they'll just detect the alternate instruction sequence.

  -ilia
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
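
For reference, the compare-based sequence proposed above (b = uadd(x, y), then a = (x > b)) can be sketched in plain C. This is only an illustration of the arithmetic, not Mesa or GLSL IR code: the names uadd_carry_cmp, uadd_carry_ref and check_uadd_carry are hypothetical, and 32-bit unsigned operands are assumed. The idea is that an unsigned add wraps modulo 2^32, so the sum ends up smaller than x exactly when a carry occurred.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): compare-based form.
 * Returns x + y modulo 2^32 and stores the carry-out (0 or 1) in *carry. */
static uint32_t uadd_carry_cmp(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t b = x + y;   /* b = uadd(x, y); unsigned add wraps on overflow */
    *carry = (x > b);     /* a = (x > b); true only if the add wrapped */
    return b;
}

/* Hypothetical reference: the same result computed with a 64-bit add,
 * where the carry is simply bit 32 of the wide sum. */
static uint32_t uadd_carry_ref(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint64_t wide = (uint64_t)x + (uint64_t)y;
    *carry = (uint32_t)(wide >> 32);
    return (uint32_t)wide;
}

/* Checks that both forms agree for one pair of operands. */
static void check_uadd_carry(uint32_t x, uint32_t y)
{
    uint32_t c_cmp, c_ref;
    uint32_t s_cmp = uadd_carry_cmp(x, y, &c_cmp);
    uint32_t s_ref = uadd_carry_ref(x, y, &c_ref);
    assert(s_cmp == s_ref);
    assert(c_cmp == c_ref);
}
```

Calling check_uadd_carry on boundary values such as (0xFFFFFFFF, 1) or (0xFFFFFFFF, 0xFFFFFFFF) exercises the wrapping case. Whether a given backend's peephole pass can match this pattern is a separate question that the sketch does not address.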