On Mon, Apr 28, 2014 at 4:47 PM, Matt Turner <matts...@gmail.com> wrote:
> On Fri, Apr 25, 2014 at 12:52 PM, Ilia Mirkin <imir...@alum.mit.edu> wrote:
>> And then you have a peephole pass that looks for this and converts it
>> into a single instruction. Additionally this has the advantage of
>> working on code where people manually implemented uaddCarry (although
>> there are other ways to implement it, and this would only detect one
>> of them).
>
> CMP on i965 returns 0 or 0xFFFFFFFF rather than 0 or 1. It also
> unconditionally writes the flag register, which won't be used, and
> will only cause scheduling conflicts.
I actually noticed the 0xFFFFFFFF on my own as well -- the USLT & co. TGSI
instructions are specified to return that as well. Very sad.

> I think I'd prefer you do a carry -> SLT lowering pass until I have a
> peephole pass in place. But I'm also not really concerned about
> performance from this at the moment either...

Well, given the fact that USLT doesn't _actually_ do what I want (but
sooooo close!), I may abandon my desire to implement this in a
different way. I'm going to investigate it a bit more before I do
anything on that front though. This isn't a particularly critical
operation.

Perhaps the lowering pass in GLSL is actually a good way to go,
especially since nvc0 doesn't have more dedicated instructions for it.
And radeon/whoever cares can do something more optimized (similar to
how the 4-offset TG4 was handled). (Or perhaps it's simple enough to
just be detected via a peephole pass in the radeon compiler; I'm not at
all familiar with the structure there.)

  -ilia
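For reference, the add + compare form of uaddCarry discussed above can be sketched in plain C as follows. This is only an illustration and not code from Mesa or from any driver: uslt_mask(), uadd_carry_lowered() and uadd_carry_selfcheck() are hypothetical names, and uslt_mask() merely assumes the "0 or 0xFFFFFFFF" result convention described in this thread. The point is that the compare result needs one extra step (an AND with 1, or a negate) before it can serve as the 0/1 carry.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a USLT/CMP-style unsigned compare:
 * yields all ones when a < b and 0 otherwise (assumed convention). */
static uint32_t uslt_mask(uint32_t a, uint32_t b)
{
    return (a < b) ? 0xFFFFFFFFu : 0u;
}

/* uaddCarry expressed as add + compare: the 32-bit sum wrapped around
 * exactly when it ends up smaller than one of the operands. */
static uint32_t uadd_carry_lowered(uint32_t x, uint32_t y, uint32_t *carry)
{
    uint32_t sum = x + y;              /* unsigned add wraps modulo 2^32 */
    uint32_t mask = uslt_mask(sum, x); /* 0xFFFFFFFF on overflow, else 0 */

    *carry = mask & 1u;                /* convert the mask to 0 or 1 */
    return sum;
}

/* Small usage example: one overflowing and one non-overflowing add. */
static void uadd_carry_selfcheck(void)
{
    uint32_t carry;

    assert(uadd_carry_lowered(0xFFFFFFFFu, 1u, &carry) == 0u);
    assert(carry == 1u);

    assert(uadd_carry_lowered(1u, 2u, &carry) == 3u);
    assert(carry == 0u);
}
```

Writing the carry as "0u - mask" instead of "mask & 1u" gives the same 0/1 value; which of the two is cheaper on a given backend is not something this sketch tries to answer.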