On 12/17/2009 07:32 AM, malc wrote:
These new opcodes are considered "required" by the backend, because expanding them at the tcg level breaks the basic block. There might be some way to emulate within tcg internals, but that doesn't seem worthwhile, as essentially all hosts have some form of support for these.
...
> c. Historically things like that were made conditional with
> a generic fallback (bswap, neg, not, rot, etc)
I answered this one above. A generic fallback would break the basic block, which would break TCG's simple register allocation.
> b. Documentation for movcond has a typo, t0 is assigned not t1
Oops. Will fix.
> d. Documentation for setcond2 is missing
Ah, I see that brcond2 is missing as well; I'll fix that too.
> It would also be interesting to learn what impact adding those two has
> on performance, any results?
Hmph, not as much as I would have liked. I suppose Intel is getting pretty darned good with its branch prediction. It shaved about 3 minutes off 183.equake from what I posted earlier this week; that's something around a 7% improvement, assuming it's not just all noise (I haven't run that test enough times to see what the variation is).
> +    case TCG_COND_NE:
> +        if (const_arg2) {
> +            if ((uint16_t) arg2 == arg2) {
> +                tcg_out32 (s, XORI | RS (arg1) | RA (0) | arg2);
> +            }
> +            else {
> +                tcg_out_movi (s, TCG_TYPE_I32, 0, arg2);
> +                tcg_out32 (s, XOR | SAB (arg1, 0, 0));
> +            }
> +        }
> +        else {
> +            tcg_out32 (s, XOR | SAB (arg1, 0, arg2));
> +        }
> +
> +        tcg_out32 (s, ADDIC | RT (arg0) | RA (0) | 0xffff);
> +        tcg_out32 (s, SUBFE | TAB (arg0, arg0, 0));
> +        return;
Heh, you know a trick that gcc doesn't for powerpc. It just adds an xor at the end of the EQ sequence.
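For anyone following along, here is a rough C model of why the addic/subfe pair yields (arg1 != arg2). It is only a sketch: the helper names (addic_m1, subfe, clz32, setcond_ne, setcond_ne_via_eq) are made up for illustration and are not part of TCG. The second variant assumes the EQ sequence is the usual count-leading-zeros-and-shift idiom followed by an xor with 1, which nothing above actually spells out.

```c
#include <stdint.h>

/* Model of "addic rt, ra, -1": returns ra + 0xffffffff (i.e. ra - 1)
   and stores the carry-out in *ca.  The carry is 1 exactly when ra != 0. */
static uint32_t addic_m1(uint32_t ra, uint32_t *ca)
{
    uint64_t sum = (uint64_t)ra + 0xffffffffu;
    *ca = (uint32_t)(sum >> 32);
    return (uint32_t)sum;
}

/* Model of "subfe rt, ra, rb": rt = ~ra + rb + CA (mod 2^32). */
static uint32_t subfe(uint32_t ra, uint32_t rb, uint32_t ca)
{
    return ~ra + rb + ca;
}

/* The NE sequence from the patch: xor into r0, then addic/subfe.
   Since ~(r0 - 1) == -r0, the subfe result collapses to the carry bit. */
static uint32_t setcond_ne(uint32_t arg1, uint32_t arg2)
{
    uint32_t ca;
    uint32_t r0 = arg1 ^ arg2;        /* xor   r0, arg1, arg2 */
    uint32_t t = addic_m1(r0, &ca);   /* addic arg0, r0, -1   */
    return subfe(t, r0, ca);          /* subfe arg0, arg0, r0 */
}

/* Portable stand-in for cntlzw; returns 32 for x == 0. */
static uint32_t clz32(uint32_t x)
{
    uint32_t n = 0;
    while (n < 32 && !(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Assumed shape of the alternative: EQ as clz(a ^ b) >> 5, then xor 1. */
static uint32_t setcond_ne_via_eq(uint32_t arg1, uint32_t arg2)
{
    return (clz32(arg1 ^ arg2) >> 5) ^ 1u;
}
```

Both functions return 1 when the inputs differ and 0 when they match; the carry-based version gets there in one instruction fewer after the xor.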
r~