On Wed, Jun 25, 2008 at 11:17:45AM -0500, Scott Wood wrote:
> Gabriel Paubert wrote:
> >On Wed, Jun 25, 2008 at 10:34:32AM -0500, Scott Wood wrote:
> >>Kumar Gala wrote:
> >>>>+/* Macros to workout the correct index for the FPR in the thread
> >>>>struct */
> >>>>+#define FPRNUMBER(i) (((i) - PT_FPR0) >> 1)
> >>>>+#define FPRHALF(i) (((i) - PT_FPR0) % 2)
> >>>Have you looked at what the compiler spits out here to make sure we
> >>>aren't getting a divide? Seems like we could use '& 0x1'.
> >>GCC's not *that* dumb. However, you may get some unnecessary
> >>sign-twiddling if "i" is signed.
> >
> >Not for modulo 2, it's only an even/odd choice and GCC
> >implements that efficiently IIRC. For other powers of 2,
> >making the left hand side unsigned helps the compiler.
>
> From this:
>
> int foo(int x)
> {
>         return x % 2;
> }
>
> I get this with -O3:
>
> foo:
>         mr 0,3
>         srawi 3,3,1
>         addze 3,3
>         slwi 3,3,1
>         subf 3,3,0
>         blr
>         .size foo, .-foo
>         .ident "GCC: (GNU) 4.1.2"
>
Indeed. Signed modulo results can be negative...

There are probably better ways to implement this case on PPC,
for example:

        rlwinm tmp,input,4,27,28  ; make shift amount from LSB and MSB
        lis result,0xff01
        srw result,result,tmp     ; result is now 0x00 for even, 0x01 for odd positive,
                                  ; and 0xff for odd negative
        extsb result,result

No carry, shorter dependency length (although srw may be slow
on Cell it seems, but addze may be worse).

> Changing it to "x & 1", or to unsigned, gives this:
>
> foo:
>         rlwinm 3,3,0,31,31
>         blr
>         .size foo, .-foo
>         .ident "GCC: (GNU) 4.1.2"
>
> Maybe newer GCCs are better?

Nope, but unsigned is often better for the right shift.

        Gabriel
_______________________________________________
Linuxppc-dev mailing list
Linuxppc-dev@ozlabs.org
https://ozlabs.org/mailman/listinfo/linuxppc-dev
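For anyone who wants to check the three variants from this thread side by side, here is a rough C sketch. The function names (mod2_signed, mod2_mask, mod2_table) are made up for illustration and are not from the patch. mod2_table is only my C rendering of the rlwinm/lis/srw/extsb sequence above, assuming 32-bit two's complement ints; it is not what GCC emits.

```c
#include <stdint.h>

/* Signed remainder: C99 division truncates toward zero, so the
 * result is -1, 0 or 1. This is the form GCC expands with
 * srawi/addze. */
static int mod2_signed(int32_t x)
{
    return x % 2;
}

/* Mask form: always 0 or 1, so it only matches x % 2 when x is
 * non-negative (or when the caller just needs even/odd). */
static int mod2_mask(int32_t x)
{
    return x & 1;
}

/* C rendering of the "table in a register" idea (illustrative).
 * The shift amount is 16 * LSB + 8 * MSB, which selects one byte
 * of the constant 0xff010000; that byte is then sign-extended. */
static int mod2_table(int32_t x)
{
    uint32_t u = (uint32_t)x;
    uint32_t shift = ((u & 1u) << 4) | ((u >> 31) << 3);
    uint32_t byte = (0xff010000u >> shift) & 0xffu;

    /* Portable sign extension of the low byte (what extsb does). */
    return (int)byte - (int)((byte & 0x80u) << 1);
}
```

With these definitions, mod2_table(x) should agree with mod2_signed(x) for every 32-bit x, while mod2_mask(x) differs exactly for odd negative inputs. That is why the compiler cannot substitute the single rlwinm unless the operand is unsigned.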