On 14/07/2015 18:45, Aurelien Jarno wrote:
>>> > >
>>> > >   mask = 0x7fffffffffffffffull >> (t1 ^ 63)
>>> > >
>>> > > It's simpler to generate it by doing:
>>> > >
>>> > >   mask = (1 << t1) - 1
>> >
>> > Using ~(-1 << t1) may let you use an ANDN instruction, and is also the
>> > same number of instructions on x86.
>> >
> Indeed thanks for the hint. The generated code has the same size, but is
> one instruction less:
>
>     mov    0x88(%rsp),%r10
>     shlx   %r10,%rbx,%rbx
> -   mov    $0x1,%r11d
> +   mov    $0xffffffffffffffff,%r11
>     shlx   %r10,%r11,%r11
> -   dec    %r11
>     mov    0x18(%r14),%r10
> -   and    %r11,%r10
> +   andn   %r10,%r11,%r10
>     or     %r10,%rbx
>     movslq %ebx,%rbx

Oh, indeed I forgot about the fancy new x86 bit manipulation instructions!
Even better. :)

Paolo
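
For reference, the three mask expressions discussed in the thread can be
checked against each other in plain C. This is only a sketch: the helper
names (mask_shr, mask_dec, mask_not, check_masks) are made up for
illustration and are not taken from the patch, t1 is assumed to lie in the
range 0..63, and unsigned constants are used in place of -1 so that the
left shift is well defined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not from the patch. t1 is assumed to be 0..63. */

/* Original form: t1 ^ 63 equals 63 - t1 for t1 in 0..63. */
static uint64_t mask_shr(unsigned t1)
{
    return 0x7fffffffffffffffull >> (t1 ^ 63);
}

/* Simpler form: shift a single bit up, then decrement. */
static uint64_t mask_dec(unsigned t1)
{
    return (1ull << t1) - 1;
}

/* ANDN-friendly form: unsigned equivalent of ~(-1 << t1). */
static uint64_t mask_not(unsigned t1)
{
    return ~(~0ull << t1);
}

/* Check that all three forms agree for every shift count in 0..63. */
static void check_masks(void)
{
    for (unsigned t1 = 0; t1 < 64; t1++) {
        assert(mask_shr(t1) == mask_dec(t1));
        assert(mask_dec(t1) == mask_not(t1));
    }
}
```

With the last form, "value & mask" becomes "value & ~(~0 << t1)", which is
the and-with-complement shape that ANDN computes directly, so the separate
dec disappears as in the diff above.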