On 14/07/2015 18:45, Aurelien Jarno wrote:
>>> > > 
>>> > >     mask = 0x7fffffffffffffffull >> (t1 ^ 63)
>>> > > 
>>> > > It's simpler to generate it by doing:
>>> > > 
>>> > >     mask = (1 << t1) - 1
>> > 
>> > Using ~(-1 << t1) may let you use an ANDN instruction, and is also the
>> > same number of instructions on x86.
>> > 
> Indeed, thanks for the hint. The generated code is the same size, but
> has one instruction fewer:
> 
>    mov    0x88(%rsp),%r10
>    shlx   %r10,%rbx,%rbx
> -  mov    $0x1,%r11d
> +  mov    $0xffffffffffffffff,%r11
>    shlx   %r10,%r11,%r11
> -  dec    %r11
>    mov    0x18(%r14),%r10
> -  and    %r11,%r10
> +  andn   %r10,%r11,%r10
>    or     %r10,%rbx
>    movslq %ebx,%rbx

Oh, indeed I forgot about the fancy new x86 bit manipulation
instructions!  Even better. :)
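
For reference, the three ways of building the mask discussed above can be
compared in plain C. This is only a minimal sketch, not the TCG code itself:
the function names (mask_shr, mask_dec, mask_not, merge_low_bits,
count_mask_mismatches) are made up for illustration, t1 is assumed to be in
the range 0..63, and unsigned constants are used because left-shifting a
negative signed value is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Original form: shift a 63-bit all-ones constant right by (t1 ^ 63). */
static uint64_t mask_shr(unsigned t1)
{
    assert(t1 <= 63);
    return 0x7fffffffffffffffull >> (t1 ^ 63);
}

/* Simpler form: (1 << t1) - 1. */
static uint64_t mask_dec(unsigned t1)
{
    assert(t1 <= 63);
    return (1ull << t1) - 1;
}

/* ANDN-friendly form: ~(-1 << t1), written with an unsigned all-ones. */
static uint64_t mask_not(unsigned t1)
{
    assert(t1 <= 63);
    return ~(~0ull << t1);
}

/*
 * Hypothetical helper mirroring the shape of the generated code:
 * shift one value left by t1 and OR in the low t1 bits of another.
 * The "lo & ~shifted" part is what a compiler or backend can map to ANDN.
 */
static uint64_t merge_low_bits(uint64_t hi, uint64_t lo, unsigned t1)
{
    assert(t1 <= 63);
    uint64_t shifted_ones = ~0ull << t1;
    return (hi << t1) | (lo & ~shifted_ones);
}

/* Count the shift amounts in 0..63 for which the three forms disagree. */
static int count_mask_mismatches(void)
{
    int bad = 0;
    for (unsigned t1 = 0; t1 <= 63; t1++) {
        uint64_t expected = mask_dec(t1);
        if (mask_shr(t1) != expected || mask_not(t1) != expected) {
            bad++;
        }
    }
    return bad;
}
```

In the last form the inversion is folded into the AND, so the separate
decrement from the (1 << t1) - 1 version is not needed. That matches the
dropped "dec" in the listing above.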

Paolo
