On Fri, Feb 13 2015, "George Spelvin" <li...@horizon.com> wrote:

>> the main loop is 20--3b. The test instruction at 2e seems to be
>> redundant. The same at 37: the sub instruction already sets plenty of
>> flags that could be used, so explicitly comparing %rbx to -1 seems
>> redundant.
>
> Er... I think you hand-edited that code; it's wrong.  The loop assumes that
> %rbx is in units of words, but the prologue sets it up in units of bits.

No, I didn't hand-edit the asm, but I did mess up the source by hand :-)
My DIV_ROUND_UP macro was bogus. Well spotted. With that fixed, I still
see the redundant cmp and test, though.
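
For reference, a minimal sketch of the bits-to-words conversion the loop
depends on. DIV_ROUND_UP is written here the way the kernel headers define
it; BITS_PER_LONG is derived locally and bits_to_words() is a hypothetical
helper name used only for illustration, not something from the patch.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: derived locally so the sketch builds outside the kernel. */
#define BITS_PER_LONG		(sizeof(unsigned long) * CHAR_BIT)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

/* Hypothetical helper: number of longs needed to hold nbits bits. */
static unsigned long bits_to_words(unsigned long nbits)
{
	unsigned long words = DIV_ROUND_UP(nbits, BITS_PER_LONG);

	/* Sanity check: the words must cover all nbits bits. */
	assert(words * BITS_PER_LONG >= nbits);
	return words;
}
```

If the loop counter is left in bits instead of going through a conversion
like this, the loop runs far past the end of the bitmap.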

> The mov to %rcx is also redundant, since it could be eliminated with
> some minor rescheduling.
>
> The code generation I *want* for that function is:
>
> # addr in %rdi, size in %rsi
>       movl    %esi, %ecx
>       leaq    0x3f(%rsi), %rax
>       negl    %ecx
>       movq    $-1, %rdx
>       shrq    $6, %rax
>       shrq    %cl, %rdx
>       jmp     2f
> 1:
>       movq    $-1, %rdx
> 2:
>       subq    $1, %rax
>       jc      3f
>       andq    (%rdi,%rax,8), %rdx
>       je      1b
>
>       bsrq    %rdx, %rdx
>       salq    $6, %rax
>       addq    %rdx, %rax
>       ret
> 3:
>       movq    %rsi, %rax
>       retq

Nice. But I don't think find_last_bit is important enough to warrant
arch-specific versions.
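
For comparison, here is roughly what that asm computes, written as generic
C. This is only a sketch under a few assumptions: the function name is
made up, BITS_PER_LONG is derived locally, and __builtin_clzl (a GCC/Clang
builtin) stands in for bsrq. It is not the code from the patch.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Assumption: derived locally so the sketch builds outside the kernel. */
#define BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical name; mirrors the control flow of the quoted asm. */
static unsigned long find_last_bit_sketch(const unsigned long *addr,
					  unsigned long size)
{
	/* Word count, rounded up (the leaq/shrq pair). */
	unsigned long idx = (size + BITS_PER_LONG - 1) / BITS_PER_LONG;
	/* Mask off bits at or above size in the last word (negl/shrq %cl). */
	unsigned long mask = ~0UL >> (-size & (BITS_PER_LONG - 1));

	assert(addr != NULL || size == 0);

	while (idx--) {				/* subq $1 / jc 3f */
		unsigned long val = addr[idx] & mask;

		if (val)			/* bsrq, salq, addq */
			return idx * BITS_PER_LONG +
			       (BITS_PER_LONG - 1 - __builtin_clzl(val));
		mask = ~0UL;			/* label 1: full mask */
	}
	return size;				/* label 3: no bit set */
}
```

The partial-word mask is applied only on the first iteration, so the loop
body stays at one load, one and, and one branch per word.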

So, where are we with this? Have we reached some kind of consensus?

Rasmus