On 24.05.2024 22:03, Andrew Cooper wrote:
> The asm in arch_ffs() is safe but inefficient.
> 
> CMOV would be an improvement over a conditional branch, but for 64bit CPUs
> both Intel and AMD have provided enough details about the behaviour for a zero
> input.  It is safe to pre-load the destination register with -1 and drop the
> conditional logic.
> 
> However, it is common to find ffs() in a context where the optimiser knows
> that x in nonzero even if it the value isn't known precisely, and in that case
> it's safe to drop the preload of -1 too.
> 
> There are only a handful of uses of ffs() in the x86 build, and all of them
> improve as a result of this:
> 
>   add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-31 (-31)
>   Function                                     old     new   delta
>   mask_write                                   114     107      -7
>   xmem_pool_alloc                             1063    1039     -24
> 
> Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>

Reviewed-by: Jan Beulich <jbeul...@suse.com>
with one suggestion:

> --- a/xen/arch/x86/include/asm/bitops.h
> +++ b/xen/arch/x86/include/asm/bitops.h
> @@ -432,12 +432,28 @@ static inline int ffsl(unsigned long x)
>  
>  static always_inline unsigned int arch_ffs(unsigned int x)
>  {
> -    int r;
> +    unsigned int r;
> +
> +    if ( __builtin_constant_p(x > 0) && x > 0 )

__builtin_constant_p(x) surely will do. In fact even the other "> 0" could
in principle be left out here.

Jan

Reply via email to