On 15.03.2024 14:48, Andrew Cooper wrote:
> On 14/03/2024 5:14 pm, Andrew Cooper wrote:
>> On 14/03/2024 3:59 pm, Jan Beulich wrote:
>>> On 13.03.2024 18:27, Andrew Cooper wrote:
>>>> --- a/xen/arch/x86/include/asm/bitops.h
>>>> +++ b/xen/arch/x86/include/asm/bitops.h
>>>> @@ -401,18 +401,6 @@ static always_inline unsigned int __scanbit(unsigned 
>>>> long val, unsigned int max)
>>>>      r__;                                                                  
>>>>   \
>>>>  })
>>>>  
>>>> -/**
>>>> - * find_first_set_bit - find the first set bit in @word
>>>> - * @word: the word to search
>>>> - * 
>>>> - * Returns the bit-number of the first set bit. The input must *not* be 
>>>> zero.
>>>> - */
>>>> -static inline unsigned int find_first_set_bit(unsigned long word)
>>>> -{
>>>> -    asm ( "rep; bsf %1,%0" : "=r" (word) : "rm" (word) );
>>>> -    return (unsigned int)word;
>>>> -}
>>> And you think it's okay to no longer use TZCNT like this when available,
>>> where the output doesn't have to have its value set up front?
>> This is a particularly evil piece of inline asm.
>>
>> It is interpreted as BSF or TZCNT depending on the BMI instruction set
>> (Haswell/Piledriver era).  Furthermore there are errata on some Intel
>> systems where REP BSF behaves as per TZCNT *even* when BMI isn't enumerated.
>>
>> Which means this piece of asm suffers from all of an undefined output
>> register, undefined CF behaviour, and differing ZF behaviour (I believe)
>> depending on which hardware you're running on.
>>
>> The only thing the REP prefix is getting you is a deterministic 0 in the
>> destination register,
> 
> No, it doesn't.
> 
> For a zero input, TZCNT yields the operand size, so you get 16/32/64; 64
> in this case.
> 
> It also means there's no chance of coming up with a useful alternative
> for ffs() to use TZCNT when available.

Right, for ffs() TZCNT isn't suitable. But for find_first_set_bit() it was,
yielding a reliably out-of-range output for zero input (which BSF wouldn't
guarantee).

Jan

Reply via email to