On 14/03/2024 5:14 pm, Andrew Cooper wrote:
> On 14/03/2024 3:59 pm, Jan Beulich wrote:
>> On 13.03.2024 18:27, Andrew Cooper wrote:
>>> --- a/xen/arch/x86/include/asm/bitops.h
>>> +++ b/xen/arch/x86/include/asm/bitops.h
>>> @@ -401,18 +401,6 @@ static always_inline unsigned int __scanbit(unsigned 
>>> long val, unsigned int max)
>>>      r__;                                                                   
>>>  \
>>>  })
>>>  
>>> -/**
>>> - * find_first_set_bit - find the first set bit in @word
>>> - * @word: the word to search
>>> - * 
>>> - * Returns the bit-number of the first set bit. The input must *not* be 
>>> zero.
>>> - */
>>> -static inline unsigned int find_first_set_bit(unsigned long word)
>>> -{
>>> -    asm ( "rep; bsf %1,%0" : "=r" (word) : "rm" (word) );
>>> -    return (unsigned int)word;
>>> -}
>> And you think it's okay to no longer use TZCNT like this when available,
>> where the output doesn't have to have its value set up front?
> This is a particularly evil piece of inline asm.
>
> It is interpreted as BSF or TZCNT depending on the BMI instruction set
> (Haswell/Piledriver era).  Furthermore there are errata on some Intel
> systems where REP BSF behaves as per TZCNT *even* when BMI isn't enumerated.
>
> Which means this piece of asm suffers from all of an undefined output
> register, undefined CF behaviour, and differing ZF behaviour (I believe)
> depending on which hardware you're running on.
>
> The only thing the REP prefix is getting you is a deterministic 0 in the
> destination register,

No, it doesn't.

For a zero input, TZCNT yields the operand size, so you get 16/32/64; 64
in this case.

It also means there's no chance of coming up with a useful alternative
for ffs() to use TZCNT when available.

~Andrew

Reply via email to