On 15.03.2024 14:48, Andrew Cooper wrote: > On 14/03/2024 5:14 pm, Andrew Cooper wrote: >> On 14/03/2024 3:59 pm, Jan Beulich wrote: >>> On 13.03.2024 18:27, Andrew Cooper wrote: >>>> --- a/xen/arch/x86/include/asm/bitops.h >>>> +++ b/xen/arch/x86/include/asm/bitops.h >>>> @@ -401,18 +401,6 @@ static always_inline unsigned int __scanbit(unsigned >>>> long val, unsigned int max) >>>> r__; >>>> \ >>>> }) >>>> >>>> -/** >>>> - * find_first_set_bit - find the first set bit in @word >>>> - * @word: the word to search >>>> - * >>>> - * Returns the bit-number of the first set bit. The input must *not* be >>>> zero. >>>> - */ >>>> -static inline unsigned int find_first_set_bit(unsigned long word) >>>> -{ >>>> - asm ( "rep; bsf %1,%0" : "=r" (word) : "rm" (word) ); >>>> - return (unsigned int)word; >>>> -} >>> And you think it's okay to no longer use TZCNT like this when available, >>> where the output doesn't have to have its value set up front? >> This is a particularly evil piece of inline asm. >> >> It is interpreted as BSF or TZCNT depending on the BMI instruction set >> (Haswell/Piledriver era). Furthermore there are errata on some Intel >> systems where REP BSF behaves as per TZCNT *even* when BMI isn't enumerated. >> >> Which means this piece of asm suffers from all of an undefined output >> register, undefined CF behaviour, and differing ZF behaviour (I believe) >> depending on which hardware you're running on. >> >> The only thing the REP prefix is getting you is a deterministic 0 in the >> destination register, > > No, it doesn't. > > For a zero input, TZCNT yields the operand size, so you get 16/32/64; 64 > in this case. > > It also means there's no chance of coming up with a useful alternative > for ffs() to use TZCNT when available.
Right, for ffs() TZCNT isn't suitable. But for find_first_set_bit() it was, yielding a reliably out-of-range output for zero input (which BSF wouldn't guarantee). Jan