It turns out that GCCs 4.9.2 and 6.3.0 instantiate __scanbit() in three
translation units, but never references the result.  All real uses of
__scanbit() are already suitably inline.

Signed-off-by: Andrew Cooper <andrew.coop...@citrix.com>
---
CC: Jan Beulich <jbeul...@suse.com>

Forcing __scanbit() to be always_inline appears to cause GCC to reorder some
of its basic blocks, so there is a moderately large perturbance to functions.
As far as I can see, even the register scheduling is the same, and the delta
is just changes in the nops used to align the basic blocks.
---
 xen/include/asm-x86/bitops.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/asm-x86/bitops.h b/xen/include/asm-x86/bitops.h
index fd494e8..0f18645 100644
--- a/xen/include/asm-x86/bitops.h
+++ b/xen/include/asm-x86/bitops.h
@@ -334,7 +334,7 @@ extern unsigned int __find_first_zero_bit(
 extern unsigned int __find_next_zero_bit(
     const unsigned long *addr, unsigned int size, unsigned int offset);
 
-static inline unsigned int __scanbit(unsigned long val, unsigned int max)
+static always_inline unsigned int __scanbit(unsigned long val, unsigned int 
max)
 {
     if ( __builtin_constant_p(max) && max == BITS_PER_LONG )
         alternative_io("bsf %[in],%[out]; cmovz %[max],%k[out]",
-- 
2.1.4


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

Reply via email to