https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49244

--- Comment #14 from dhowells at redhat dot com <dhowells at redhat dot com> ---
Okay, I built and booted an x86_64 kernel that had the XXX_bit() and
test_and_XXX_bit() ops altered to use __atomic_fetch_YYY() funcs.  The core
kernel ended up ~8K larger in the .text segment.  Taking ext4_resize_begin() as
an example, this statement:

        if (test_and_set_bit_lock(EXT4_RESIZING, &EXT4_SB(sb)->s_resize_flags))
                ret = -EBUSY;

looks like this in the unpatched kernel:

   0xffffffff812169f3 <+122>:   lock btsl $0x0,0x3b8(%rax)
   0xffffffff812169fc <+131>:   jb     0xffffffff81216a02
   0xffffffff812169fe <+133>:   xor    %edx,%edx
   0xffffffff81216a00 <+135>:   jmp    0xffffffff81216a07
   0xffffffff81216a02 <+137>:   mov    $0xfffffff0,%edx
   0xffffffff81216a07 <+142>:   mov    %edx,%eax


and like this in the patched kernel:

   0xffffffff81217414 <+122>:   xor    %edx,%edx
   0xffffffff81217416 <+124>:   lock btsq $0x0,0x3b8(%rax)
   0xffffffff81217420 <+134>:   setb   %dl
   0xffffffff81217423 <+137>:   neg    %edx
   0xffffffff81217425 <+139>:   and    $0xfffffff0,%edx
   0xffffffff81217428 <+142>:   mov    %edx,%eax

So it looks good here at least :-)
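For reference, here's a minimal sketch of how the _lock variant can be expressed
with the builtin (my own reconstruction - the names, bit arithmetic and memory
ordering are assumptions, not necessarily what the patch actually does):

        /* Hypothetical sketch, not the actual kernel patch:
         * test_and_set_bit_lock() expressed with __atomic_fetch_or() on a
         * machine-word-sized location. */
        #define BITS_PER_LONG (8 * sizeof(unsigned long)) /* kernel defines this */

        static inline int test_and_set_bit_lock(long nr,
                                                volatile unsigned long *addr)
        {
                unsigned long mask = 1UL << (nr % BITS_PER_LONG);
                unsigned long old;

                addr += nr / BITS_PER_LONG;
                /* ACQUIRE ordering is what the _lock variant needs; GCC can
                 * fold the single-bit OR plus the mask test below into
                 * "lock bts" followed by "setb". */
                old = __atomic_fetch_or(addr, mask, __ATOMIC_ACQUIRE);
                return (old & mask) != 0;
        }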

The comparison also suggests there's an error in the current x86_64 kernel
implementation: the kernel bitops are supposed to operate on machine-word-size
locations, so the unpatched code should be using BTSQ, not BTSL.  Corrected for
that, the __atomic_fetch_or() variant comes out a byte shorter - and it
involves no conditional jumps.
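To illustrate the width point (an example of my own, not kernel code): the
operand type picks the instruction suffix, and the 64-bit form carries a
one-byte REX.W prefix that the 32-bit form doesn't:

        /* Illustration only: with a GCC that can emit "lock bts" for these
         * patterns, the unsigned-int version should come out as "lock btsl"
         * and the unsigned-long version as "lock btsq" (one byte longer). */
        int set_bit0_32(unsigned int *p)
        {
                return (__atomic_fetch_or(p, 1U, __ATOMIC_ACQUIRE) & 1U) != 0;
        }

        int set_bit0_64(unsigned long *p)
        {
                return (__atomic_fetch_or(p, 1UL, __ATOMIC_ACQUIRE) & 1UL) != 0;
        }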
