Le 26/04/2021 à 20:50, Nathan Chancellor a écrit :
On Sat, Mar 20, 2021 at 11:22:27PM +1100, Michael Ellerman wrote:
From: Christophe Leroy <christophe.le...@csgroup.eu>

call_do_irq() and call_do_softirq() are simple enough to be
worth inlining.

Inlining them avoids an mflr/mtlr pair plus a save/reload on stack. It
also allows GCC to keep the saved ksp_limit in an nonvolatile reg.

This is inspired from S390 arch. Several other arches do more or
less the same. The way sparc arch does seems odd thought.

Signed-off-by: Christophe Leroy <christophe.le...@csgroup.eu>
Signed-off-by: Michael Ellerman <m...@ellerman.id.au>


This change caused our ppc44x_defconfig builds to hang when powering
down in QEMU:

https://github.com/ClangBuiltLinux/continuous-integration2/runs/2304364629?check_suite_focus=true#logs

This is probably something with clang given that GCC 10.3.0 works fine
but due to the nature of the change, I have no idea how to tell what is
going wrong. I tried to do some rudimentary debugging with gdb but that
did not really get me anywhere.

The kernel was built with just 'CC=clang' and it is reproducible with
all versions of clang that the kernel supports.

The QEMU invocation is visible at the link above, it is done with our
boot-qemu.sh in this repo, which also houses the rootfs:

https://github.com/ClangBuiltLinux/boot-utils

Happy to provide any other information or debug/test as directed!


With GCC:

000003f0 <do_softirq_own_stack>:
 3f0:   94 21 ff f0     stwu    r1,-16(r1)
 3f4:   7c 08 02 a6     mflr    r0
 3f8:   3d 20 00 00     lis     r9,0
                        3fa: R_PPC_ADDR16_HA    .data..read_mostly+0x4
 3fc:   93 e1 00 0c     stw     r31,12(r1)
 400:   90 01 00 14     stw     r0,20(r1)
 404:   83 e9 00 00     lwz     r31,0(r9)
                        406: R_PPC_ADDR16_LO    .data..read_mostly+0x4
 408:   94 3f 1f f0     stwu    r1,8176(r31)
 40c:   7f e1 fb 78     mr      r1,r31
 410:   48 00 00 01     bl      410 <do_softirq_own_stack+0x20>
                        410: R_PPC_REL24        __do_softirq
 414:   80 21 00 00     lwz     r1,0(r1)
 418:   80 01 00 14     lwz     r0,20(r1)
 41c:   83 e1 00 0c     lwz     r31,12(r1)
 420:   38 21 00 10     addi    r1,r1,16
 424:   7c 08 03 a6     mtlr    r0
 428:   4e 80 00 20     blr


With CLANG:

000003e8 <do_softirq_own_stack>:
 3e8:   94 21 ff f0     stwu    r1,-16(r1)
 3ec:   93 c1 00 08     stw     r30,8(r1)
 3f0:   3c 60 00 00     lis     r3,0
                        3f2: R_PPC_ADDR16_HA    softirq_ctx
 3f4:   83 c3 00 00     lwz     r30,0(r3)
                        3f6: R_PPC_ADDR16_LO    softirq_ctx
 3f8:   94 3e 1f f0     stwu    r1,8176(r30)
 3fc:   7f c1 f3 78     mr      r1,r30
 400:   48 00 00 01     bl      400 <do_softirq_own_stack+0x18>
                        400: R_PPC_REL24        __do_softirq
 404:   80 21 00 00     lwz     r1,0(r1)
 408:   83 c1 00 08     lwz     r30,8(r1)
 40c:   38 21 00 10     addi    r1,r1,16
 410:   4e 80 00 20     blr


As you can see, CLANG doesn't save/restore 'lr' allthought 'lr' is explicitely listed in the registers clobbered by the inline assembly:

>> +static __always_inline void call_do_softirq(const void *sp)
>> +{
>> +      /* Temporarily switch r1 to sp, call __do_softirq() then restore r1. 
*/
>> +      asm volatile (
>> +               PPC_STLU "        %%r1, %[offset](%[sp])  ;"
>> +              "mr                %%r1, %[sp]             ;"
>> +              "bl                %[callee]               ;"
>> +               PPC_LL "  %%r1, 0(%%r1)           ;"
>> +               : // Outputs
>> +               : // Inputs
>> +                 [sp] "b" (sp), [offset] "i" (THREAD_SIZE - 
STACK_FRAME_OVERHEAD),
>> +                 [callee] "i" (__do_softirq)
>> +               : // Clobbers
>> +                 "lr", "xer", "ctr", "memory", "cr0", "cr1", "cr5", "cr6",
>> +                 "cr7", "r0", "r3", "r4", "r5", "r6", "r7", "r8", "r9", 
"r10",
>> +                 "r11", "r12"
>> +      );

Reply via email to