Currently, smp_processor_id() is used to fetch the current CPU in
cpu_idle_loop(). Every time the idle thread runs, it fetches the current
CPU using smp_processor_id(). Since the idle thread is per-CPU, the
current CPU is constant and cannot change at runtime, so moving the
smp_processor_id() call before the loop saves execution cycles/time in
the loop.

With the patch, the assembly code (on x86 and ARM64) executed in the
loop is reduced.

x86 architecture:
Before patch (execution in loop):
 148:	0f ae e8             	lfence
 14b:	65 8b 04 25 00 00 00 	mov    %gs:0x0,%eax
 152:	00
 153:	89 c0                	mov    %eax,%eax
 155:	49 0f a3 04 24       	bt     %rax,(%r12)
After patch (execution in loop):
 150:	0f ae e8             	lfence
 153:	4d 0f a3 34 24       	bt     %r14,(%r12)

For ARM64:
Before patch (execution in loop):
 168:	d5033d9f 	dsb	ld
 16c:	b9405661 	ldr	w1,[x19,#84]
 170:	1100fc20 	add	w0,w1,#0x3f
 174:	6b1f003f 	cmp	w1,wzr
 178:	1a81b000 	csel	w0,w0,w1,lt
 17c:	130c7000 	asr	w0,w0,#6
 180:	937d7c00 	sbfiz	x0,x0,#3,#32
 184:	f8606aa0 	ldr	x0,[x21,x0]
 188:	9ac12401 	lsr	x1,x0,x1
 18c:	36000e61 	tbz	w1,#0,358

After patch (execution in loop):
 1a8:	d50339df 	dsb	ld
 1ac:	f8776ac0 	ldr	x0,[x22,x23]
 1b0:	ea18001f 	tst	x0,x24
 1b4:	54000ea0 	b.eq	388

Further observation on ARM64 over 4 seconds shows that cpu_idle_loop is
called 8672 times. Moving the call out of the loop saves instructions
executed in the loop and, eventually, time as well.

Signed-off-by: Gaurav Jindal <gaurav.jin...@spreadtrum.com>
Reviewed-by: Sanjeev Yadav <sanjeev.ya...@spreadtrum.com>
---

diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 1214f0a..82698e5 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -185,6 +185,8 @@ exit_idle:
  */
 static void cpu_idle_loop(void)
 {
+	int cpu_id;
+	cpu_id = smp_processor_id();
 	while (1) {
 		/*
 		 * If the arch has a polling bit, we maintain an invariant:
@@ -202,7 +204,7 @@ static void cpu_idle_loop(void)
 			check_pgt_cache();
 			rmb();
 
-			if (cpu_is_offline(smp_processor_id()))
+			if (cpu_is_offline(cpu_id))
 				arch_cpu_idle_dead();
 
 			local_irq_disable();
--