## VIVID from kernel/sched/core.c
    #ifdef CONFIG_SMP
    void set_task_cpu(struct task_struct *p, unsigned int new_cpu)
    {
    ...
            if (task_cpu(p) != new_cpu) {
                    struct task_migration_notifier tmn;

                    if (p->sched_class->migrate_task_rq)
                            p->sched_class->migrate_task_rq(p, new_cpu);
                    p->se.nr_migrations++;
                    perf_sw_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, NULL, 0);

                    tmn.task = p;
                    tmn.from_cpu = task_cpu(p);
                    tmn.to_cpu = new_cpu;

                    atomic_notifier_call_chain(&task_migration_notifier, 0, &tmn);
            }

            __set_task_cpu(p, new_cpu);
    }

## WILY from include/linux/perf_event.h

    static inline void perf_event_task_sched_in(struct task_struct *prev,
                                                struct task_struct *task)
    {
            if (static_key_false(&perf_sched_events.key))
                    __perf_event_task_sched_in(prev, task);

            if (perf_sw_migrate_enabled() && task->sched_migrated) {
                    struct pt_regs *regs = this_cpu_ptr(&__perf_regs[0]);

                    perf_fetch_caller_regs(regs);
                    ___perf_sw_event(PERF_COUNT_SW_CPU_MIGRATIONS, 1, regs, 0);
                    task->sched_migrated = 0;
            }
    }

----

While checking how recent kernels increment PERF_COUNT_SW_CPU_MIGRATIONS, I found a difference from Vivid.

In Vivid, PERF_COUNT_SW_CPU_MIGRATIONS is incremented directly from set_task_cpu() (which is why we asked for tracing of this function). A later commit changed that behavior on the grounds that software migrate events were being accounted incorrectly. Instead of bumping the perf software counter inside set_task_cpu(), the kernel now marks the task as "migrated" (the sched_migrated flag in task_struct). Later, when context_switch() calls finish_task_switch() and the task is scheduled in, the counter is incremented if the task was marked as "migrated", and the mark is cleared.

This change fixes 2 issues:

1) The event was counted before the migration had really taken effect: the task had only been moved to another CPU, it had not been scheduled there yet.

2) Migrations that happen from softirq context were accounted to the interrupted process (possibly as migrations that never happened).
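To make the difference between the two accounting schemes concrete, here is a minimal user-space sketch. It is not kernel code: `struct sim_task`, `set_task_cpu_immediate()`, `set_task_cpu_deferred()` and `sched_in()` are hypothetical names invented for this illustration, and the per-task `migrations` field merely stands in for a perf software counter attached to that task.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for task_struct. */
struct sim_task {
    int cpu;                  /* CPU the task is currently assigned to */
    bool sched_migrated;      /* set on migration, cleared on sched-in */
    unsigned long migrations; /* stand-in for this task's perf counter */
};

/*
 * Vivid-style accounting (simplified): the event is counted at the moment
 * of migration, so it is charged to whichever task happens to be running
 * ("current"), not necessarily to the task being moved.
 */
static void set_task_cpu_immediate(struct sim_task *p, int new_cpu,
                                   struct sim_task *current)
{
    assert(p != NULL && current != NULL);
    if (p->cpu != new_cpu)
        current->migrations++;
    p->cpu = new_cpu;
}

/* Deferred accounting (simplified): only mark the task as migrated. */
static void set_task_cpu_deferred(struct sim_task *p, int new_cpu)
{
    assert(p != NULL);
    if (p->cpu != new_cpu)
        p->sched_migrated = true;
    p->cpu = new_cpu;
}

/* Count the pending migration when the migrated task is scheduled in. */
static void sched_in(struct sim_task *task)
{
    assert(task != NULL);
    if (task->sched_migrated) {
        task->migrations++;
        task->sched_migrated = false;
    }
}
```

With the immediate variant, moving a sleeping task while some unrelated task is running bumps the unrelated task's counter. With the deferred variant, nothing is counted until the moved task itself goes through `sched_in()`, and a second `sched_in()` without a new migration counts nothing.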
Commit:

    commit ff303e66c240ba6269e31817a386995440a18c99
    Author: Peter Zijlstra <pet...@infradead.org>
    Date:   Fri Apr 17 20:05:30 2015 +0200

        perf: Fix software migrate events

        Stephane asked about PERF_COUNT_SW_CPU_MIGRATIONS and I realized it
        was borken:

         > The problem is that the task isn't actually scheduled while its being
         > migrated (obviously), and if its not scheduled, the counters aren't
         > scheduled either, so there's no observing of the fact.
         >
         > A further problem with migrations is that many migrations happen from
         > softirq context, which is nested inside the 'random' task context of
         > whoemever happens to run at that time, similarly for the wakeup
         > migrations triggered from (soft)irq context. All those end up being
         > accounted in the task that's currently running, eg. your 'ls'.

        The below cures this by marking a task as migrated and accounting it
        on the subsequent sched_in().

        Signed-off-by: Peter Zijlstra (Intel) <pet...@infradead.org>

It first appeared in v4.2-rc1.

For now, the following packages will probably mitigate the issue:

    linux-image-4.2.0-36-generic
    linux-image-extra-4.2.0-36-generic

-- 
https://bugs.launchpad.net/bugs/1575407

Title:
Trusty + 3.19 (lts-vivid) PERF wrong cpu-migration counter

Status in linux package in Ubuntu:
Confirmed

Bug description:
It was brought to my attention that:

In a PowerPC based server, PERF seems to report cpu-migrations when only a single cpu is online.
## perf

     Performance counter stats for 'CPU(s) 0':

          15027.888988 task-clock (msec)         #    1.000 CPUs utilized            [100.00%]
                25,206 context-switches          #    0.002 M/sec                    [100.00%]
                 3,518 cpu-migrations            #    0.234 K/sec                    [100.00%]
                   639 page-faults               #    0.043 K/sec
        41,545,780,384 cycles                    #    2.765 GHz                      [66.68%]
         2,868,753,319 stalled-cycles-frontend   #    6.91% frontend cycles idle     [50.01%]
        30,162,193,535 stalled-cycles-backend    #   72.60% backend cycles idle      [50.01%]
        11,161,722,533 instructions              #    0.27  insns per cycle
                                                 #    2.70  stalled cycles per insn  [66.68%]
         1,544,072,679 branches                  #  102.747 M/sec                    [49.99%]
            52,536,867 branch-misses             #    3.40% of all branches          [49.99%]

          15.027768835 seconds time elapsed

## lscpu

    Architecture:          ppc64le
    Byte Order:            Little Endian
    CPU(s):                128
    On-line CPU(s) list:   0
    Off-line CPU(s) list:  1-127
    Thread(s) per core:    1
    Core(s) per socket:    1
    Socket(s):             1
    NUMA node(s):          2
    Model:                 8335-GCA
    L1d cache:             64K
    L1i cache:             32K
    L2 cache:              512K
    L3 cache:              8192K
    NUMA node0 CPU(s):     0
    NUMA node8 CPU(s):

So either task migrations are being done to offline cpus or perf is accounting it wrong.