I have traced all the execution paths that could plausibly increment "perf_count_sw_cpu_migrations" (or p->se.nr_migrations++, where p is the task's struct task_struct). Either increment would show us that the task migrated from one CPU to another.
____________________________________________________________
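As a quick userspace cross-check of the per-task counter, something like the sketch below could be used. It assumes the running kernel exposes /proc/<pid>/sched (scheduler debug support) with an "se.nr_migrations" line. The helper names are only illustrative and are not part of any existing tool.

```python
from pathlib import Path


def parse_nr_migrations(sched_text):
    """Return se.nr_migrations from /proc/<pid>/sched text, or None if absent."""
    for line in sched_text.splitlines():
        # Lines of interest look like "<name> : <value>".
        name, sep, value = line.partition(":")
        if sep and name.strip() == "se.nr_migrations":
            return int(value.strip())
    return None


def read_nr_migrations(pid="self"):
    """Read the counter for a pid; None if the file or the field is missing."""
    path = Path("/proc") / str(pid) / "sched"
    try:
        return parse_nr_migrations(path.read_text())
    except OSError:
        # Assumption: the file may not exist on kernels without scheduler debug.
        return None


if __name__ == "__main__":
    print("se.nr_migrations for this process:", read_nr_migrations())
```

Sampling this value twice for a pinned task and comparing it with what perf reports for cpu-migrations would show whether both counters move together.
____________________________________________________________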
(Execution Path 1 - Kernel swaps 2 tasks between 2 different CPUs)

[execution interrupted by a page fault]

handle_pte_fault ->
numa_migrate_preferred ->
task_numa_migrate ->
migrate_swap ->
stop_two_cpus ... (scheduling task migration state machine)

<asynchronous handler on cpu IPIs>
[kernel migration thread handles the task switch]
[you have 1 migration thread per cpu running]
[new cpu comes from numa balance/smp logic]

migrate_swap_stop ->
migrate_swap_task -> (kernel swaps 2 tasks from different cpus)
set_task_cpu -> **** update to perf counter ****

[process scheduled in a new cpu]
____________________________________________________________
(Execution Path 2 - Similar to 1 but, instead of swapping, it sends the task)

[execution interrupted by a page fault]

handle_pte_fault ->
migrate_task_to ->
stop_one_cpu ... (scheduling task submission to another cpu)

<asynchronous handler on cpu IPIs>
[kernel migration thread handles the task switch]
[you have 1 migration thread per cpu running]
[new cpu comes from numa balance/smp logic]

migration_cpu_stop ->
migrate_task -> (move task from one cpu to another)
move_queued_task ->
set_task_cpu -> **** update to perf counter ****

[process scheduled in a new cpu]
____________________________________________________________
(Execution Path 3 - New executions)

[fork / exec]

sched_exec ->
stop_one_cpu ... (scheduling task submission to another cpu)

<asynchronous handler on cpu IPIs>
[kernel migration thread handles the task switch]
[you have 1 migration thread per cpu running]
[new cpu comes from scheduler_class - fair/deadline/rt - select_task_rq logic]
[new cpu can also come from select_fallback_rq]
--> fallback might not take the cpumask into consideration

migration_cpu_stop ->
migrate_task -> (move task from one cpu to another)
move_queued_task ->
set_task_cpu -> **** update to perf counter ****

[process scheduled in a new cpu]
____________________________________________________________
(Execution Path 4 - Regular Scheduling)

[wake up process]
[wake up state]

try_to_wake_up ->
[new cpu comes from scheduler_class - fair/deadline/rt - select_task_rq logic]
[new cpu can also come from select_fallback_rq]
--> fallback might not take the cpumask into consideration
select_task_rq ->
set_task_cpu -> **** update to perf counter ****

[process scheduled in a new cpu]

****** note for execution paths 3 & 4 ********

-> select_fallback_rq is responsible for these messages in the dmesg:

[255688.556945] process 1 (init) no longer affine to cpu1
[266710.938490] process 1 (init) no longer affine to cpu1
[275071.280189] process 1 (init) no longer affine to cpu1
[286088.372647] process 1 (init) no longer affine to cpu1
[355886.470777] process 1 (init) no longer affine to cpu1
[358415.046246] process 1 (init) no longer affine to cpu1

They show us that the fallback mechanism for picking the cpu run queue was used. The fallback mechanism might be doing something wrong.
______________________________

PS: There are a few other paths, coming from the deadline & realtime schedulers, not shown here.

My idea is to get user & kernel stack traces on a probe to "set_task_cpu". This will tell us whether it is being called, by which function, and whether all calls are coming from the same execution path (for example, from select_fallback_rq instead of the fair scheduler's p->sched_class->select_task_rq() function).
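
Once such stacks are collected, they could be bucketed by execution path with something like the sketch below. It is only an illustration: the input format (one list of kernel symbol names per captured stack) is an assumption about how the probe output would be post-processed, and the marker functions are simply the ones named in the paths above. Paths 2 and 3 both run in the migration thread, so this sketch does not try to tell them apart.

```python
from collections import Counter

# Marker functions taken from the execution paths described in this comment.
# The first rule with a marker present in the stack wins.
PATH_RULES = [
    ("path 1 (NUMA swap)", {"migrate_swap_stop", "migrate_swap_task"}),
    ("path 2 or 3 (migration thread)", {"migration_cpu_stop", "move_queued_task"}),
    ("path 4 (wake-up)", {"try_to_wake_up"}),
]


def classify_stack(frames):
    """Map one captured stack (list of symbol names) to an execution path label."""
    names = set(frames)
    for label, markers in PATH_RULES:
        if names & markers:
            return label
    # Anything else, e.g. the deadline/realtime paths not traced here.
    return "other"


def summarize(stacks):
    """Count how many captured stacks fall into each execution path."""
    return Counter(classify_stack(stack) for stack in stacks)


if __name__ == "__main__":
    # Hypothetical stacks, written by hand only to exercise the classifier.
    demo = [
        ["set_task_cpu", "try_to_wake_up"],
        ["set_task_cpu", "move_queued_task", "migrate_task", "migration_cpu_stop"],
    ]
    for label, count in summarize(demo).items():
        print(label, count)
```

If nearly every stack lands in a single bucket, that would point at one path as the source of the unexpected migrations.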
--
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1575407

Title:
  Trusty + 3.19 (lts-vivid) PERF wrong cpu-migration counter

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1575407/+subscriptions

--
ubuntu-bugs mailing list
[email protected]
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
