With verbose logging enabled, my numad log file shows:

Mon Jun 17 06:22:53 2019: Nodes: 2
Min CPUs free: 1416, Max CPUs: 1423, Avg CPUs: 1419, StdDev: 3.53553
Min MBs free: 12869, Max MBs: 13756, Avg MBs: 13312, StdDev: 443.5
Node 0: MBs_total 65266, MBs_free  12869, CPUs_total 2000, CPUs_free 1416,  Distance: 10 40  CPUs: 0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76
Node 1: MBs_total 65337, MBs_free  13756, CPUs_total 2000, CPUs_free 1423,  Distance: 40 10  CPUs: 80,84,88,92,96,100,104,108,112,116,120,124,128,132,136,140,144,148,152,156
Mon Jun 17 06:22:53 2019: Processes: 1563
Mon Jun 17 06:22:53 2019: Candidates: 2
101867853: PID 120072: (qemu-system-ppc), Threads 23, MBs_size  55763, MBs_used  50509, CPUs_used  876, Magnitude 44245884, Nodes: 0,8
101867853: PID 120206: (qemu-system-ppc), Threads 23, MBs_size  55821, MBs_used  23699, CPUs_used  279, Magnitude 6612021, Nodes: 0,8
Mon Jun 17 06:22:53 2019: Advising pid 120072 (qemu-system-ppc) move from nodes (0,8) to nodes (0,8)

With debug logging enabled, the last messages before it died looked like this:

Another run #2:
Mon Jun 17 06:25:08 2019: Nodes: 2
Min CPUs free: 302, Max CPUs: 439, Avg CPUs: 370, StdDev: 68.5018
Min MBs free: 1597, Max MBs: 4548, Avg MBs: 3072, StdDev: 1475.5
Node 0: MBs_total 65266, MBs_free   1597, CPUs_total 2000, CPUs_free  302,  Distance: 10 40  CPUs: 0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76
Node 1: MBs_total 65337, MBs_free   4548, CPUs_total 2000, CPUs_free  439,  Distance: 40 10  CPUs: 80,84,88,92,96,100,104,108,112,116,120,124,128,132,136,140,144,148,152,156
Mon Jun 17 06:25:08 2019: Processes: 1572
Mon Jun 17 06:25:08 2019: Candidates: 2
101881395: PID 120072: (qemu-system-ppc), Threads 25, MBs_size  55763, MBs_used  50523, CPUs_used 1995, Magnitude 100793385, Nodes: 0,8
101881395: PID 120206: (qemu-system-ppc), Threads 25, MBs_size  55821, MBs_used  45916, CPUs_used  830, Magnitude 38110280, Nodes: 0,8
Mon Jun 17 06:25:08 2019: PICK NODES FOR:  PID: 120072,  CPUs 2347,  MBs 59438
Mon Jun 17 06:25:08 2019: PROCESS_MBs[0]: 17481
Mon Jun 17 06:25:08 2019:     Node[0]: mem: 201700  cpu: 5952
Mon Jun 17 06:25:08 2019:     Node[1]: mem: 45480  cpu: 2634
Mon Jun 17 06:25:08 2019: Totmag[0]: 12080055
Mon Jun 17 06:25:08 2019: Totmag[1]: 1948267
Mon Jun 17 06:25:08 2019: best_node_ix: 0
Mon Jun 17 06:25:08 2019: Node: 0  Dist: 10  Magnitude: 1200518400
Mon Jun 17 06:25:08 2019: Node: 8  Dist: 40  Magnitude: 119794320
Mon Jun 17 06:25:08 2019: MBs: 59438,  CPUs: 2347
Mon Jun 17 06:25:08 2019: Assigning resources from node 0
Mon Jun 17 06:25:08 2019:     Node[0]: mem: 1000  cpu: 0
Mon Jun 17 06:25:08 2019: MBs: 39368,  CPUs: 1355
Mon Jun 17 06:25:08 2019: Assigning resources from node 1
Mon Jun 17 06:25:08 2019: Advising pid 120072 (qemu-system-ppc) move from nodes (0,8) to nodes (0,8)

Another run #3:
Mon Jun 17 06:26:46 2019: Nodes: 2
Min CPUs free: 889, Max CPUs: 1048, Avg CPUs: 968, StdDev: 79.5016
Min MBs free: 1291, Max MBs: 3484, Avg MBs: 2387, StdDev: 1096.5
Node 0: MBs_total 65266, MBs_free   1291, CPUs_total 2000, CPUs_free  889,  Distance: 10 40  CPUs: 0,4,8,12,16,20,24,28,32,36,40,44,48,52,56,60,64,68,72,76
Node 1: MBs_total 65337, MBs_free   3484, CPUs_total 2000, CPUs_free 1048,  Distance: 40 10  CPUs: 80,84,88,92,96,100,104,108,112,116,120,124,128,132,136,140,144,148,152,156
Mon Jun 17 06:26:46 2019: Processes: 1546
Mon Jun 17 06:26:46 2019: Candidates: 2
101891156: PID 120072: (qemu-system-ppc), Threads 23, MBs_size  55763, MBs_used  50593, CPUs_used 1437, Magnitude 72702141, Nodes: 0,8
101891156: PID 120206: (qemu-system-ppc), Threads 23, MBs_size  55821, MBs_used  48065, CPUs_used  613, Magnitude 29463845, Nodes: 0,8
Mon Jun 17 06:26:46 2019: PICK NODES FOR:  PID: 120072,  CPUs 1690,  MBs 59521
Mon Jun 17 06:26:46 2019: PROCESS_MBs[0]: 17527
Mon Jun 17 06:26:46 2019:     Node[0]: mem: 199130  cpu: 8316
Mon Jun 17 06:26:46 2019:     Node[1]: mem: 34840  cpu: 6288
Mon Jun 17 06:26:46 2019: Totmag[0]: 16559650
Mon Jun 17 06:26:46 2019: Totmag[1]: 2190739
Mon Jun 17 06:26:46 2019: best_node_ix: 0
Mon Jun 17 06:26:46 2019: Node: 0  Dist: 10  Magnitude: 1655965080
Mon Jun 17 06:26:46 2019: Node: 8  Dist: 40  Magnitude: 219073920
Mon Jun 17 06:26:46 2019: MBs: 59521,  CPUs: 1690
Mon Jun 17 06:26:46 2019: Assigning resources from node 0
Mon Jun 17 06:26:46 2019:     Node[0]: mem: 1000  cpu: 0
Mon Jun 17 06:26:46 2019: MBs: 39708,  CPUs: 304
Mon Jun 17 06:26:46 2019: Assigning resources from node 1
Mon Jun 17 06:26:46 2019: Advising pid 120072 (qemu-system-ppc) move from nodes (0,8) to nodes (0,8)


Your crash was around:
Thu Feb 21 00:12:10 2019: Assigning resources from node 5
Thu Feb 21 00:12:10 2019: Assigning resources from node 2
Thu Feb 21 00:12:10 2019: Process 88781 already 100 percent localized to target nodes.

Mine also seems to die as soon as it hits "Assigning resources".
The daemon does this step anyway, but obviously more often under actual
memory load.
So far it all fits together; let's try to find out what it accesses when
it fails.
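
To compare a crashing run with a healthy one, it can help to reduce each log to what matters here. The following is only a rough sketch, not part of numad: it assumes every timestamped message has the "Mon Jun 17 06:25:08 2019: <message>" shape seen above and that each scan cycle starts with a "Nodes: <n>" message. The name summarize_cycles is made up for illustration.

```python
import re

# Assumption: timestamped numad messages look like
# "Mon Jun 17 06:25:08 2019: <message>", as in the excerpts above.
STAMPED = re.compile(
    r"^[A-Z][a-z]{2} [A-Z][a-z]{2} [ \d]\d \d\d:\d\d:\d\d \d{4}: (.*)$"
)
ASSIGN = re.compile(r"^Assigning resources from node (\d+)$")


def summarize_cycles(log_text):
    """Split a log into scan cycles and summarize each one.

    A cycle is assumed to start at a "Nodes: <n>" message. For each cycle
    the result records the node numbers named in "Assigning resources"
    messages and the last timestamped message that was seen.
    """
    cycles = []
    for line in log_text.splitlines():
        match = STAMPED.match(line.rstrip())
        if not match:
            continue  # untimestamped detail or a wrapped continuation line
        message = match.group(1)
        if message.startswith("Nodes: "):
            cycles.append({"assigned": [], "last": message})
        if not cycles:
            continue  # ignore anything before the first cycle header
        cycle = cycles[-1]
        cycle["last"] = message
        found = ASSIGN.match(message)
        if found:
            cycle["assigned"].append(int(found.group(1)))
    return cycles
```

Feeding it the contents of the numad log file and printing the last entry shows which nodes were being assigned and which message was the final one written before the daemon went away.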

-- 
https://bugs.launchpad.net/bugs/1832915

Title:
  numad crashes while running kvm guest
