Re: [Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Dario Faggioli
On Thu, 2016-08-11 at 16:42 +0100, Andrew Cooper wrote: > On 11/08/16 15:28, Dario Faggioli wrote: > > On Thu, 2016-08-11 at 14:39 +0100, Andrew Cooper wrote: > > > It will be IS_RUNQ_IDLE() which is the problem. > > > > > Ok, that does one step of list traversing (the runq). What I didn't > > und

Re: [Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Andrew Cooper
On 11/08/16 15:28, Dario Faggioli wrote: > On Thu, 2016-08-11 at 14:39 +0100, Andrew Cooper wrote: >> On 11/08/16 14:24, George Dunlap wrote: >>> On 11/08/16 12:35, Andrew Cooper wrote: The actual cause is _csched_cpu_pick() falling over LIST_POISON, which happened to occur at the sa

Re: [Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Dario Faggioli
On Thu, 2016-08-11 at 14:39 +0100, Andrew Cooper wrote: > On 11/08/16 14:24, George Dunlap wrote: > > On 11/08/16 12:35, Andrew Cooper wrote: > > > The actual cause is _csched_cpu_pick() falling over LIST_POISON, > > > which > > > happened to occur at the same time as a domain was shutting > > > do

Re: [Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Andrew Cooper
On 11/08/16 14:24, George Dunlap wrote: > On 11/08/16 12:35, Andrew Cooper wrote: >> Hello, >> >> XenServer testing has discovered a regression from recent changes in >> staging-4.7. >> >> The actual cause is _csched_cpu_pick() falling over LIST_POISON, which >> happened to occur at the same time a

Re: [Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread George Dunlap
On 11/08/16 12:35, Andrew Cooper wrote: > Hello, > > XenServer testing has discovered a regression from recent changes in > staging-4.7. > > The actual cause is _csched_cpu_pick() falling over LIST_POISON, which > happened to occur at the same time as a domain was shutting down. The > instructio

[Xen-devel] Scheduler regression in 4.7

2016-08-11 Thread Andrew Cooper
Hello, XenServer testing has discovered a regression from recent changes in staging-4.7. The actual cause is _csched_cpu_pick() falling over LIST_POISON, which happened to occur at the same time as a domain was shutting down. The instruction in question is `mov 0x10(%rax),%rax` which looks like
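
For readers unfamiliar with list poisoning, here is a minimal sketch of how a list element can end up holding a poison pointer that a later traversal might follow. It is not the Xen code and does not reproduce the bug reported above. The `DEMO_POISON1`/`DEMO_POISON2` values, the `is_poisoned()` helper and the demo function are hypothetical names introduced for illustration only. The assumption is a Linux-style doubly linked list whose delete operation overwrites the removed element's pointers with poison values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Minimal Linux-style doubly linked list node (illustrative only). */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical poison values for this sketch; never dereferenced here. */
#define DEMO_POISON1 ((struct list_head *)(uintptr_t)0x100100)
#define DEMO_POISON2 ((struct list_head *)(uintptr_t)0x200200)

/* An empty list is a head that points at itself. */
static void list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Link a new element in just before the head, i.e. at the tail. */
static void list_add_tail(struct list_head *elem, struct list_head *head)
{
    elem->prev = head->prev;
    elem->next = head;
    head->prev->next = elem;
    head->prev = elem;
}

/* Unlink an element, then poison its pointers so stale use is visible. */
static void list_del(struct list_head *elem)
{
    elem->prev->next = elem->next;
    elem->next->prev = elem->prev;
    elem->next = DEMO_POISON1;
    elem->prev = DEMO_POISON2;
}

/* Hypothetical helper: true if the element has already been unlinked. */
static bool is_poisoned(const struct list_head *elem)
{
    return elem->next == DEMO_POISON1 || elem->prev == DEMO_POISON2;
}

/*
 * Model of the hazard: one path removes an element from a run queue while
 * another path still holds a pointer to it. Following elem->next at that
 * point would dereference the poison value instead of a valid neighbour.
 * Returns true if the stale element is detected as poisoned and the
 * element still on the queue is not.
 */
static bool demo_stale_element_is_poisoned(void)
{
    struct list_head runq, a, b;

    list_init(&runq);
    list_add_tail(&a, &runq);
    list_add_tail(&b, &runq);

    list_del(&a);            /* e.g. teardown path unlinks the element */
    assert(runq.next == &b); /* the queue itself stays consistent */

    return is_poisoned(&a) && !is_poisoned(&b);
}
```

In this sketch the queue stays well formed after the delete; only code that keeps using the removed element afterwards sees the poison values. A load through such a pointer would be expected to fault, which is the general shape of failure that "falling over LIST_POISON" describes.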