---- On Tue, 27 Oct 2020 10:22:42 +0100 Dario Faggioli <dfaggi...@suse.com> wrote ----



On Tue, 2020-10-27 at 06:58 +0100, Jürgen Groß wrote: 
> On 26.10.20 17:31, Dario Faggioli wrote: 
> > 
> > Or did you have something completely different in mind, and I'm
> > missing it?
> 
> No, I think you are right. I mixed that up with __context_switch()
> not being called.
> 
Right. 
 
> Sorry for the noise, 
> 
Sure, no problem. 
 
In fact, this issue is apparently scheduler independent. It indeed
seemed to be related to the other report we have "BUG: credit=sched2
machine hang when using DRAKVUF", but there it looks like it is
scheduler-dependent.
 
Might it be something that lies somewhere else, but Credit2 is 
triggering it faster/easier? (Just thinking out loud...) 
 
For Frederic, what happens is that dom0 hangs, right? So you're able to 
poke at Xen with some debugkeys (like 'r' for the scheduler's status, 
and the ones for the domain's vCPUs)? 
 
If yes, it may be useful to see the output. 
 
Regards 
-- 
Dario Faggioli, Ph.D 
http://about.me/dario.faggioli 
Virtualization Software Engineer 
SUSE Labs, SUSE https://www.suse.com/ 
------------------------------------------------------------------- 
<<This happens because _I_ choose it to happen!>> (Raistlin Majere) 






First of all, sorry for any duplicates. I had network issues because of
repeated freezes (...) while writing to you, and Marek did not receive my
previous mails, so here is the info.

 

 

To answer your question, Dario: yes, dom0 hangs completely, and so do the VMs.
With `sched=credit`, I managed to capture the output of the 'r' debug key on
the serial console:

```
(XEN) sched_smt_power_savings: disabled
(XEN) NOW=72810702614697
(XEN) Online Cpus: 0-15
(XEN) Cpupool 0:
(XEN) Cpus: 0-15
(XEN) Scheduling granularity: cpu, 1 CPU per sched-resource
(XEN) Scheduler: SMP Credit Scheduler (credit)
(XEN) info:
(XEN)     ncpus              = 16
(XEN)     master             = 0
(XEN)     credit             = 4800
(XEN)     credit balance     = 608
(XEN)     weight             = 12256
(XEN)     runq_sort          = 996335
(XEN)     default-weight     = 256
(XEN)     tslice             = 30ms
(XEN)     ratelimit          = 1000us
(XEN)     credits per msec   = 10
(XEN)     ticks per tslice   = 3
(XEN)     migration delay    = 0us
(XEN) idlers: 00000000,00003c99
(XEN) active units:
(XEN)       1: [0.1] pri=-1 flags=0 cpu=6 credit=214 [w=2000,cap=0]
(XEN)       2: [0.4] pri=-1 flags=0 cpu=8 credit=115 [w=2000,cap=0]
(XEN)       3: [0.5] pri=-1 flags=0 cpu=5 credit=239 [w=2000,cap=0]
(XEN)       4: [0.11] pri=-1 flags=0 cpu=1 credit=-55 [w=2000,cap=0]
(XEN)       5: [0.6] pri=-2 flags=0 cpu=15 credit=-177 [w=2000,cap=0]
(XEN)       6: [0.7] pri=-1 flags=0 cpu=2 credit=50 [w=2000,cap=0]
(XEN)       7: [19.1] pri=-2 flags=0 cpu=9 credit=-241 [w=256,cap=0]
(XEN) CPUs info:
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[00] nr_run=0, sort=996334, sibling={0}, core={0-7}
(XEN) CPU[01] current=d0v11, curr=d0v11, prev=NULL
(XEN) CPU[01] nr_run=1, sort=996335, sibling={1}, core={0-7}
(XEN)     run: [0.11] pri=-1 flags=0 cpu=1 credit=-55 [w=2000,cap=0]
(XEN)       1: [32767.1] pri=-64 flags=0 cpu=1
(XEN) CPU[02] current=d0v7, curr=d0v7, prev=NULL
(XEN) CPU[02] nr_run=1, sort=996335, sibling={2}, core={0-7}
(XEN)     run: [0.7] pri=-1 flags=0 cpu=2 credit=50 [w=2000,cap=0]
(XEN)       1: [32767.2] pri=-64 flags=0 cpu=2
(XEN) CPU[03] current=d[IDLE]v3, curr=d[IDLE]v3, prev=NULL
(XEN) CPU[03] nr_run=0, sort=996329, sibling={3}, core={0-7}
(XEN) CPU[04] current=d[IDLE]v4, curr=d[IDLE]v4, prev=NULL
(XEN) CPU[04] nr_run=0, sort=996325, sibling={4}, core={0-7}
(XEN) CPU[05] current=d0v5, curr=d0v5, prev=NULL
(XEN) CPU[05] nr_run=1, sort=996334, sibling={5}, core={0-7}
(XEN)     run: [0.5] pri=-1 flags=0 cpu=5 credit=239 [w=2000,cap=0]
(XEN)       1: [32767.5] pri=-64 flags=0 cpu=5
(XEN) CPU[06] current=d0v1, curr=d0v1, prev=NULL
(XEN) CPU[06] nr_run=1, sort=996334, sibling={6}, core={0-7}
(XEN)     run: [0.1] pri=-1 flags=0 cpu=6 credit=214 [w=2000,cap=0]
(XEN)       1: [32767.6] pri=-64 flags=0 cpu=6
(XEN) CPU[07] current=d[IDLE]v7, curr=d[IDLE]v7, prev=NULL
(XEN) CPU[07] nr_run=0, sort=996303, sibling={7}, core={0-7}
(XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
(XEN) CPU[08] nr_run=2, sort=996335, sibling={8}, core={8-15}
(XEN)       1: [0.4] pri=-1 flags=0 cpu=8 credit=115 [w=2000,cap=0]
(XEN) CPU[09] current=d19v1, curr=d19v1, prev=NULL
(XEN) CPU[09] nr_run=1, sort=996335, sibling={9}, core={8-15}
(XEN)     run: [19.1] pri=-2 flags=0 cpu=9 credit=-241 [w=256,cap=0]
(XEN)       1: [32767.9] pri=-64 flags=0 cpu=9
(XEN) CPU[10] current=d[IDLE]v10, curr=d[IDLE]v10, prev=NULL
(XEN) CPU[10] nr_run=0, sort=996334, sibling={10}, core={8-15}
(XEN) CPU[11] current=d[IDLE]v11, curr=d[IDLE]v11, prev=NULL
(XEN) CPU[11] nr_run=0, sort=996331, sibling={11}, core={8-15}
(XEN) CPU[12] current=d[IDLE]v12, curr=d[IDLE]v12, prev=NULL
(XEN) CPU[12] nr_run=0, sort=996333, sibling={12}, core={8-15}
(XEN) CPU[13] current=d[IDLE]v13, curr=d[IDLE]v13, prev=NULL
(XEN) CPU[13] nr_run=0, sort=996334, sibling={13}, core={8-15}
(XEN) CPU[14] current=d0v14, curr=d0v14, prev=NULL
(XEN) CPU[14] nr_run=1, sort=990383, sibling={14}, core={8-15}
(XEN)     run: [0.14] pri=0 flags=0 cpu=14 credit=-514 [w=2000,cap=0]
(XEN)       1: [32767.14] pri=-64 flags=0 cpu=14
(XEN) CPU[15] current=d0v6, curr=d0v6, prev=NULL
(XEN) CPU[15] nr_run=1, sort=996335, sibling={15}, core={8-15}
(XEN)     run: [0.6] pri=-2 flags=0 cpu=15 credit=-177 [w=2000,cap=0]
(XEN)       1: [32767.15] pri=-64 flags=0 cpu=15
```
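
One way to cross-check a dump like the one above is to decode the `idlers`
mask and compare it with the per-CPU lines. The following is a minimal Python
sketch, not anything shipped with Xen: the helper names `decode_cpumask` and
`idle_cpus_with_work` are made up for illustration, and it assumes the mask is
printed as zero-padded 32-bit hex words, most significant word first.

```python
import re

# Excerpt of three CPUs copied from the 'r' dump above.
SAMPLE = """\
(XEN) idlers: 00000000,00003c99
(XEN) CPU[00] current=d[IDLE]v0, curr=d[IDLE]v0, prev=NULL
(XEN) CPU[00] nr_run=0, sort=996334, sibling={0}, core={0-7}
(XEN) CPU[01] current=d0v11, curr=d0v11, prev=NULL
(XEN) CPU[01] nr_run=1, sort=996335, sibling={1}, core={0-7}
(XEN) CPU[08] current=d[IDLE]v8, curr=d[IDLE]v8, prev=NULL
(XEN) CPU[08] nr_run=2, sort=996335, sibling={8}, core={8-15}
"""

CURRENT_RE = re.compile(r"CPU\[(\d+)\] current=(\S+?),")
NR_RUN_RE = re.compile(r"CPU\[(\d+)\] nr_run=(\d+)")


def decode_cpumask(mask):
    """Return the CPU numbers set in a comma-separated hex mask.

    Assumption: every word is zero-padded, so the words can simply be
    concatenated into one hex number.
    """
    value = int(mask.replace(",", ""), 16)
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


def idle_cpus_with_work(dump):
    """Return CPUs whose current vCPU is the idle one while nr_run > 0."""
    current = {}
    nr_run = {}
    for line in dump.splitlines():
        match = CURRENT_RE.search(line)
        if match:
            current[int(match.group(1))] = match.group(2)
            continue
        match = NR_RUN_RE.search(line)
        if match:
            nr_run[int(match.group(1))] = int(match.group(2))
    return sorted(
        cpu
        for cpu, vcpu in current.items()
        if vcpu.startswith("d[IDLE]") and nr_run.get(cpu, 0) > 0
    )


if __name__ == "__main__":
    print("idlers:", decode_cpumask("00000000,00003c99"))
    print("idle but nr_run > 0:", idle_cpus_with_work(SAMPLE))
```

On this excerpt the mask decodes to CPUs 0, 3, 4, 7 and 10-13, and the only
CPU flagged is CPU 8, which runs its idle vCPU while reporting nr_run=2 and is
not in the idlers mask. Whether that is a transient state or related to the
hang cannot be told from a single snapshot.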

 

I also tried to get '*', but that blocked my serial console; at least, I was
still unable to interact with it a few minutes later. I'll try to get other
info too. I've uploaded the part of this huge '*' dump that I did capture here:
https://gist.github.com/fepitre/36923fbc08cc2fd8bdb59b81e73a6c2e

 

Right after that, I rebooted with the default value of 'sched' (credit2), and
just a few minutes later I obtained:

'r': https://gist.github.com/fepitre/78541f555902275d906d627de2420571 

'q': https://gist.github.com/fepitre/0ddf6b5e8fdb3152d24337d83fdc345e 

'I': https://gist.github.com/fepitre/50c68233d08ad1e495edf7e0e146838b 

 

Tell me if I can provide any other info from the serial console.

 

Regards, 

Frédéric
