[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-07-26 Thread Junien Fridrick
apport information ** Tags added: apport-collected uec-images xenial ** Description changed: Hi, I have a cloud running xenial/mitaka (with 18.02 charms). Sometimes, an instance will take minutes to delete. I tracked down the time taken to be file deletion : Apr 23 07:23:00 ho

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-07-26 Thread Junien Fridrick
By the way the task blocks on the following : ==> /proc/54255/task/54255/stack <== [] wakeup_preempt_entity.isra.6+0x7c/0x90 [] __switch_to+0x1f8/0x350 [] get_request+0x29c/0x910 [] blk_queue_bio+0x164/0x500 [] generic_make_request+0x154/0x310 [] submit_bio+0xd4/0x1f0 [] ext4_io_submit+0x7c/0xb0 [

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-07-26 Thread James Page
Raising a task for the kernel for an opinion on how to triage this problem further. Marking the nova task as Medium but leaving it as New for now until we figure out what's going on; this is not a show stopper, but it is a pain. ** Also affects: linux (Ubuntu) Importance: Undecided Status: New

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-04-26 Thread Corey Bryant
Thanks for continuing to dig into this. Comparing your nova-compute logs and the strace unlink details, the timestamps do seem to point to unlink causing the delay. Pasting the strace unlink output below inline with the nova-compute logs in order of occurrence: nova-compute log: 2018-04-25 14:48:

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-04-26 Thread Junien Fridrick
OK, the bug happened again with strace attached to nova-compute. Once again, there's little to no IO/network while it happens. Memory is stable. CPU is at least 50% idle (and the rest of it largely user mode). Nothing in dmesg. nova-compute logs are as follows: 2018-04-25 14:48:04.587 54255 INFO n

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-04-24 Thread Junien Fridrick
Hi Corey, * I assume nova-compute is up the whole time because its PID doesn't change * I'm running with debug on * I'm going to try getting more details with strace, but the problem is I can't repro the problem - it just sometimes happens... No luck on repro-ing on queens so far, but the cloud i

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-04-24 Thread Corey Bryant
Hi Junien, Thanks for reporting this. It would be great to see if it's reproducible on queens. A couple of questions: * Is nova-compute actually up the whole time this is happening? * Are you running with debug on? * Could we get any more details by attaching strace to nova-compute? Thanks, Core

[Bug 1766543] Re: instance deletion takes a while and blocks nova-compute

2018-04-24 Thread Junien Fridrick
Also, nova-scheduler or nova-api-os-compute will log the following lines (a few times per minute) while this is happening: Apr 23 07:24:47 juju-8c74e6-4-lxd-7 nova-scheduler[15786]: 2018-04-23 07:24:47.785 15786 DEBUG nova.servicegroup.drivers.db [req-1573c400-116c-4825-b108-3291a014b0e9 bc0ab05