Hello Christian, thank you very much for your detailed response.

>The default value for this is PARALLEL_SHUTDOWN=10 so everybody would run into 
>this issue.
>I assume that there needs to be more to this than just "broken in general", so 
>let us try to find what it is that makes this fail for you.

These were exactly my thoughts when I encountered the bug.

>While certainly broken and needing a fix this should at least still time out 
>for you after the default of 2 minutes right?
>You could lessen the timeout as the most convenient until a proper fix is 
>there then.

Actually no, that was my first guess too and I turned down the time-out.
But what was actually happening was that since it failed to shut down
the VMs, the check_guests_shutdown() got called repeatedly, thereby
adding more error messages to the list of VMs to shut down and so on. So
it actually never timed out because the list of VMs only grew longer.
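
To make that failure mode concrete, here is a toy sketch of what I believe is going on. It is not the code from libvirt-guests.sh; fake_guest_state, poll_once and count_entries are made-up names, and the only thing it models is "output of a failed check gets appended to the list":

```shell
#!/bin/sh
# Toy model only: none of these names come from libvirt-guests.sh.

# Hypothetical state check that always fails and prints an error message.
fake_guest_state() {
    echo "error: failed to get domain '$1'"
    return 1
}

# One polling pass over a space-separated list. The illustrated bug: the
# captured output is appended to the list even though the check failed.
poll_once() {
    remaining=""
    for guest in $1; do
        # Exit status is deliberately ignored here, which is the problem.
        out=$(fake_guest_state "$guest" 2>&1) || true
        remaining="$remaining $out"
    done
    echo "$remaining"
}

# Count the whitespace-separated entries in a list.
count_entries() {
    set -- $1
    echo "$#"
}

list="6cffc2fe-c2b3-4f54-b8e0-054f70453294"
pass1=$(poll_once "$list")
pass2=$(poll_once "$pass1")
echo "entries: $(count_entries "$list") -> $(count_entries "$pass1") -> $(count_entries "$pass2")"
```

Every word of the error message is treated as another "guest" on the next pass, so the list can only grow and the loop never sees it become empty.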

>I wondered that for me "check_guests_shutdown" is on a different line (353) 
>then.
>That might just be a type or such, but to be sure could you check with verify 
>if the package thinks the file is non default (after you remove your 
>modification of course):

I'm pretty sure the file is default, it's probably an empty line
somewhere from when I started debugging, but I will check on that right
away. I have also already tried downloading the newest version from
upstream, but you are right in that it remained pretty much the same
(and that script also did not work).

>Also the issue only occurs if function guest_is_on fails (so neither detected 
>run, nor not running, but really failing). Eventually that executes:
>$ virsh domname <uuid>
>That should also fail in your case to trigger the issue - is there any obvious 
>reason you'd know why that fails for you? The output of this should also be 
>mixed into the result in your case, so maybe you find it there.

Hmm, initially I thought that was just a very bad way of checking if the VM was 
still running, but upon closer inspection you are right. But when I manually 
run something like "virsh domname $uuid" it gives me the domname as output, so 
it seems to work fine. What might cause trouble here is that these VMs are 
'transient', i.e. they do not keep their UUID after shutdown. That would 
explain why it can't check whether or not the VM has been shut down. I only 
know this because when I choose "suspend" as value for "ON_SHUTDOWN", it tells 
me that "transient VMs can't be suspended".
Maybe I should mention that I run libvirt with opennebula, which basically puts 
a nice interface to KVM to manage VMs, so the transient thing comes from there.
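
To check the transient theory, something like the following could be run while the guests are still up. It is only a sketch: is_transient is a helper name I made up, and it assumes "virsh list --transient --uuid" prints one UUID per line for the default URI. Since a transient domain has no persistent definition, libvirt forgets it as soon as it stops, which would explain the later "Domain not found" from virsh.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given UUID belongs to a running
# transient domain, according to "virsh list --transient --uuid".
is_transient() {
    virsh list --transient --uuid 2>/dev/null | grep -qxF "$1"
}

# Example call (UUID taken from the output further down in this mail):
# is_transient 6cffc2fe-c2b3-4f54-b8e0-054f70453294 && echo "transient"
```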

>Would you mind as being the one who found it to report the issue there
>and linking the bug or mailing list entry here to help tracking the
>discussion there?

No problem, I will do that.

>It would be great if you could try this diff on your file and see if it 
>resolves your issues as well.

Sadly the patch did not help, although it changed the faulty behaviour, yay! Now 
I get repeated output looking like this:

sudo ./libvirt-guests.sh stop

Running guests on default URI: one-44, one-38
Shutting down guests on default URI...
Starting shutdown on guest: one-44
Starting shutdown on guest: one-38
Waiting for 2 guests to shut down, 120 seconds left
Starting shutdown on guest: one-44
Starting shutdown on guest: one-38
Starting shutdown on guest: one-44
Starting shutdown on guest: one-38
Starting shutdown on guest: one-44
Starting shutdown on guest: one-38
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Shutdown of guest  complete.
Shutdown of guest  complete.
Shutdown of guest  complete.
Shutdown of guest  complete.
Starting shutdown on guest: 
error: failed to get domain '6cffc2fe-c2b3-4f54-b8e0-054f70453294'
error: Domain not found: no domain with matching name 
'6cffc2fe-c2b3-4f54-b8e0-054f70453294'
Starting shutdown on guest: one-38
Starting shutdown on guest: 
error: failed to get domain '6cffc2fe-c2b3-4f54-b8e0-054f70453294'
error: Domain not found: no domain with matching name 
'6cffc2fe-c2b3-4f54-b8e0-054f70453294'
Starting shutdown on guest: one-38
Starting shutdown on guest: 
error: failed to get domain '6cffc2fe-c2b3-4f54-b8e0-054f70453294'
error: Domain not found: no domain with matching name 
'6cffc2fe-c2b3-4f54-b8e0-054f70453294'
Starting shutdown on guest: one-38
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Failed to determine state of guest: 6cffc2fe-c2b3-4f54-b8e0-054f70453294. Not 
tracking it anymore.
Shutdown of guest  complete.
Shutdown of guest  complete.
Shutdown of guest  complete.

And so on and so on with no sign of stopping. Upon inspection with "set
-x" set, it seems that the 2 guests are added again to the list of
guests to shut down, so there are sometimes 3 or 4 VMs (basically
duplicated UUIDs) in the list.
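
As a stopgap on my side, the duplicates could be filtered out before each pass with something like this. It is a sketch under the assumption that the list is a plain space-separated string of UUIDs; dedupe_list is a name I made up and is not part of the script:

```shell
#!/bin/sh
# Hypothetical helper: print a space-separated list with repeated entries
# removed, keeping the first occurrence of each entry in order.
dedupe_list() {
    result=""
    for item in $1; do
        case " $result " in
            *" $item "*) ;;                        # already present, skip
            *) result="${result:+$result }$item" ;; # first time seen, keep
        esac
    done
    echo "$result"
}

dedupe_list "uuid-1 uuid-2 uuid-1 uuid-2"
```

This only hides the symptom, of course; it does not explain why the guests are re-added in the first place.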

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1688508

Title:
  libvirt-guests.sh fails to shutdown guests in parallel

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1688508/+subscriptions
