I had seen something similar related to the KVM HA monitor (it would
re-mount the pools outside of libvirt after they were removed), but
anything using getStoragePoolByURI to register a pool shouldn't be
added to the KVM HA monitor anymore. That HA monitor script is the
only way I know of that CloudStack mounts NFS outside of libvirt, so
it seems the issue is in removing the mountpoint while it is in use.
Libvirt will remove the pool definition even if the share can't be
unmounted, so perhaps the problem is that we don't verify that the
mountpoint isn't in use before trying to delete the storage pool.
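
Something like the sketch below is what I have in mind. It's only an
illustration against the libvirt Java bindings, not the actual agent
code; the class/method names are made up and fuser(1) is just one
possible way to see whether anything still has files open under the
mountpoint:

// Hypothetical sketch, not the actual agent code: refuse to destroy a
// libvirt storage pool while something still has files open under its
// mountpoint, so we never leave a mount behind that libvirt no longer
// knows about.
import org.libvirt.Connect;
import org.libvirt.StoragePool;

public class SafePoolDelete {

    // fuser -m exits 0 when at least one process is using the mount.
    static boolean mountpointInUse(String mountPoint) throws Exception {
        Process p = new ProcessBuilder("fuser", "-m", mountPoint)
                .redirectErrorStream(true)
                .start();
        return p.waitFor() == 0;
    }

    static void deletePool(Connect conn, String uuid, String mountPoint)
            throws Exception {
        if (mountpointInUse(mountPoint)) {
            // Bail out and keep the pool defined in libvirt.
            throw new IllegalStateException(mountPoint
                    + " is still in use, refusing to delete pool " + uuid);
        }
        StoragePool pool = conn.storagePoolLookupByName(uuid);
        pool.destroy();   // stops the pool, i.e. unmounts the NFS share
        pool.undefine();  // removes the pool definition
        pool.free();
    }
}

If the check is ever wrong or racy, the worst case is that the pool
stays defined, which seems less harmful than an in-use mount that
libvirt has forgotten about.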

I am assuming that when you say 'in use' you mean the ISO is attached
to a VM. However, the mountpoint could be busy for any number of
reasons... say an admin happens to be looking in the directory right
when CloudStack wants to delete the storage pool from libvirt.
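
By the way, if you want to spot those "orphaned" mounts (mounted and
in use, but unknown to libvirt, like the list you pasted below)
without comparing lsof and virsh output by hand, something along
these lines could do it. Again just a sketch; the /mnt/<uuid> mount
layout is my assumption, not taken from the agent code:

// Rough sketch: list NFS mounts under the agent's mount directory and
// flag the ones libvirt has no active pool for. The /mnt/<uuid>
// layout is an assumption.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

import org.libvirt.Connect;

public class OrphanedMountReport {
    public static void main(String[] args) throws Exception {
        Connect conn = new Connect("qemu:///system");
        Set<String> activePools =
                new HashSet<>(Arrays.asList(conn.listStoragePools()));

        try (BufferedReader r =
                new BufferedReader(new FileReader("/proc/mounts"))) {
            String line;
            while ((line = r.readLine()) != null) {
                // /proc/mounts fields: device, mountpoint, fstype, options, ...
                String[] f = line.split("\\s+");
                if (f.length > 2 && f[2].startsWith("nfs")
                        && f[1].startsWith("/mnt/")) {
                    String uuid = f[1].substring("/mnt/".length());
                    if (!activePools.contains(uuid)) {
                        System.out.println("mounted but unknown to libvirt: " + f[1]);
                    }
                }
            }
        }
        conn.close();
    }
}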

On Fri, Jun 7, 2013 at 8:30 AM, Marcus Sorensen <shadow...@gmail.com> wrote:
> Does this only happen with isos?
>
> On Jun 7, 2013 8:15 AM, "Wido den Hollander" <w...@widodh.nl> wrote:
>>
>> Hi,
>>
>> So, I just created CLOUDSTACK-2893, but Wei Zhou mentioned that there are
>> some related issues:
>> * CLOUDSTACK-2729
>> * CLOUDSTACK-2780
>>
>> I restarted my Agent and the issue described in 2893 went away, but I'm
>> wondering how that happened.
>>
>> Anyway, going further I found that I have some "orphaned" storage
>> pools; by that I mean they are mounted and in use, but neither
>> defined nor active in libvirt:
>>
>> root@n02:~# lsof |grep "\.iso"|awk '{print $9}'|cut -d '/' -f 3|sort
>> -n|uniq
>> eb3cd8fd-a462-35b9-882a-f4b9f2f4a84c
>> f84e51ab-d203-3114-b581-247b81b7d2c1
>> fd968b03-bd11-3179-a2b3-73def7c66c68
>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d
>> 7dc0149e-0281-3353-91eb-4589ef2b1ec1
>> 8e005344-6a65-3802-ab36-31befc95abf3
>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593
>> 765e63d7-e9f9-3203-bf4f-e55f83fe9177
>> 1287a27d-0383-3f5a-84aa-61211621d451
>> 98622150-41b2-3ba3-9c9c-09e3b6a2da03
>> root@n02:~#
>>
>> Looking at libvirt:
>> root@n02:~# virsh pool-list
>> Name                 State      Autostart
>> -----------------------------------------
>> 52801816-fe44-3a2b-a147-bb768eeea295 active     no
>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d active     no
>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593 active     no
>> a83d1100-4ffa-432a-8467-4dc266c4b0c8 active     no
>> fd968b03-bd11-3179-a2b3-73def7c66c68 active     no
>>
>> root@n02:~#
>>
>> What happens here is that the mountpoints are in use (ISO attached to
>> an Instance) but there is no storage pool in libvirt.
>>
>> This means that when you try to deploy a second VM with the same ISO,
>> libvirt will error out: the Agent tries to create and start a new
>> storage pool, which fails because the mountpoint is already in use.
>>
>> The remedy would be to put the hypervisor into maintenance, reboot it
>> completely and then migrate Instances back to it.
>>
>> In libvirt there is no way to start an NFS storage pool without
>> libvirt mounting it.
>>
>> Any suggestions on how we can work around this code-wise?
>>
>> For my issue I'm writing a patch which adds some more debug lines to show
>> what the Agent is doing, but it's kind of weird that we got into this
>> "disconnected" state.
>>
>> Wido
