Wido, could you tell me the libvirt version? For our platform with this issue, the libvirt version is 0.9.13.
-Wei

2013/6/7 Marcus Sorensen <shadow...@gmail.com>
> There is already quite a bit of logging around this stuff, for example:
>
>     s_logger.error("deleteStoragePool removed pool from libvirt, but libvirt had trouble"
>             + "unmounting the pool. Trying umount location " + targetPath
>             + "again in a few seconds");
>
> And if it gets an error from libvirt during create stating that the
> mountpoint is in use, the agent attempts to unmount before remounting. Of
> course this would fail if it is in use.
>
>     // if error is that pool is mounted, try to handle it
>     if (e.toString().contains("already mounted")) {
>         s_logger.error("Attempting to unmount old mount libvirt is unaware of at " + targetPath);
>         String result = Script.runSimpleBashScript("umount " + targetPath);
>         if (result == null) {
>             s_logger.error("Succeeded in unmounting " + targetPath);
>             try {
>                 sp = conn.storagePoolCreateXML(spd.toString(), 0);
>                 s_logger.error("Succeeded in redefining storage");
>                 return sp;
>             } catch (LibvirtException l) {
>                 s_logger.error("Target was already mounted, unmounted it but failed to redefine storage:" + l);
>             }
>         } else {
>             s_logger.error("Failed in unmounting and redefining storage");
>         }
>     }
>
> Do you think it was related to the upgrade process itself (e.g. maybe
> the storage pools didn't carry across the libvirt upgrade)? Can you
> duplicate outside of the upgrade?
>
> On Fri, Jun 7, 2013 at 8:43 AM, Wido den Hollander <w...@widodh.nl> wrote:
> > Hi,
> >
> > On 06/07/2013 04:30 PM, Marcus Sorensen wrote:
> >>
> >> Does this only happen with isos?
> >
> > Yes, it does.
> >
> > My work-around for now was to locate all the Instances who had these ISOs
> > attached and detach them from all (~100 instances..)
> >
> > Then I manually unmounted all the mountpoints under /mnt so that they can be
> > re-used again.
> >
> > This cluster was upgraded to 4.1 from 4.0 with libvirt 1.0.2 (coming from
> > 0.9.8).
> >
> > Somehow libvirt forgot about these storage pools.
> >
> > Wido
> >
> >> On Jun 7, 2013 8:15 AM, "Wido den Hollander" <w...@widodh.nl> wrote:
> >>
> >>> Hi,
> >>>
> >>> So, I just created CLOUDSTACK-2893, but Wei Zhou mentioned that there are
> >>> some related issues:
> >>> * CLOUDSTACK-2729
> >>> * CLOUDSTACK-2780
> >>>
> >>> I restarted my Agent and the issue described in 2893 went away, but I'm
> >>> wondering how that happened.
> >>>
> >>> Anyway, after going further I found that I have some "orphaned" storage
> >>> pools; by that I mean they are mounted and in use, but not defined nor
> >>> active in libvirt:
> >>>
> >>> root@n02:~# lsof | grep "\.iso" | awk '{print $9}' | cut -d '/' -f 3 | sort -n | uniq
> >>> eb3cd8fd-a462-35b9-882a-f4b9f2f4a84c
> >>> f84e51ab-d203-3114-b581-247b81b7d2c1
> >>> fd968b03-bd11-3179-a2b3-73def7c66c68
> >>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d
> >>> 7dc0149e-0281-3353-91eb-4589ef2b1ec1
> >>> 8e005344-6a65-3802-ab36-31befc95abf3
> >>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593
> >>> 765e63d7-e9f9-3203-bf4f-e55f83fe9177
> >>> 1287a27d-0383-3f5a-84aa-61211621d451
> >>> 98622150-41b2-3ba3-9c9c-09e3b6a2da03
> >>> root@n02:~#
> >>>
> >>> Looking at libvirt:
> >>>
> >>> root@n02:~# virsh pool-list
> >>> Name                                   State    Autostart
> >>> -----------------------------------------------------------
> >>> 52801816-fe44-3a2b-a147-bb768eeea295   active   no
> >>> 7ceb73e5-5ab1-3862-ad6e-52cb986aff0d   active   no
> >>> 88ddd8f5-e6c7-3f3d-bef2-eea8f33aa593   active   no
> >>> a83d1100-4ffa-432a-8467-4dc266c4b0c8   active   no
> >>> fd968b03-bd11-3179-a2b3-73def7c66c68   active   no
> >>> root@n02:~#
> >>>
> >>> What happens here is that the mountpoints are in use (ISO attached to an
> >>> Instance) but there is no storage pool in libvirt.
> >>>
> >>> This means that when you try to deploy a second VM with the same ISO,
> >>> libvirt will error out since the Agent will try to create and start a new
> >>> storage pool, which will fail since the mountpoint is already in use.
> >>>
> >>> The remedy would be to take the hypervisor into maintenance, reboot it
> >>> completely and migrate Instances to it again.
> >>>
> >>> In libvirt there is no way to start an NFS storage pool without libvirt
> >>> mounting it.
> >>>
> >>> Any suggestions on how we can work around this code-wise?
> >>>
> >>> For my issue I'm writing a patch which adds some more debug lines to show
> >>> what the Agent is doing, but it's kind of weird that we got into this
> >>> "disconnected" state.
> >>>
> >>> Wido
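A quick way to spot the "disconnected" state described above is to compare what is mounted under /mnt with what libvirt actually knows about, i.e. the same check Wido did by hand with lsof and virsh pool-list. The sketch below does that with the libvirt-java bindings; it is only a rough illustration, and it assumes (based on the output above, not on any agent code) that every CloudStack NFS pool is mounted at /mnt/<pool-uuid>.

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    import org.libvirt.Connect;
    import org.libvirt.LibvirtException;

    // Rough sketch: list NFS mounts under /mnt whose directory name (the pool
    // UUID in the layout shown above) is unknown to libvirt -- the "orphaned"
    // pools from the thread.
    public class OrphanedPoolCheck {
        public static void main(String[] args) throws LibvirtException, IOException {
            Connect conn = new Connect("qemu:///system");

            // Everything libvirt knows about, whether active or merely defined.
            Set<String> knownPools = new HashSet<String>();
            knownPools.addAll(Arrays.asList(conn.listStoragePools()));
            knownPools.addAll(Arrays.asList(conn.listDefinedStoragePools()));

            // Walk /proc/mounts and flag NFS mounts under /mnt with no matching pool.
            for (String line : Files.readAllLines(Paths.get("/proc/mounts"), StandardCharsets.UTF_8)) {
                String[] fields = line.split("\\s+");
                if (fields.length < 3 || !fields[2].startsWith("nfs")) {
                    continue;
                }
                Path mountPoint = Paths.get(fields[1]);
                if (!mountPoint.startsWith("/mnt")) {
                    continue;
                }
                String poolUuid = mountPoint.getFileName().toString();
                if (!knownPools.contains(poolUuid)) {
                    System.out.println("Orphaned mount: " + mountPoint
                            + " (source " + fields[0] + ") has no libvirt storage pool");
                }
            }
            conn.close();
        }
    }

Run on a host in this state, that should print the mountpoints that show up in the lsof output above but not in virsh pool-list.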
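Wido also mentions writing a patch that adds more debug lines to the Agent. Purely as an illustration of the kind of diagnostic that could run before the "umount " + targetPath attempt quoted earlier, the sketch below reads /proc/mounts and reports what is currently sitting on the target path and whether it matches the pool's NFS source. The class and method names are hypothetical, as is the example source in main(); the real agent would log through s_logger and take the expected source from the pool definition (spd).

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Paths;

    // Hypothetical helper, not agent code: describe the mount (if any) that is
    // occupying targetPath, so the log shows why a plain umount would fail.
    public class MountDiagnostics {

        public static String describeExistingMount(String targetPath, String expectedSource)
                throws IOException {
            for (String line : Files.readAllLines(Paths.get("/proc/mounts"), StandardCharsets.UTF_8)) {
                String[] fields = line.split("\\s+");
                if (fields.length >= 2 && fields[1].equals(targetPath)) {
                    boolean sameSource = fields[0].equals(expectedSource);
                    return targetPath + " is already mounted from " + fields[0]
                            + (sameSource
                                ? " (same source as the pool, so libvirt has simply lost track of it)"
                                : " (a DIFFERENT source than the expected " + expectedSource + ")");
                }
            }
            return targetPath + " is not mounted at all, so the 'already mounted' error came from somewhere else";
        }

        // Example invocation with a made-up NFS source; the UUID is one of the
        // orphaned pools from the lsof output above.
        public static void main(String[] args) throws IOException {
            System.out.println(describeExistingMount(
                    "/mnt/fd968b03-bd11-3179-a2b3-73def7c66c68",
                    "nfs-server:/export/secondary"));
        }
    }

Logging something like this right before the umount attempt would at least make the "orphaned but busy" case visible in the agent log, rather than only the generic "Failed in unmounting and redefining storage" message.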