Hi,
My lab ACS management server (version 4.11.2.0) has recently started dying a
few hours after each restart, with the following error messages in the log:
2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null)
Processing updateKeyPairs
2019-04-24 10:38:35,237 INFO [c.c.s.ConfigurationServerImpl] (main:null)
Keypairs already in database, updating local copy
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null) Going
to update systemvm iso with generated keypairs if needed
2019-04-24 10:38:35,241 INFO [c.c.s.ConfigurationServerImpl] (main:null)
Trying to inject public and private keys into systemvm iso
2019-04-24 10:38:35,288 INFO [c.c.s.ConfigurationServerImpl] (main:null)
Injected public and private keys into systemvm iso with result : mount: could
not find any free loop device
2019-04-24 10:38:35,288 WARN [c.c.s.ConfigurationServerImpl] (main:null)
Failed to inject generated public key into systemvm iso mount: could not find
any free loop device
2019-04-24 10:38:35,290 WARN [o.a.c.s.m.c.ResourceApplicationContext]
(main:null) Exception encountered during context initialization - cancelling
refresh attempt: org.springframework.context.ApplicationContextException:
Failed to start bean 'cloudStackLifeCycle'; nested exception is
com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated
public key into systemvm iso mount: could not find any free loop device
2019-04-24 10:38:35,291 WARN [o.e.j.w.WebAppContext] (main:null) Failed
startup of context
o.e.j.w.WebAppContext@78a2da20{/client,file:///usr/share/cloudstack-management/webapp/,UNAVAILABLE}{/usr/share/cloudstack-management/webapp}
org.springframework.context.ApplicationContextException: Failed to start bean
'cloudStackLifeCycle'; nested exception is
com.cloud.utils.exception.CloudRuntimeException: Failed to inject generated
public key into systemvm iso mount: could not find any free loop device
And sure enough, all /dev/loopX are in use:
# losetup -a
/dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop1: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop2: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop3: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop4: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop5: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop6: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
/dev/loop7: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
#
Recent changes in the lab include adding a VMware cluster, registering new
systemvm templates for VMware, and creating our own template for VMware.
It looks like the updateKeyPairs process runs once an hour, and each run fails
to release the loop device it used to mount systemvm.iso. With only eight loop
devices (loop0 through loop7) available, the management server runs out of
free loop devices after about 8 hours and dies.
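To confirm the once-an-hour pattern, the `losetup -a` output could be sampled
periodically and counted per backing file. Below is a minimal Python sketch of
that idea, not anything from CloudStack itself; the function name
`count_loop_devices` and the regular expression are my own assumptions, written
against the output format shown above.

```python
import re
from collections import Counter

# Matches lines such as:
#   /dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)
# The pattern is an assumption based on the losetup output shown above.
LOOP_LINE = re.compile(r"^/dev/loop\d+:\s+\[[^\]]*\]:\d+\s+\(([^)]*)\)")


def count_loop_devices(losetup_output):
    """Return a Counter mapping each backing file to its attached loop devices."""
    counts = Counter()
    for line in losetup_output.splitlines():
        match = LOOP_LINE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


# Example with two lines copied from the output above.
sample = (
    "/dev/loop0: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)\n"
    "/dev/loop1: [ca06]:1315130 (/usr/share/cloudstack-common/vms/systemvm.iso)\n"
)
for backing_file, n in count_loop_devices(sample).items():
    print(backing_file, n)
```

Feeding it the real `losetup -a` output from cron every few minutes, together
with a timestamp, should show whether the count for systemvm.iso goes up by one
each time "Processing updateKeyPairs" appears in the management server log.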
Any suggestions on how I can troubleshoot this further?
Thanks
Yiping