Kishan Kavala created CLOUDSTACK-4175: -----------------------------------------
Summary: [KVM] [Upgrade from 2.2.14] Failed to delete and cleanup ROOT / DATA volumes from primary storage that were created prior to upgrade and marked for cleanup Key: CLOUDSTACK-4175 URL: https://issues.apache.org/jira/browse/CLOUDSTACK-4175 Project: CloudStack Issue Type: Bug Security Level: Public (Anyone can view this level - this is the default.) Reporter: Kishan Kavala Steps to reproduce : 1. Have 2.2.14 setup with KVM host and at least few vms deployed with ROOT and DATA disks 2. Upgrade to 4.2 3. make sure storage.cleanup.interval is set to 300 4. Detach and Delete DATA / ROOT disks that were created before upgrade Observations : (i) It successfully deleted without any issues (ii) When storage GC is kicked in, it failed to cleanup those volumes because these entities are still using path like "/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/4afdadd3-1ef0-4a7a-9caa-d61e63dfe355" but, the new volumes created are just using path like "4afdadd3-1ef0-4a7a-9caa-d61e63dfe355" Here is the snippet for DATA-7 volume which was created before upgrade on KVM Before delete: ========== mysql> select * from volumes where name like '%data-7%'\G *************************** 1. row *************************** id: 8 account_id: 2 domain_id: 1 pool_id: 200 instance_id: 7 device_id: 1 name: data-7 size: 5368709120 folder: /export/home/talluri/byronp path: /mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/4afdadd3-1ef0-4a7a-9caa-d61e63dfe355 pod_id: 1 data_center_id: 1 iscsi_name: NULL host_ip: NULL volume_type: DATADISK pool_type: NetworkFilesystem disk_offering_id: 3 template_id: NULL iso_id: NULL first_snapshot_backup_uuid: NULL recreatable: 0 created: 2012-12-21 06:26:10 attached: 2012-12-21 06:26:26 updated: 2012-12-21 06:26:26 removed: NULL state: Ready chain_info: NULL uuid: 8 last_pool_id: 200 update_count: 0 1 row in set (0.00 sec) After delete : ========== mysql> select * from volumes where name like '%data-7%'\G *************************** 1. row *************************** id: 8 account_id: 2 domain_id: 1 pool_id: 200 instance_id: NULL device_id: NULL name: data-7 size: 5368709120 folder: /export/home/talluri/byronp path: /mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/4afdadd3-1ef0-4a7a-9caa-d61e63dfe355 pod_id: 1 data_center_id: 1 iscsi_name: NULL host_ip: NULL volume_type: DATADISK pool_type: NetworkFilesystem disk_offering_id: 3 template_id: NULL iso_id: NULL first_snapshot_backup_uuid: NULL recreatable: 0 created: 2012-12-21 06:26:10 attached: NULL updated: 2012-12-21 15:44:48 removed: NULL state: Destroy chain_info: NULL uuid: 8 last_pool_id: 200 update_count: 2 1 row in set (0.00 sec) We can see that this volume still exists on primary storage ========================================== [root@Rhel62-Sanjeev byronp]# ls | grep -i 4afdadd3-1ef0-4a7a-9caa-d61e63dfe355 4afdadd3-1ef0-4a7a-9caa-d61e63dfe355 Here is the snippet from GC log ======================= 2012-12-21 21:14:39,651 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-83:null) Seq 6-847643703: Response Received: 2012-12-21 21:14:39,652 DEBUG [agent.transport.Request] (StatsCollector-3:null) Seq 6-847643703: Received: { Ans: , MgmtId: 6861512245319, via: 6, Ver: v1, Flags: 10, { GetHostStatsAnswer } } 2012-12-21 21:14:39,685 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-313:null) Seq 7-1359414333: Executing request 2012-12-21 21:14:39,820 DEBUG [agent.manager.DirectAgentAttache] (DirectAgent-313:null) Seq 7-1359414333: Response Received: 2012-12-21 21:14:39,821 DEBUG [agent.transport.Request] (StatsCollector-3:null) Seq 7-1359414333: Received: { Ans: , MgmtId: 6861512245319, via: 7, Ver: v1, Flags: 10, { GetHostStatsAnswer } } 2012-12-21 21:14:48,100 DEBUG [cloud.storage.StorageManagerImpl] (catalina-exec-21:null) Expunging Vol[8|vm=null|DATADISK] 2012-12-21 21:14:48,106 DEBUG [agent.transport.Request] (catalina-exec-21:null) Seq 1-2138047787: Sending { Cmd , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 100111, [{"storage.DestroyCommand":{"volume":{"id":8,"name":"data-7","mountPoint":"/export/home/talluri/byronp","path":"/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/4afdadd3-1ef0-4a7a-9caa-d61e63dfe355","size":5368709120,"type":"DATADISK","storagePoolType":"NetworkFilesystem","storagePoolUuid":"a47b2aa3-ca42-3bb1-926d-06e8eacfedde","deviceId":0},"wait":0}}] } 2012-12-21 21:14:48,264 DEBUG [agent.transport.Request] (AgentManager-Handler-9:null) Seq 1-2138047787: Processing: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, [{"Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Storage volume not found: no storage vol with matching name '/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/4afdadd3-1ef0-4a7a-9caa-d61e63dfe355'","wait":0}}] } 2012-12-21 21:14:48,264 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-9:null) Seq 1-2138047787: No more commands found 2012-12-21 21:14:48,264 DEBUG [agent.transport.Request] (catalina-exec-21:null) Seq 1-2138047787: Received: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, { Answer } } 2012-12-21 21:14:48,264 INFO [cloud.storage.StorageManagerImpl] (catalina-exec-21:null) Will retry delete of Vol[8|vm=null|DATADISK] from 200 2012-12-21 21:14:55,706 DEBUG [storage.secondary.SecondaryStorageManagerImpl] (secstorage-1:null) Zone 1 is ready to launch secondary storage VM 2012-12-21 21:14:56,030 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone 1 is ready to launch console proxy 2012-12-21 21:14:57,113 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 routers to update status. 2012-12-21 21:14:57,114 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 networks to update RvR status. 2012-12-21 21:14:59,447 DEBUG [cloud.server.StatsCollector] (StatsCollector-1:null) StorageCollector is running... 2012-12-21 21:14:59,499 DEBUG [agent.transport.Request] (StatsCollector-1:null) Seq 8-1817051998: Received: { Ans: , MgmtId: 6861512245319, via: 8, Ver: v1, Flags: 10, { GetStorageStatsAnswer } } 2012-12-21 21:14:59,519 DEBUG [agent.transport.Request] (StatsCollector-1:null) Seq 1-2138047788: Received: { Ans: , As part of upgrade, we used sysvmadm script to destroy SSVM and reboot router which resulted in existing ROOT disk destroy in order to use the new systemvm template. GC throwing errors for those as well. ======================================================================================================================================================= 2012-12-21 20:22:26,030 DEBUG [cloud.consoleproxy.ConsoleProxyManagerImpl] (consoleproxy-1:null) Zone 1 is ready to launch console proxy 2012-12-21 20:22:27,113 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 routers to update status. 2012-12-21 20:22:27,114 DEBUG [network.router.VirtualNetworkApplianceManagerImpl] (RouterStatusMonitor-1:null) Found 0 networks to update RvR status. 2012-12-21 20:22:53,127 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Storage pool garbage collector found 0 templates to clean up in storage pool: p 2012-12-21 20:22:53,131 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Storage pool garbage collector found 0 templates to clean up in storage pool: pri_esx_2214 2012-12-21 20:22:53,135 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Storage pool garbage collector found 0 templates to clean up in storage pool: p-xen 2012-12-21 20:22:53,138 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Secondary storage garbage collector found 0 templates to cleanup on secondary storage host: nfs://10.147.28.7/export/home/talluri/byrons 2012-12-21 20:22:53,138 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Orphan snapshot is disabled 2012-12-21 20:22:53,151 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Secondary storage garbage collector found 0 templates to cleanup on secondary storage host: nfs://10.147.28.7/export/home/talluri/byrons 2012-12-21 20:22:53,153 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Expunging Vol[2|vm=2|ROOT] 2012-12-21 20:22:53,160 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047604: Sending { Cmd , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 100111, [{"storage.DestroyCommand":{"vmName":"v-2-VM","volume":{"id":2,"name":"ROOT-2","mountPoint":"/export/home/talluri/byronp","path":"/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/e129fc25-8d5d-41b3-9882-120fb5e79955","size":2097152000,"type":"ROOT","storagePoolType":"NetworkFilesystem","storagePoolUuid":"a47b2aa3-ca42-3bb1-926d-06e8eacfedde","deviceId":0},"wait":0}}] } 2012-12-21 20:22:53,316 DEBUG [agent.transport.Request] (AgentManager-Handler-10:null) Seq 1-2138047604: Processing: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, [{"Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Storage volume not found: no storage vol with matching name '/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/e129fc25-8d5d-41b3-9882-120fb5e79955'","wait":0}}] } 2012-12-21 20:22:53,317 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-10:null) Seq 1-2138047604: No more commands found 2012-12-21 20:22:53,317 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047604: Received: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, { Answer } } 2012-12-21 20:22:53,317 INFO [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Will retry delete of Vol[2|vm=2|ROOT] from 200 2012-12-21 20:22:53,332 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Expunging Vol[3|vm=3|ROOT] 2012-12-21 20:22:53,339 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047605: Sending { Cmd , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 100111, [{"storage.DestroyCommand":{"vmName":"s-3-VM","volume":{"id":3,"name":"ROOT-3","mountPoint":"/export/home/talluri/byronp","path":"/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/08e3972c-b223-44cc-833e-f21b3b280ef2","size":2097152000,"type":"ROOT","storagePoolType":"NetworkFilesystem","storagePoolUuid":"a47b2aa3-ca42-3bb1-926d-06e8eacfedde","deviceId":0},"wait":0}}] } 2012-12-21 20:22:53,492 DEBUG [agent.transport.Request] (AgentManager-Handler-9:null) Seq 1-2138047605: Processing: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, [{"Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Storage volume not found: no storage vol with matching name '/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/08e3972c-b223-44cc-833e-f21b3b280ef2'","wait":0}}] } 2012-12-21 20:22:53,492 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-9:null) Seq 1-2138047605: No more commands found 2012-12-21 20:22:53,492 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047605: Received: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, { Answer } } 2012-12-21 20:22:53,492 INFO [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Will retry delete of Vol[3|vm=3|ROOT] from 200 2012-12-21 20:22:53,492 DEBUG [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Expunging Vol[5|vm=5|ROOT] 2012-12-21 20:22:53,499 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047606: Sending { Cmd , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 100111, [{"storage.DestroyCommand":{"vmName":"r-5-VM","volume":{"id":5,"name":"ROOT-5","mountPoint":"/export/home/talluri/byronp","path":"/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/744e36f3-efcf-4177-bb29-bf89bccbcc22","size":2097152000,"type":"ROOT","storagePoolType":"NetworkFilesystem","storagePoolUuid":"a47b2aa3-ca42-3bb1-926d-06e8eacfedde","deviceId":0},"wait":0}}] } 2012-12-21 20:22:53,646 DEBUG [agent.transport.Request] (AgentManager-Handler-15:null) Seq 1-2138047606: Processing: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, [{"Answer":{"result":false,"details":"com.cloud.utils.exception.CloudRuntimeException: org.libvirt.LibvirtException: Storage volume not found: no storage vol with matching name '/mnt/a47b2aa3-ca42-3bb1-926d-06e8eacfedde/744e36f3-efcf-4177-bb29-bf89bccbcc22'","wait":0}}] } 2012-12-21 20:22:53,646 DEBUG [agent.manager.AgentAttache] (AgentManager-Handler-15:null) Seq 1-2138047606: No more commands found 2012-12-21 20:22:53,646 DEBUG [agent.transport.Request] (StorageManager-Scavenger-1:null) Seq 1-2138047606: Received: { Ans: , MgmtId: 6861512245319, via: 1, Ver: v1, Flags: 110, { Answer } } 2012-12-21 20:22:53,646 INFO [cloud.storage.StorageManagerImpl] (StorageManager-Scavenger-1:null) Will retry delete of Vol[5|vm=5|ROOT] from 200 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira