Hi!

I am facing some issues with the CloudStack CSI driver (Leaseweb fork). In 
general it works well, but when, for example, draining a Kubernetes node, 
which triggers a lot of detach and attach operations, something occasionally 
goes wrong. I end up in an inconsistent state and can no longer attach devices 
to the affected instance.

Scenario…


  *   Instance a has a few block volumes, requested by the CSI driver. vda, 
vdb, vdc, vdd and vde show up in the libvirt XML
  *   vdd gets detached from instance a
  *   Instance a now has vda, vdb, vdc and vde in its libvirt XML
  *   The CSI driver requests a new block volume for instance a and tries to 
attach it as vde, instead of using vdd, which has become free in the meantime
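
To illustrate the collision I suspect, here is a minimal Python sketch. It is 
not taken from CloudStack or the CSI driver; both functions are hypothetical 
and only contrast a naming scheme that derives the next device from the number 
of attached disks with one that reuses the first free name.

```python
import string

def next_by_count(attached):
    """Hypothetical naive scheme: pick the letter at index len(attached)."""
    return "vd" + string.ascii_lowercase[len(attached)]

def next_first_free(attached):
    """Hypothetical gap-aware scheme: pick the first name not in use."""
    for letter in string.ascii_lowercase:
        name = "vd" + letter
        if name not in attached:
            return name
    raise RuntimeError("no free single-letter virtio device name")

# State of instance a after vdd was detached.
attached = ["vda", "vdb", "vdc", "vde"]

print(next_by_count(attached))    # vde, collides with the existing vde
print(next_first_free(attached))  # vdd, reuses the freed slot
```

With four disks attached, the count-based scheme lands on vde again, which 
would match the "target 'vde' duplicated" error, while the gap-aware scheme 
picks vdd.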

From that point on, no more devices can be attached to the instance. The 
management server logs this:

2025-10-01 11:00:52,702 ERROR [c.c.a.ApiAsyncJobDispatcher] 
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Unexpected 
exception while executing 
org.apache.cloudstack.api.command.user.volume.AttachVolumeCmd 
com.cloud.utils.exception.CloudRuntimeException: Failed to attach volume 
pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest-1; org.libvirt.LibvirtException: 
XML error: target 'vde' duplicated for disk sources 
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx and 
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx

2025-10-01 11:00:52,702 DEBUG [o.a.c.f.j.i.AsyncJobManagerImpl] 
(API-Job-Executor-85:[ctx-ee10aa59, job-629270]) (logid:5018a3b3) Complete 
async job-629270, jobStatus: FAILED, resultCode: 530, result: 
org.apache.cloudstack.api.response.ExceptionResponse/null/{"uuidList":[],"errorcode":"530","errortext":"Failed
 to attach volume pvc-xxxx-b45f-4324-a85b-xxxx to VM kubetest -1; 
org.libvirt.LibvirtException: XML error: target 'vde' duplicated for disk 
sources /mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx ' and 
'/mnt/xxxx-387c-3f14-aea7-0d19104d92dd/xxxx-c659-4699-8885-xxxx "}

If ACS tried to attach the new volume as vdd (which became free), I guess 
things would work. After a shutdown/reboot of the affected VM, everything 
starts working again and new block devices can be attached.
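
For anyone who wants to check an instance for such gaps, below is a small 
Python sketch that lists the disk targets in a libvirt domain XML and reports 
unused names below the highest one. SAMPLE_XML is a made-up, trimmed example 
that mirrors the scenario above (real output of "virsh dumpxml" contains much 
more), and the helper names are my own.

```python
import xml.etree.ElementTree as ET

# Made-up, trimmed domain XML mirroring the scenario (vdd detached).
SAMPLE_XML = """
<domain type='kvm'>
  <devices>
    <disk type='file' device='disk'><target dev='vda' bus='virtio'/></disk>
    <disk type='file' device='disk'><target dev='vdb' bus='virtio'/></disk>
    <disk type='file' device='disk'><target dev='vdc' bus='virtio'/></disk>
    <disk type='file' device='disk'><target dev='vde' bus='virtio'/></disk>
  </devices>
</domain>
"""

def disk_targets(domain_xml):
    """Return the sorted 'dev' attribute of every <disk><target> element."""
    root = ET.fromstring(domain_xml.strip())
    return sorted(t.get("dev") for t in root.findall("./devices/disk/target"))

def free_gaps(targets, prefix="vd"):
    """Return unused single-letter names below the highest one in use."""
    letters = sorted(
        t[len(prefix):] for t in targets
        if t.startswith(prefix) and len(t) == len(prefix) + 1
    )
    if not letters:
        return []
    return [
        prefix + chr(c)
        for c in range(ord("a"), ord(letters[-1]))
        if chr(c) not in letters
    ]

targets = disk_targets(SAMPLE_XML)
print(targets)             # ['vda', 'vdb', 'vdc', 'vde']
print(free_gaps(targets))  # ['vdd']
```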

We are currently on ACS 4.20.1.0 on Ubuntu 24.04.

Cheers,

Juergen
