GitHub user Lauta11 created a discussion: Timeout snapshot

### problem

Snapshots of large volumes (roughly 200 GB and above) fail to complete. 
The snapshot command hits a timeout of 3600 seconds, which is not enough time 
to copy the entire disk.
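
As a rough illustration of why one hour may not be enough, the sketch below computes the average copy rate needed to finish inside the timeout. The byte count is the `size` value reported for the volume in the host log further down, and the timeout is the 3600 seconds from the management log. It assumes that `qemu-img convert` has to read the whole allocated size and that throughput is constant, neither of which is confirmed here. The function names are made up for this example.

```python
# Values copied from the logs in this report.
ALLOCATED_BYTES = 122_363_461_632  # "size" of the qcow2 volume in the host log
TIMEOUT_SECONDS = 3600             # "timed out after 3600" in the management log

MIB = 1024 * 1024


def min_throughput_mib_s(size_bytes: int, timeout_s: float) -> float:
    """Average rate in MiB/s needed to copy size_bytes within timeout_s."""
    if timeout_s <= 0:
        raise ValueError("timeout_s must be positive")
    return size_bytes / timeout_s / MIB


def min_timeout_s(size_bytes: int, throughput_mib_s: float) -> float:
    """Seconds needed to copy size_bytes at a constant throughput_mib_s."""
    if throughput_mib_s <= 0:
        raise ValueError("throughput_mib_s must be positive")
    return size_bytes / (throughput_mib_s * MIB)


if __name__ == "__main__":
    rate = min_throughput_mib_s(ALLOCATED_BYTES, TIMEOUT_SECONDS)
    print(f"Required average rate: {rate:.1f} MiB/s")
```

For the volume in the logs this works out to roughly 32 MiB/s sustained for the full hour, so any NFS primary storage that averages less than that during the conversion would hit the timeout.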

We tried changing several timeout-related settings in CloudStack 
(including those for snapshots and asynchronous jobs), but none of them 
appears to extend or otherwise affect this limit.

It is unclear whether this is a bug, or whether a setting for raising the 
timeout of snapshot operations is simply missing.

The corresponding logs are attached for further analysis.
```
Management:

2025-10-21 02:01:57,863 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] can be executed.
2025-10-21 02:01:59,126 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduling snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] at [2025-10-21 05:00:00 GMT].
2025-10-21 02:01:59,179 DEBUG [c.c.s.s.SnapshotSchedulerImpl] (SnapshotPollTask:ctx-93846872) (logid:f246f83a) Scheduled snapshot [455b227d-797f-45e6-aa03-1dc7609ac030] for volume [{"name":"ROOT-392","uuid":"8508204f-24ba-454a-8bd1-f56b274088ae"}] as job [d0e089bb-d2cb-4e81-8fb4-5e6a23eee556].
2025-10-21 02:02:01,607 DEBUG [o.a.c.s.s.StorPoolSnapshotStrategy] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) StorpoolSnapshotStrategy.canHandle: snapshot=server18063_ROOT-392_20251021050158, uuid=70decb5c-6f4d-405a-b7e1-f2eb2ace4de0, op=TAKE



2025-10-21 03:02:01,993 DEBUG [o.a.c.s.s.SnapshotServiceImpl] (Work-Job-Executor-28:ctx-a246b76c job-33232/job-33233 ctx-6f646c29) (logid:d0e089bb) create snapshot server18063_ROOT-392_20251021050158 failed: com.cloud.utils.exception.CloudRuntimeException: Failed to send command, due to Agent:59, com.cloud.exception.OperationTimedoutException: Commands 545217029888601412 to Host 59 timed out after 3600


Host:


2025-10-21 03:06:26,176 WARN  [utils.script.Script] (Script-10:null) (logid:) Interrupting script.
2025-10-21 03:06:26,176 WARN  [utils.script.Script] (agentRequest-Handler-2:null) (logid:d0e089bb) Process [3934352] for command [qemu-img convert -O qcow2 -U --image-opts driver=qcow2,file.filename=/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/11806790-0ad0-4a4a-b389-2f7ab41b4e87 /mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687 ] timed out. Output is [].
2025-10-21 03:06:43,053 ERROR [kvm.storage.KVMStorageProcessor] (agentRequest-Handler-2:null) (logid:d0e089bb) Failed take snapshot for volume [volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]]], in VM [i-52-392-VM], due to [Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].].
com.cloud.utils.exception.CloudRuntimeException: Failed to convert volumeTO[uuid=8508204f-24ba-454a-8bd1-f56b274088ae|path=11806790-0ad0-4a4a-b389-2f7ab41b4e87|datastore=PrimaryDataStoreTO[uuid=4c524bab-25a3-3db5-a665-d37159b81f11|name=Pool-01|id=18|pooltype=NetworkFilesystem]] snapshot of volume [KVMPhysicalDisk {"dispName":null,"format":"qcow2","name":"11806790-0ad0-4a4a-b389-2f7ab41b4e87","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11\/11806790-0ad0-4a4a-b389-2f7ab41b4e87","pool":{"uuid":"4c524bab-25a3-3db5-a665-d37159b81f11","path":"\/mnt\/4c524bab-25a3-3db5-a665-d37159b81f11"},"size":122363461632,"virtualSize":161061273600,"vmName":null}] to [/mnt/4c524bab-25a3-3db5-a665-d37159b81f11/snapshots/3ae09c67-c072-48b4-a8a7-e3f2dbaf2687] due to [timeout].
        at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.validateConvertResult(KVMStorageProcessor.java:1915)
        at com.cloud.hypervisor.kvm.storage.KVMStorageProcessor.createSnapshot(KVMStorageProcessor.java:1790)
        at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.execute(StorageSubsystemCommandHandlerBase.java:140)
        at com.cloud.storage.resource.StorageSubsystemCommandHandlerBase.handleStorageCommands(StorageSubsystemCommandHandlerBase.java:66)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:36)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtStorageSubSystemCommandWrapper.execute(LibvirtStorageSubSystemCommandWrapper.java:30)
        at com.cloud.hypervisor.kvm.resource.wrapper.LibvirtRequestWrapper.execute(LibvirtRequestWrapper.java:78)
        at com.cloud.hypervisor.kvm.resource.LibvirtComputingResource.executeRequest(LibvirtComputingResource.java:1930)
        at com.cloud.agent.Agent.processRequest(Agent.java:683)
        at com.cloud.agent.Agent$AgentRequestHandler.doTask(Agent.java:1106)
        at com.cloud.utils.nio.Task.call(Task.java:83)
        at com.cloud.utils.nio.Task.call(Task.java:29)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
```

### versions

CloudStack v4.20 / KVM
OS: Ubuntu 24


### The steps to reproduce the bug

1. Start a snapshot of a volume larger than 200 GB.
2. Wait one hour (3600 seconds).
3. The snapshot fails with the timeout error shown in the logs above.


### What to do about it?

Either provide a setting that lets this timeout be raised, or fix the bug if the existing settings are supposed to control it.

GitHub link: https://github.com/apache/cloudstack/discussions/11923
