Regarding snapshot deletion, QEMU does punch holes into the image file when deleting snapshots, so the space should effectively be freed, even though this isn't reflected in the file size. To get meaningful numbers, you have to look at the allocated blocks rather than the file size (e.g. by using 'du -h' instead of 'ls -lh').
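For example (using a plain sparse file just to illustrate the difference; the exact block counts depend on the filesystem):

  truncate -s 1G /tmp/sparse.img
  ls -lh /tmp/sparse.img    # apparent file size: 1.0G
  du -h /tmp/sparse.img     # allocated blocks: 0

'qemu-img info' also reports the allocated size of an image, as 'disk size'.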
A quick test with internal snapshots didn't show any increase in used space for me, so the condition that triggers the problem must be a little more complicated than just saving, loading and deleting some snapshots. After some ad-hoc manual testing, I wrote a small script to share the results with you:

  #!/bin/bash
  ./qemu-img create -f qcow2 /tmp/test.qcow2 64M
  ./qemu-io -c 'write 16M 32M' /tmp/test.qcow2
  du /tmp/test.qcow2
  ./qemu-img snapshot -c snap0 /tmp/test.qcow2
  ./qemu-io -c 'write 0 32M' /tmp/test.qcow2
  ./qemu-img snapshot -c snap1 /tmp/test.qcow2
  ./qemu-io -c 'write 32M 32M' /tmp/test.qcow2
  ./qemu-img snapshot -a snap0 /tmp/test.qcow2
  ./qemu-img snapshot -d snap0 /tmp/test.qcow2
  ./qemu-img snapshot -d snap1 /tmp/test.qcow2
  du /tmp/test.qcow2

Both 'du' invocations in this script print the same value for me, i.e. after taking two snapshots and making the image fully allocated, reverting to the first snapshot and deleting both snapshots gets the image back to its original 32 MB:

  Formatting '/tmp/test.qcow2', fmt=qcow2 size=67108864 cluster_size=65536 lazy_refcounts=off refcount_bits=16
  wrote 33554432/33554432 bytes at offset 16777216
  32 MiB, 1 ops; 0.0088 sec (3.520 GiB/sec and 112.6380 ops/sec)
  33028   /tmp/test.qcow2
  wrote 33554432/33554432 bytes at offset 0
  32 MiB, 1 ops; 0.0083 sec (3.739 GiB/sec and 119.6602 ops/sec)
  wrote 33554432/33554432 bytes at offset 33554432
  32 MiB, 1 ops; 0.0082 sec (3.785 GiB/sec and 121.1094 ops/sec)
  33028   /tmp/test.qcow2

Maybe you can play a bit more with qemu-img and qemu-io and find a case where the image grows like you see on your VMs? Once we have a reproducer, we can try to check where the growth comes from.
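For instance, to see whether repeated write/revert cycles (like your hourly reverts) make the allocation creep up, a loop along the lines of the script above might help. This is just a sketch; the offset and the iteration count are arbitrary:

  #!/bin/bash
  ./qemu-img create -f qcow2 /tmp/test.qcow2 64M
  ./qemu-img snapshot -c base /tmp/test.qcow2
  for i in $(seq 1 20); do
      ./qemu-io -c 'write 0 32M' /tmp/test.qcow2    # dirty the active layer
      ./qemu-img snapshot -a base /tmp/test.qcow2   # revert to the snapshot
      du /tmp/test.qcow2                            # allocation after each cycle
  done

If the numbers printed by 'du' keep climbing from one iteration to the next, that would be exactly the kind of reproducer we need. Running 'qemu-img check /tmp/test.qcow2' at the end would additionally report any leaked clusters.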
https://bugs.launchpad.net/bugs/1810603

Title:
  QEMU QCow Images grow dramatically

Status in QEMU:
  New

Bug description:
  I've recently migrated our VM infrastructure (~200 guests on 15 hosts)
  from vbox to QEMU (using KVM / libvirt). We have a master image (QEMU
  QCow v3) from which we spawn multiple instances (linked clones). All
  guests are reverted once per hour for security reasons.

  About two weeks after we successfully migrated to QEMU, we noticed
  that almost all disks had filled up across all 15 hosts. Our
  investigation showed that the initial qcow disk images had blown up
  from a few gigabytes to 100 GB and more. This should not happen, as we
  revert all VMs back to the initial snapshot once per hour, and hence
  all changes made to the disks must be reverted too. We did an
  additional test over a 24-hour time frame with which we could
  reproduce this bug, as documented below.

  Initial disk image sizes (created on Jan 04):

    -rw-r--r-- 1 root root 7.1G Jan  4 15:59 W10-TS01-0.img
    -rw-r--r-- 1 root root 7.3G Jan  4 15:59 W10-TS02-0.img
    -rw-r--r-- 1 root root 7.4G Jan  4 15:59 W10-TS03-0.img
    -rw-r--r-- 1 root root 8.3G Jan  4 16:02 W10-CLIENT01-0.img
    -rw-r--r-- 1 root root 8.6G Jan  4 16:05 W10-CLIENT02-0.img
    -rw-r--r-- 1 root root 8.0G Jan  4 16:05 W10-CLIENT03-0.img
    -rw-r--r-- 1 root root 8.3G Jan  4 16:08 W10-CLIENT04-0.img
    -rw-r--r-- 1 root root 8.1G Jan  4 16:12 W10-CLIENT05-0.img
    -rw-r--r-- 1 root root 8.0G Jan  4 16:12 W10-CLIENT06-0.img
    -rw-r--r-- 1 root root 8.1G Jan  4 16:16 W10-CLIENT07-0.img
    -rw-r--r-- 1 root root 7.6G Jan  4 16:16 W10-CLIENT08-0.img
    -rw-r--r-- 1 root root 7.6G Jan  4 16:19 W10-CLIENT09-0.img
    -rw-r--r-- 1 root root 7.5G Jan  4 16:21 W10-ROUTER-0.img
    -rw-r--r-- 1 root root  18G Jan  4 16:25 W10-MASTER-IMG.qcow2

  Disk image sizes after 24 hours (printed on Jan 05):

    -rw-r--r-- 1 root root  13G Jan  5 15:07 W10-TS01-0.img
    -rw-r--r-- 1 root root 8.9G Jan  5 14:20 W10-TS02-0.img
    -rw-r--r-- 1 root root 9.0G Jan  5 15:07 W10-TS03-0.img
    -rw-r--r-- 1 root root  10G Jan  5 15:08 W10-CLIENT01-0.img
    -rw-r--r-- 1 root root  11G Jan  5 15:08 W10-CLIENT02-0.img
    -rw-r--r-- 1 root root  11G Jan  5 15:08 W10-CLIENT03-0.img
    -rw-r--r-- 1 root root  11G Jan  5 15:08 W10-CLIENT04-0.img
    -rw-r--r-- 1 root root  19G Jan  5 15:07 W10-CLIENT05-0.img
    -rw-r--r-- 1 root root  14G Jan  5 15:08 W10-CLIENT06-0.img
    -rw-r--r-- 1 root root 9.7G Jan  5 15:07 W10-CLIENT07-0.img
    -rw-r--r-- 1 root root  35G Jan  5 15:08 W10-CLIENT08-0.img
    -rw-r--r-- 1 root root 9.2G Jan  5 15:07 W10-CLIENT09-0.img
    -rw-r--r-- 1 root root  41G Jan  5 15:08 W10-ROUTER-0.img
    -rw-r--r-- 1 root root  18G Jan  4 16:25 W10-MASTER-IMG.qcow2

  You can reproduce this bug as follows:

    1) create an initial disk image
    2) create a linked clone
    3) create a snapshot of the linked clone
    4) revert the snapshot every X minutes / hours
       (see the command sketch below)

  Due to the described behavior / bug, our VM farm is completely down at
  the moment (we have run out of disk space on all host systems). A
  quick fix for this bug would be much appreciated.

  Host OS:  Ubuntu 18.04.01 LTS
  Kernel:   4.15.0-43-generic
  QEMU:     3.1.0
  libvirt:  4.10.0
  Guest OS: Windows 10 64bit
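For reference, steps 1) to 4) above translate roughly into the following commands. This is a sketch only: it reuses file names from the listing above, the 64G size and the snapshot name are made up, and with running guests the periodic revert would go through libvirt (e.g. 'virsh snapshot-revert') rather than qemu-img on the live image:

  # 1) create an initial (master) disk image
  ./qemu-img create -f qcow2 W10-MASTER-IMG.qcow2 64G
  # 2) create a linked clone that uses the master as its backing file
  ./qemu-img create -f qcow2 -b W10-MASTER-IMG.qcow2 W10-CLIENT01-0.img
  # 3) create an internal snapshot of the linked clone
  ./qemu-img snapshot -c initial W10-CLIENT01-0.img
  # 4) revert to the snapshot (repeated every X minutes / hours)
  ./qemu-img snapshot -a initial W10-CLIENT01-0.img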