On 03/23/2014 08:01 PM, Andrei Mikhailovsky wrote:
Wido,
Could you please let me know when you've done this so I could try it out. Would
it be a part of the 4.3 branch or 4.4?
I'll do that. It will go into master which is 4.4 and I'm not sure if
this will be backported to 4.3.1
Wido
Thanks
----- Original Message -----
From: "Wido den Hollander" <w...@widodh.nl>
To: dev@cloudstack.apache.org
Sent: Sunday, 23 March, 2014 3:56:44 PM
Subject: Re: ACS and KVM uses /tmp for volumes migration and templates
On 03/21/2014 02:23 PM, Andrei Mikhailovsky wrote:
Wido,
I would be happy to try the custom ACS build unless 4.3 comes out soon. It has
been overdue for some time now )). Has this feature been addressed in the 4.3
release?
No, it hasn't been fixed yet. I have to admit, I forgot about this until
you sent this e-mail to the list.
I'll fix this in master later this week.
I can live with this feature for the time being, but I do see a longer-term
issue when my volumes become large, as I've only got about 100 GB of free space on
my host servers.
I fully agree. While writing this code I was aware of this. See my
comments in the code:
https://git-wip-us.apache.org/repos/asf?p=cloudstack.git;a=blob;f=plugins/hypervisors/kvm/src/com/cloud/hypervisor/kvm/storage/LibvirtStorageAdaptor.java;h=5de8bd26ae201187f5db5fd16b7e3ca157cab53a;hb=master#l1087
From what I can tell by looking at the rbd ls -l output, all of my volumes are
in format 2.
Correct, because I bypass libvirt and QEMU in some places right now.
Cheers,
Andrei
----- Original Message -----
From: "Wido den Hollander" <w...@widodh.nl>
To: dev@cloudstack.apache.org
Sent: Thursday, 20 March, 2014 9:40:29 AM
Subject: Re: ACS and KVM uses /tmp for volumes migration and templates
On 03/20/2014 12:59 AM, Andrei Mikhailovsky wrote:
Hi guys,
I was wondering if this is a bug?
No, it's a "feature".
I've noticed that during volume migration from NFS to RBD primary storage the
volume image is first copied to /tmp and only then to the RBD storage. This
seems silly to me as one would expect a typical volume to be larger than the
host's hard disk. Also, it is a common practice to use tmpfs as /tmp for
performance reasons. Thus, a typical host server will have far smaller /tmp
folder than the size of an average volume. As a result, volume migration would
break after filling /tmp and could probably cause a bunch of issues for the
KVM host itself as well as any VMs running on the server.
Correct. The problem was that RBD images come in two formats: format 1
(old/legacy) and format 2.
In order to perform cloning, images should be in RBD format 2.
When running qemu-img convert with an RBD image as the destination, qemu-img
will create an RBD image in format 1.
That's due to this piece of code in block/rbd.c in QEMU:
ret = rbd_create(io_ctx, name, bytes, &obj_order);
rbd_create() creates images in format 1. To use format 2 you should use
rbd_create2() or rbd_create3().
With RBD format 1 we can't do snapshotting or cloning, which we require
in ACS.
So I had to do an intermediate step where I first wrote the RAW image
somewhere and afterwards wrote it to RBD.
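To illustrate why this intermediate step is fragile, here is a minimal Python sketch of a pre-flight check that compares the volume size with the free space in the staging directory. This is not what ACS does; the function names and the 10% headroom are assumptions made up for the example.

```python
import shutil


def staging_free_bytes(path="/tmp"):
    """Return the number of free bytes on the filesystem holding `path`."""
    return shutil.disk_usage(path).free


def fits_in_staging(volume_bytes, free_bytes, headroom_percent=10):
    """Return True if a raw copy of the volume fits in the staging area.

    headroom_percent is a hypothetical safety margin kept free so the
    host does not run /tmp completely full during the copy.
    """
    required = volume_bytes + (volume_bytes * headroom_percent) // 100
    return required <= free_bytes


if __name__ == "__main__":
    GIB = 1024 ** 3
    volume = 200 * GIB  # example volume size, not taken from the thread
    if not fits_in_staging(volume, staging_free_bytes("/tmp")):
        print("Not enough space in /tmp for the intermediate RAW image")
```

With roughly 100 GB free on a host, as mentioned earlier in the thread, any volume much above 90 GB would fail such a check.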
After some discussion a config option has been added to Ceph:
OPTION(rbd_default_format, OPT_INT, 1)
This allows me to do this:
qemu-img convert .. -O raw .. rbd:rbd/myimage:rbd_default_format=2
This causes librbd/RBD to create a format 2 image and we can skip the
convert step to /tmp.
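As a rough sketch of how such a command line could be assembled, the Python below builds the qemu-img argument list with rbd_default_format=2 appended to the RBD destination. The helper names, the source path, and the qcow2 source format are assumptions for illustration; only the rbd:pool/image:rbd_default_format=2 form and -O raw come from the command above.

```python
def rbd_target(pool, image, options=None):
    """Build a QEMU RBD target string such as rbd:pool/image:key=value."""
    parts = ["rbd:{}/{}".format(pool, image)]
    for key, value in (options or {}).items():
        parts.append("{}={}".format(key, value))
    return ":".join(parts)


def convert_command(source_path, pool, image, source_format="qcow2"):
    """Return the argv for converting a local image straight into RBD.

    rbd_default_format=2 asks librbd to create a format 2 image, so no
    intermediate RAW file is needed. source_format is an assumption.
    """
    dest = rbd_target(pool, image, {"rbd_default_format": 2})
    return ["qemu-img", "convert", "-f", source_format,
            "-O", "raw", source_path, dest]


if __name__ == "__main__":
    # Hypothetical source path; the command is printed, not executed.
    print(" ".join(convert_command("/mnt/nfs/volume.qcow2", "rbd", "myimage")))
```

The list form can be passed to subprocess.run() without a shell, which avoids quoting problems with image names.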
This option is available since Ceph Dumpling 0.67.5 and was not
available when ACS 4.2 was written.
I'm going to make changes in master which skip the step with /tmp.
Technically this can be backported to 4.2, but then you would have to
run your own homebrew version of 4.2
It also seems that /tmp is temporarily used during template creation.
Same story as above.
My setup:
ACS 4.2.1
Ubuntu 12.04 with KVM
RBD + NFS for Primary storage
NFS for Staging and Secondary storage
Thanks
Andrei