Re: [Qemu-devel] [RFC] Proposed qcow2 extension: subcluster allocation

Alberto Garcia Tue, 18 Apr 2017 04:53:44 -0700

On Thu 13 Apr 2017 05:17:21 PM CEST, Denis V. Lunev wrote:
> On 04/13/2017 06:04 PM, Alberto Garcia wrote:
>> On Thu 13 Apr 2017 03:30:43 PM CEST, Denis V. Lunev wrote:
>>> Yes, block size should be increased. I am perfectly in agreement with
>>> you.  But I think that we could do that by a plain increase of the
>>> cluster size without any further dances. Sub-clusters as sub-clusters
>>> will help if we are able to avoid COW. With COW I do not see much
>>> difference.
>> I'm trying to summarize your position, tell me if I got everything
>> correctly:
>>
>> 1. We should try to reduce data fragmentation on the qcow2 file,
>>    because it will have a long term effect on the I/O performance (as
>>    opposed to an effect on the initial operations on the empty image).
> yes
>
>> 2. The way to do that is to increase the cluster size (to 1MB or
>>    more).
> yes
>
>> 3. Benefit: increasing the cluster size also decreases the amount of
>>    metadata (L2 and refcount).
> yes
>
>> 4. Problem: L2 tables become too big and fill up the cache more
>>    easily. To solve this the cache code should do partial reads
>>    instead of complete L2 clusters.
> yes. We can read full cluster as originally if L2 cache is empty.
>
>> 5. Problem: larger cluster sizes also mean more data to copy when
>>    there's a COW. To solve this the COW code should be modified so it
>>    goes from 5 OPs (read head, write head, read tail, write tail,
>>    write data) to 2 OPs (read cluster, write modified cluster).
> yes, with small tweak if head and tail are in different clusters. In
> this case we will end up with 3 OPs.
>
>> 6. Having subclusters adds incompatible changes to the file format,
>>    and they offer no benefit after allocation.
> yes
>
>> 7. Subclusters are only really useful if they match the guest fs block
>>    size (because you would avoid doing COW on allocation). Otherwise
>>    the only thing that you get is a faster COW (because you move less
>>    data), but the improvement is not dramatic and it's better if we do
>>    what's proposed in point 5.
> yes
>
>> 8. Even if the subcluster size matches the guest block size, you'll
>>    get very fast initial allocation but also more chances to end up
>>    with a very fragmented qcow2 image, which is worse in the long run.
> yes
>
>> 9. Problem: larger clusters make a less efficient use of disk space,
>>    but that's a drawback you're fine with considering all of the
>>    above.
> yes
>
>> Is that a fair summary of what you're trying to say? Anything else
>> missing?
> yes.
>
> 5a. Problem: initial cluster allocation without COW. Could be made
>       cluster-size agnostic with the help of fallocate() call. Big
>       clusters are even better as the amount of such allocations is
>       reduced.
>
> Thank you very much for this cool summary! I am too tongue-tied.


Hi Denis,

I don't have the data to verify all your claims here, but in
general what you say makes sense.
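
For reference, the metadata arithmetic behind points 3 and 4 can be
sketched roughly as follows. This assumes the standard qcow2 layout
(8-byte L2 entries, one L2 table per cluster) and ignores refcount
tables, the L1 table and any cache overhead; the function name
l2_stats is made up for this example.

```python
L2_ENTRY_SIZE = 8  # bytes per L2 entry in the standard qcow2 layout


def l2_stats(cluster_size, disk_size):
    """Return (entries per L2 table, bytes mapped by one L2 table,
    total L2 metadata in bytes) for a fully allocated image.

    Illustrative arithmetic only; not taken from the QEMU source.
    """
    entries_per_table = cluster_size // L2_ENTRY_SIZE
    bytes_mapped_per_table = entries_per_table * cluster_size
    # One L2 entry per data cluster, rounding the cluster count up.
    clusters = -(-disk_size // cluster_size)
    total_l2_bytes = clusters * L2_ENTRY_SIZE
    return entries_per_table, bytes_mapped_per_table, total_l2_bytes


if __name__ == "__main__":
    one_tib = 2 ** 40
    for cluster_size in (64 * 1024, 1024 * 1024):
        entries, mapped, total = l2_stats(cluster_size, one_tib)
        print(cluster_size, entries, mapped, total)
```

Under these assumptions a 1 TiB image needs 128 MiB of L2 metadata
with 64 KiB clusters and 8 MiB with 1 MiB clusters, but a single L2
table grows from 64 KiB to 1 MiB, which is why reading whole tables
into the cache becomes more expensive.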

Although I'm not sure if I agree with everything (especially on whether
any of this applies to SSD drives at all) it seems that we all agree
that the COW algorithm can be improved, so perhaps I should start by
taking a look at that.
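
To make point 5 concrete, here is a toy model of the two COW
strategies, counting I/O operations for a write that lands in the
middle of one unallocated cluster. It is only a sketch: CountingStore,
cow_five_ops and cow_two_ops are hypothetical names, the "disk" is an
in-memory buffer, and none of this reflects the actual QEMU code.

```python
class CountingStore:
    """In-memory stand-in for a cluster that counts I/O requests."""

    def __init__(self, data):
        self.data = bytearray(data)
        self.ops = 0

    def read(self, offset, length):
        self.ops += 1
        return bytes(self.data[offset:offset + length])

    def write(self, offset, buf):
        self.ops += 1
        self.data[offset:offset + len(buf)] = buf


def cow_five_ops(src, dst, cluster_size, offset, payload):
    """Read head, write head, read tail, write tail, write data."""
    end = offset + len(payload)
    head = src.read(0, offset)
    dst.write(0, head)
    tail = src.read(end, cluster_size - end)
    dst.write(end, tail)
    dst.write(offset, payload)


def cow_two_ops(src, dst, cluster_size, offset, payload):
    """Read the whole cluster, patch it in memory, write it back once."""
    buf = bytearray(src.read(0, cluster_size))
    buf[offset:offset + len(payload)] = payload
    dst.write(0, bytes(buf))


def run(cow, cluster_size=1024 * 1024, offset=4096, payload=b"x" * 4096):
    """Return (total ops, resulting cluster contents) for one strategy."""
    src = CountingStore(bytes(range(256)) * (cluster_size // 256))
    dst = CountingStore(bytes(cluster_size))
    cow(src, dst, cluster_size, offset, payload)
    return src.ops + dst.ops, bytes(dst.data)


if __name__ == "__main__":
    print(run(cow_five_ops)[0], run(cow_two_ops)[0])
```

Both variants leave the same bytes in the new cluster; the second
trades the extra requests for a larger in-memory buffer. The case
where head and tail fall in different clusters (3 OPs) is not
modelled here.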

Regards,

Berto
