On 2014-11-10 at 22:12, Eric Blake wrote:
On 11/10/2014 06:45 AM, Max Reitz wrote:
qcow2_alloc_bytes() may reuse a cluster multiple times, in which case
the refcount is increased accordingly. However, if this would lead to an
overflow the function should instead just not reuse this cluster and
allocate a new one.
So if refcount_order is 1 (2 bits per refcount, max refcount of 4

*max refcount of 3 (0b11)

), and
we encounter the same cluster 6 times (say by 5 back-to-back internal
snapshots), does this code optimize to only 2 clusters (both with
refcount 3) or does it result in each of the last 3 clusters spilling to
its own 1-ref cluster for a total of 4 clusters?  Short of Benoit's work
on deduplication, is there even a way to avoid inefficient use of
spilled clusters?
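
For reference, the arithmetic behind the refcount limit discussed here can be sketched as follows. This is only an illustration, assuming an entry width of 1 << refcount_order bits; max_refcount() is a hypothetical helper, not a function from the QEMU tree.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not QEMU code): largest value that fits into one
 * refcount entry, assuming an entry is (1 << refcount_order) bits wide. */
static uint64_t max_refcount(unsigned refcount_order)
{
    /* Assumption: widths above 64 bits are not considered here. */
    assert(refcount_order <= 6);

    unsigned bits = 1u << refcount_order;

    /* Shifting a 64-bit value by 64 is undefined, so handle it apart. */
    return bits == 64 ? UINT64_MAX : (UINT64_C(1) << bits) - 1;
}
```

With refcount_order = 1 this gives 2 bits per entry and a maximum of 3 (0b11), which is the value used in the rest of this thread.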

I'm not sure what you're referring to; maybe I should add that qcow2_alloc_bytes() is used for allocating compressed clusters (which ideally don't take up a full host cluster), so "reuse" in this context just means that several compressed clusters share one host cluster.

Maybe you're referring to the following situation: We have the default cluster size of 64k and a maximum refcount of three. Now we're trying to allocate 16k for each of the compressed clusters A, B, C and D. A, B and C share one host cluster. D would still fit into that cluster size-wise, but the cluster's refcount has already reached the maximum of three, so D will be put into a newly allocated host cluster. Finally, we're trying to allocate 32k for a compressed cluster E, which will then be put into the same cluster as D. We therefore have the following allocation (each sub-box representing 16k):

+---+---+---+---+   +---+---+---+---+
| A | B | C |   |   | D |   E   |   |
+---+---+---+---+   +---+---+---+---+

whereas the ideal allocation would be:

+---+---+---+---+   +---+---+---+---+
| A | B |   E   |   | C | D |   |   |
+---+---+---+---+   +---+---+---+---+

This is a problem, but I think that, first, it's a minor one (just use a sufficiently large refcount width if you're going to use compressed clusters) and, second, it only affects compressed clusters, whose performance I could hardly care less about, frankly.
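
The A to E scenario above can be replayed with a small toy model. This is a sketch under simplifying assumptions, not the code from the patch: it tracks only the host cluster currently being filled, and ToyAllocator, toy_init() and toy_alloc_bytes() are hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CLUSTER_SIZE (64 * 1024)

/* Toy model, not the QEMU implementation: remembers only the host
 * cluster that is currently being filled with compressed data. */
typedef struct {
    uint64_t max_refcount; /* cap imposed by the refcount width */
    int cluster;           /* index of the current host cluster, -1 if none */
    uint32_t used;         /* bytes already handed out in that cluster */
    uint64_t refcount;     /* compressed clusters sharing it so far */
} ToyAllocator;

static void toy_init(ToyAllocator *a, uint64_t max_refcount)
{
    assert(max_refcount >= 1);
    a->max_refcount = max_refcount;
    a->cluster = -1;
    a->used = 0;
    a->refcount = 0;
}

/* Returns the index of the host cluster that receives `size` bytes. */
static int toy_alloc_bytes(ToyAllocator *a, uint32_t size)
{
    assert(size > 0 && size <= CLUSTER_SIZE);

    int fits = a->cluster >= 0 && a->used + size <= CLUSTER_SIZE;
    int can_ref = a->refcount < a->max_refcount;

    if (!fits || !can_ref) {
        /* Start a fresh host cluster instead of overflowing the refcount. */
        a->cluster++;
        a->used = 0;
        a->refcount = 0;
    }

    a->used += size;
    a->refcount++;
    return a->cluster;
}
```

After toy_init() with a maximum refcount of 3, four 16k requests followed by one 32k request land in host clusters 0, 0, 0, 1 and 1, which corresponds to the first diagram. The model never goes back to fill the 16k hole left in cluster 0, so it does not reach the ideal layout either.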

Max

But I guess answering that can be a separate patch;
inefficiency is annoying, but not technically wrong and therefore not a
reason to reject this one.

Signed-off-by: Max Reitz <mre...@redhat.com>
---
  block/qcow2-refcount.c | 32 ++++++++++++++++++++++++++++++--
  1 file changed, 30 insertions(+), 2 deletions(-)

Reviewed-by: Eric Blake <ebl...@redhat.com>


