On 30.01.2017 17:14, Alberto Garcia wrote:
> The metadata overlap checks introduced in a40f1c2add help detect
> corruption in the qcow2 image by verifying that data writes don't
> overlap with existing metadata sections.
>
> The 'refcount-block' check in particular iterates over the refcount
> table in order to get the addresses of all refcount blocks and check
> that none of them overlap with the region where we want to write.
>
> The problem with the refcount table is that since it always occupies
> complete clusters its size is usually very big.
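(To put rough numbers on that, using the defaults quoted further down,
cluster_size=64KB and refcount_bits=16, and assuming the refcount table
occupies just a single cluster:

    entries in a one-cluster refcount table:  64 KB / 8 B  =  8192
    refcounts per refcount block:             64 KB / 2 B  = 32768
    data covered per refcount block:      32768 * 64 KB    =  2 GB
    data covered by a one-cluster table:   8192 *  2 GB    = 16 TB

so even the smallest possible table is already sized for 16 TB of
allocated data.)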
Actually the main problem is that BDRVQcow2State.refcount_table_size is
updated very generously as opposed to BDRVQcow2State.l1_size. The latter
is usually exactly as large as it needs to be (because the L1 table size
usually doesn't change), whereas the refcount_table_size is just bumped
up every time the image gets bigger until it reaches or exceeds the
value it needs to be.

> With the default values of cluster_size=64KB and refcount_bits=16
> this table holds 8192 entries, each one of them enough to map 2GB
> worth of host clusters.
>
> So unless we're using images with several TB of allocated data this
> table is going to be mostly empty, and iterating over it is a waste
> of CPU. If the storage backend is fast enough this can have an effect
> on I/O performance.
>
> This patch keeps the index of the last used (i.e. non-zero) entry in
> the refcount table and updates it every time the table changes. The
> refcount-block overlap check then uses that index instead of reading
> the whole table.
>
> In my tests with a 4GB qcow2 file stored in RAM this doubles the
> amount of write IOPS.
>
> Signed-off-by: Alberto Garcia <be...@igalia.com>
> ---
>  block/qcow2-refcount.c | 21 ++++++++++++++++++++-
>  block/qcow2.c          |  1 +
>  block/qcow2.h          |  1 +
>  3 files changed, 22 insertions(+), 1 deletion(-)
>
> diff --git a/block/qcow2-refcount.c b/block/qcow2-refcount.c
> index cbfb3fe064..5e4d36587f 100644
> --- a/block/qcow2-refcount.c
> +++ b/block/qcow2-refcount.c
> @@ -83,6 +83,16 @@ static Qcow2SetRefcountFunc *const set_refcount_funcs[] = {
>  /*********************************************************/
>  /* refcount handling */
>
> +static void update_max_refcount_table_index(BDRVQcow2State *s)
> +{
> +    unsigned i = s->refcount_table_size - 1;
> +    while (i > 0 && (s->refcount_table[i] & REFT_OFFSET_MASK) == 0) {
> +        i--;
> +    }
> +    /* Set s->max_refcount_table_index to the index of the last used entry */
> +    s->max_refcount_table_index = i;
> +}
> +
>  int qcow2_refcount_init(BlockDriverState *bs)
>  {
>      BDRVQcow2State *s = bs->opaque;
> @@ -111,6 +121,7 @@ int qcow2_refcount_init(BlockDriverState *bs)
>          }
>          for(i = 0; i < s->refcount_table_size; i++)
>              be64_to_cpus(&s->refcount_table[i]);
> +        update_max_refcount_table_index(s);
>      }
>      return 0;
>  fail:
> @@ -439,6 +450,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
>      }
>
>      s->refcount_table[refcount_table_index] = new_block;
> +    s->max_refcount_table_index = refcount_table_index;

Should be MAX(s->max_refcount_table_index, refcount_table_index) or
this won't support refcount tables with holes in them.
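Something like this, i.e. (just an untested sketch on top of your patch,
using the usual MAX() macro):

     s->refcount_table[refcount_table_index] = new_block;
-    s->max_refcount_table_index = refcount_table_index;
+    s->max_refcount_table_index = MAX(s->max_refcount_table_index,
+                                      refcount_table_index);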
Apart from this, the patch looks good to me. (Just a nagging comment
below.)

>
>      /* The new refcount block may be where the caller intended to put its
>       * data, so let it restart the search. */
> @@ -580,6 +592,7 @@ static int alloc_refcount_block(BlockDriverState *bs,
>      s->refcount_table = new_table;
>      s->refcount_table_size = table_size;
>      s->refcount_table_offset = table_offset;
> +    update_max_refcount_table_index(s);
>
>      /* Free old table. */
>      qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t),
> @@ -2171,6 +2184,7 @@ write_refblocks:
>      s->refcount_table = on_disk_reftable;
>      s->refcount_table_offset = reftable_offset;
>      s->refcount_table_size = reftable_size;
> +    update_max_refcount_table_index(s);
>
>      return 0;
>
> @@ -2383,7 +2397,11 @@ int qcow2_check_metadata_overlap(BlockDriverState *bs, int ign, int64_t offset,
>      }
>
>      if ((chk & QCOW2_OL_REFCOUNT_BLOCK) && s->refcount_table) {
> -        for (i = 0; i < s->refcount_table_size; i++) {
> +        unsigned last_entry = s->max_refcount_table_index;
> +        assert(last_entry < s->refcount_table_size);
> +        assert(last_entry + 1 == s->refcount_table_size ||
> +               (s->refcount_table[last_entry + 1] & REFT_OFFSET_MASK) == 0);

I'm not sure we need this second assertion, but I don't mind it too
much either.

I'd actually find an assertion that last_entry is less than UINT_MAX
more important, because otherwise the loop below would be an endless
one. :-)

(Yes, it's pretty obvious that it's less than UINT_MAX because it's
less than s->refcount_table_size, which is an unsigned int. I'm just
trying to say something while I'm not sure exactly what I'm trying to
say. Sorry.)

Max

> +        for (i = 0; i <= last_entry; i++) {
>              if ((s->refcount_table[i] & REFT_OFFSET_MASK) &&
>                  overlaps_with(s->refcount_table[i] & REFT_OFFSET_MASK,
>                                s->cluster_size)) {
> @@ -2871,6 +2889,7 @@ int qcow2_change_refcount_order(BlockDriverState *bs, int refcount_order,
>      /* Now update the rest of the in-memory information */
>      old_reftable = s->refcount_table;
>      s->refcount_table = new_reftable;
> +    update_max_refcount_table_index(s);
>
>      s->refcount_bits = 1 << refcount_order;
>      s->refcount_max = UINT64_C(1) << (s->refcount_bits - 1);
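(Footnote to the endless-loop remark above, just to spell the obvious
out with a contrived example: with an unsigned upper bound, a loop like

    unsigned i, last_entry = UINT_MAX;   /* UINT_MAX from <limits.h> */

    for (i = 0; i <= last_entry; i++) {
        /* i <= UINT_MAX holds for every unsigned int, so this never
         * terminates */
    }

never ends. As said above, assert(last_entry < s->refcount_table_size)
already rules that case out here, because refcount_table_size is an
unsigned int itself.)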