Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache

Kevin Wolf Mon, 24 Jan 2011 07:43:50 -0800

Am 24.01.2011 16:26, schrieb Stefan Hajnoczi:
> On Mon, Jan 24, 2011 at 2:54 PM, Kevin Wolf <kw...@redhat.com> wrote:
>> [ Re-adding qemu-devel to CC ]
>>
>> Am 24.01.2011 15:34, schrieb Stefan Hajnoczi:
>>> On Thu, Jan 20, 2011 at 5:10 PM, Kevin Wolf <kw...@redhat.com> wrote:
>>>> @@ -702,17 +622,30 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState 
>>>> *bs, QCowL2Meta *m)
>>>>
>>>>     if (m->nb_available & (s->cluster_sectors - 1)) {
>>>>         uint64_t end = m->nb_available & ~(uint64_t)(s->cluster_sectors - 
>>>> 1);
>>>> +        cow = true;
>>>>         ret = copy_sectors(bs, start_sect + end, cluster_offset + (end << 
>>>> 9),
>>>>                 m->nb_available - end, s->cluster_sectors);
>>>>         if (ret < 0)
>>>>             goto err;
>>>>     }
>>>>
>>>> -    /* update L2 table */
>>>> +    /*
>>>> +     * Update L2 table.
>>>> +     *
>>>> +     * Before we update the L2 table to actually point to the new 
>>>> cluster, we
>>>> +     * need to be sure that the refcounts have been increased and COW was
>>>> +     * handled.
>>>> +     */
>>>> +    if (cow) {
>>>> +        bdrv_flush(bs->file);
>>>
>>> Just bdrv_flush(bs->file) and not a refcounts cache flush?
>>
>> Refcounts and data need not to be ordered against each other. They only
>> must both be on disk when we write the L2 table.
> 
> Have I missed where refcounts actually get flushed from the cache out
> to the disk because bdrv_flush(bs->file) only syncs the file but
> doesn't write out dirty data held in cache.


The bdrv_flush isn't supposed to flush the refcounts, but the data
copied during COW (what you call pre/postfill in QED)

The refcounts are handled by the qcow2_cache_set_dependency below, so
that before writing the L2 tables we always write the refcounts first.

>>>> +    }
>>>> +
>>>> +    qcow2_cache_set_dependency(bs, s->l2_table_cache, 
>>>> s->refcount_block_cache);
>>>>     ret = get_cluster_table(bs, m->offset, &l2_table, &l2_offset, 
>>>> &l2_index);
>>>>     if (ret < 0) {
>>>>         goto err;
>>>>     }
>>>> +    qcow2_cache_entry_mark_dirty(s->l2_table_cache, l2_table);
>>>>
>>>>     for (i = 0; i < m->nb_clusters; i++) {
>>>>         /* if two concurrent writes happen to the same unallocated cluster
>>>> @@ -728,16 +661,9 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, 
>>>> QCowL2Meta *m)
>>>>                     (i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
>>>>      }
>>>>
>>>> -    /*
>>>> -     * Before we update the L2 table to actually point to the new 
>>>> cluster, we
>>>> -     * need to be sure that the refcounts have been increased and COW was
>>>> -     * handled.
>>>> -     */
>>>> -    bdrv_flush(bs->file);
>>>>
>>>> -    ret = write_l2_entries(bs, l2_table, l2_offset, l2_index, 
>>>> m->nb_clusters);
>>>> +    ret = qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
>>>>     if (ret < 0) {
>>>> -        qcow2_l2_cache_reset(bs);
>>>>         goto err;
>>>>     }
>>>>
>>>
>>> The function continues like this:
>>>
>>> /*
>>>  * If this was a COW, we need to decrease the refcount of the old cluster.
>>>  * Also flush bs->file to get the right order for L2 and refcount update.
>>>  */
>>> if (j != 0) {
>>>     bdrv_flush(bs->file);
>>>     for (i = 0; i < j; i++) {
>>>         qcow2_free_any_clusters(bs,
>>>             be64_to_cpu(old_cluster[i]) & ~QCOW_OFLAG_COPIED, 1);
>>>     }
>>> }
>>>
>>> Does bdrv_flush(bs->file) "get the right order for L2 and refcount
>>> update"?  We've just put an L2 table, should this be an L2 table
>>> flush?
>>
>> I agree, this looks wrong. What we need is a dependency to ensure that
>> first L2 is flushed and then the refcount block.
>> qcow2_free_any_clusters() calls update_refcount() indirectly, which
>> takes care of setting this dependency.
>>
>> So in the end I think it's just an unnecessary bdrv_flush. Makes sense?
> 
> I don't understand this fully.  I've noticed that the cache isn't the
> only mechanism for making changes to tables - there are functions like
> write_l2_entries() that directly write out parts of tables without
> honoring dependencies or using the dirty bit on the cache entry.  I
> probably need to look at this more carefully to fully understand
> whether or not it is correct.

No, the cache should be the only mechanism that is used for accessing L2
tables and refcount blocks. write_l2_entries() is an old function that
is removed by the patch.

Direct accesses should only be left for L1 tables and refcount tables.

Kevin

Re: [Qemu-devel] [PATCH v4 2/3] qcow2: Use QcowCache

Reply via email to