On 01/22/13 11:25, Kevin Wolf wrote:
> On 18.12.2012 10:46, Philipp Hahn wrote:
>> Hello Kevin, hello Michael,
>>
>> On Wednesday 12 December 2012 17:54:58 Kevin Wolf wrote:
>>> On 12.12.2012 15:09, Philipp Hahn wrote:
>>>> On Wednesday, 12 December 2012 14:41:49, Kevin Wolf wrote:
>>>>> As you can see in the commit message of that patch, I was convinced
>>>>> that no bug existed in practice and that it was only dangerous with
>>>>> respect to future changes. Therefore my first question is whether
>>>>> you're using an unmodified upstream qemu or whether some backported
>>>>> patches are applied to it. If it's indeed unmodified, we should
>>>>> probably review the code once again to understand why it makes a
>>>>> difference.
>>>>
>>>> These were all unmodified versions directly from git, between
>>>> "qemu-kvm-1.1.0" and "qemu-kvm-1.2.0":
>>>>
>>>> "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c" works,
>>>> "git checkout b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c~1" is broken.
>>>> "git checkout qemu-kvm-1.1.2" is broken,
>>>> "git checkout qemu-kvm-1.1.2 ; git cherry-pick
>>>> b7ab0fea37c15ca9e249c42c46f5c48fd1a0943c" works.
>>>
>>> Ok, thanks for clarifying. Then I must have missed some interesting
>>> case while doing the patch.
>>
>> I think I found your missing link:
>> After filling in "QCowL2Meta *m", that request is queued:
>>   QLIST_INSERT_HEAD(&s->cluster_allocs, m, next_in_flight);
>> to prevent allocating the same cluster twice for overlapping requests,
>> which is checked in do_alloc_cluster_offset().
>> I guess that since the sector count was wrong, the overlap detection
>> didn't work and the two concurrent write requests to the same cluster
>> overwrote each other.
>
> I'm still not seeing it. The overlap detection code doesn't use
> m->nb_available at all, so why would it make a difference?
> I guess I need some people who can decently review code - Laszlo, Paolo,
> any idea why upstream commit b7ab0fea does change the behaviour, contrary
> to what I claimed in the commit message?
As I understand it, the question is not whether b7ab0fea works or not (it
does); the question is whether *without* b7ab0fea there is a real problem (a
user-visible bug).

Commit b7ab0fea decreased "avail_sectors" by

  keep_clusters << (s->cluster_bits - BDRV_SECTOR_BITS)

which should make a difference for "m->nb_available" in two cases:

(a) avail_sectors_patched < requested_sectors <= avail_sectors_unpatched
    (ie. the decrease in "avail_sectors" flipped MIN() to prefer it over
    "requested_sectors", lowering m->nb_available),

(b) avail_sectors_patched < avail_sectors_unpatched <= requested_sectors
    (ie. MIN() had already preferred "avail_sectors", and now it went
    lower).

It seems that

  1 << (s->cluster_bits - BDRV_SECTOR_BITS) == s->cluster_sectors      [1]

Therefore both "avail_sectors_patched" and "avail_sectors_unpatched" are
integral multiples of "s->cluster_sectors". Whenever MIN() selects either
of them, the COW condition in qcow2_alloc_cluster_link_l2() will evaluate
to false.

In case (b) there is thus no difference between patched & unpatched in
this regard: MIN() always selects an integral multiple, and so
copy_sectors() is never invoked. In case (a), however, the patched version
skips copy_sectors(), while the unpatched version invokes it.
I'm not sure if case (a) is possible at all -- let's expand it:

  avail_sectors_patched < requested_sectors <= avail_sectors_unpatched

Substitute the assigned values:

  nb_clusters << (s->cluster_bits - BDRV_SECTOR_BITS)
      < n_end - keep_clusters * s->cluster_sectors
      <= (keep_clusters + nb_clusters) << (s->cluster_bits - BDRV_SECTOR_BITS)

Using [1], replace the shifts with multiplications:

  nb_clusters * s->cluster_sectors
      < n_end - keep_clusters * s->cluster_sectors
      <= (keep_clusters + nb_clusters) * s->cluster_sectors

Distribute the multiplication over the addition:

  nb_clusters * s->cluster_sectors
      < n_end - keep_clusters * s->cluster_sectors
      <= keep_clusters * s->cluster_sectors + nb_clusters * s->cluster_sectors

Introduce the shorthands

  N := nb_clusters * s->cluster_sectors    [sectors]
  K := keep_clusters * s->cluster_sectors  [sectors]

giving

  N < n_end - K <= K + N

Add K to all three parts:

  K + N < n_end <= 2*K + N                 [2]

I have no clue about qcow2, but *each one* of K and N seems to be a
function of *both* n_end and the current "qcow2 disk state" (ie. sectors
being allocated vs. just referenced). IOW, I suspect that case (a) is
possible for some (disk state, n_end) pairs.

And case (a) -- if indeed possible -- seems to contradict the commit
message:

  allocation. A COW occurs only if the request wasn't cluster aligned,
              ^^^^^^^              ^^^^^^^^^^^^^^^^^^^
  which in turn would imply that requested_sectors was less than
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  avail_sectors (both in the original and in the fixed version). In this
  ^^^^^^^^^^^^^ ^^^^^^^^^^^^^^^^^^^^

In case (a) -- expanded as [2] --, "requested_sectors" is greater than
"avail_sectors_patched", and less than or equal to
"avail_sectors_unpatched".

Laszlo