09.09.2019 13:47, Kevin Wolf wrote:
> Am 09.09.2019 um 12:13 hat Vladimir Sementsov-Ogievskiy geschrieben:
>> Hi!
>>
>> I have a (maybe stupid) question: what is BDRVQcow2State.lock for, and
>> when should it be locked?
>>
>> I ran into a SIGSEGV here:
>>
>> #0  qcow2_process_discards (bs=bs@entry=0x564b93bc8000, ret=ret@entry=0)
>>     at block/qcow2-refcount.c:737
>> #1  0x0000564b90e9f15f in qcow2_cluster_discard (bs=bs@entry=0x564b93bc8000,
>>     offset=0, offset@entry=3062890496, bytes=bytes@entry=134217728,
>>     type=type@entry=QCOW2_DISCARD_REQUEST,
>>     full_discard=full_discard@entry=false) at block/qcow2-cluster.c:1853
>> #2  0x0000564b90e8f720 in qcow2_co_pdiscard (bs=0x564b93bc8000,
>>     offset=3062890496, bytes=134217728) at block/qcow2.c:3749
>> #3  0x0000564b90ec565d in bdrv_co_pdiscard (bs=0x564b93bc8000,
>>     offset=3062890496, bytes=134217728) at block/io.c:2939
>> #4  0x0000564b90eb5c04 in blk_aio_pdiscard_entry (opaque=0x564b94f968c0)
>>     at block/block-backend.c:1527
>> #5  0x0000564b90f681ea in coroutine_trampoline (i0=<optimized out>,
>>     i1=<optimized out>) at util/coroutine-ucontext.c:116
>>
>> The SIGSEGV is on the QTAILQ_REMOVE in qcow2_process_discards:
>>
>> (gdb) list
>> 732     {
>> 733         BDRVQcow2State *s = bs->opaque;
>> 734         Qcow2DiscardRegion *d, *next;
>> 735
>> 736         QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
>> 737             QTAILQ_REMOVE(&s->discards, d, next);
>> 738
>> 739             /* Discard is optional, ignore the return value */
>> 740             if (ret >= 0) {
>> 741                 bdrv_pdiscard(bs->file->bs, d->offset, d->bytes);
>>
>> (You can see bs->file->bs here; yes, this is old code based on 2.12,
>> but I still need some help with the following.)
>>
>> The problem is that d has already been deleted:
>>
>> (gdb) p d->next
>> $50 = {tqe_next = 0x564b94b0b140, tqe_prev = 0x0}
>>
>> Such problems can occur when a removal loop like this interleaves with
>> other users of the queue. And that is possible here, because we call
>> bdrv_pdiscard inside the loop, and it may yield.
>>
>> Going to frame #5 and printing the co->caller stack:
>>
>> #0  0x0000564b90f68180 in qemu_coroutine_switch ()
>> #1  0x0000564b90f66c84 in qemu_aio_coroutine_enter ()
>> #2  0x0000564b90f50764 in aio_co_enter ()
>> #3  0x0000564b90f50ea9 in thread_pool_completion_bh ()
>> #4  0x0000564b90f500d1 in aio_bh_poll ()
>> #5  0x0000564b90f5360b in aio_poll ()
>> #6  0x0000564b90ec59cd in bdrv_pdiscard ()
>> #7  0x0000564b90e96a36 in qcow2_process_discards ()
>> #8  0x0000564b90e97785 in update_refcount ()
>> #9  0x0000564b90e96bdd in qcow2_free_clusters ()
>> #10 0x0000564b90ea29c7 in update_ext_header_and_dir ()
>> #11 0x0000564b90ea3a14 in qcow2_remove_persistent_dirty_bitmap ()
>> #12 0x0000564b90ce7bce in qmp_block_dirty_bitmap_remove ()
>> #13 0x0000564b90cf5390 in qmp_marshal_block_dirty_bitmap_remove ()
>> #14 0x0000564b90f46080 in qmp_dispatch ()
>> #15 0x0000564b90bedc74 in monitor_qmp_dispatch_one ()
>> #16 0x0000564b90bee04a in monitor_qmp_bh_dispatcher ()
>> #17 0x0000564b90f500d1 in aio_bh_poll ()
>> #18 0x0000564b90f53430 in aio_dispatch ()
>> #19 0x0000564b90f4ffae in aio_ctx_dispatch ()
>> #20 0x00007f0a8e3e9049 in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
>> #21 0x0000564b90f52727 in main_loop_wait ()
>> #22 0x0000564b90ba0c07 in main ()
>>
>> And this (at least partly) confirms my guess.
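(Side note, to make the interleaving easier to see: below is a minimal
standalone model of the bug. This is not the qcow2 code; the Region type,
the process_discards() toy function, and the recursive call standing in
for the yield are all invented for illustration. It uses the plain BSD
TAILQ macros from <sys/queue.h>, which QTAILQ mirrors. Built with
-fsanitize=address, the second iteration at depth 0 reports a
use-after-free on the cached 'next' pointer, just like the QTAILQ_REMOVE
crash above.)

#include <sys/queue.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct Region {
    TAILQ_ENTRY(Region) next;
    int id;
} Region;

static TAILQ_HEAD(RegionHead, Region) discards =
    TAILQ_HEAD_INITIALIZER(discards);

static void process_discards(int depth)
{
    Region *d, *next;

    /* open-coded QTAILQ_FOREACH_SAFE: 'next' is cached before the
     * loop body runs */
    for (d = TAILQ_FIRST(&discards); d != NULL; d = next) {
        next = TAILQ_NEXT(d, next);
        TAILQ_REMOVE(&discards, d, next);

        if (depth == 0) {
            /* stands in for bdrv_pdiscard() yielding: while we are
             * suspended, another coroutine runs the same loop and
             * frees the remaining entries, including our cached
             * 'next' */
            process_discards(1);
        }

        printf("freeing region %d at depth %d\n", d->id, depth);
        free(d);
        /* second iteration at depth 0: d becomes the cached 'next',
         * which the nested call already freed, so reading d->next is
         * a use-after-free */
    }
}

int main(void)
{
    for (int i = 0; i < 3; i++) {
        Region *r = malloc(sizeof(*r));
        r->id = i;
        TAILQ_INSERT_TAIL(&discards, r, next);
    }
    process_discards(0);
    return 0;
}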
>>
>> So, my actual question is, what should be fixed here:
>>
>> 1. yielding in qcow2_process_discards, like this:
>>
>> --- a/block/qcow2-refcount.c
>> +++ b/block/qcow2-refcount.c
>> @@ -732,9 +732,13 @@ void qcow2_process_discards(BlockDriverState *bs, int ret)
>>  {
>>      BDRVQcow2State *s = bs->opaque;
>>      Qcow2DiscardRegion *d, *next;
>> +    QTAILQ_HEAD (, Qcow2DiscardRegion) discards;
>>
>> -    QTAILQ_FOREACH_SAFE(d, &s->discards, next, next) {
>> -        QTAILQ_REMOVE(&s->discards, d, next);
>> +    discards = s->discards;
>> +    QTAILQ_INIT(&s->discards);
>> +
>> +    QTAILQ_FOREACH_SAFE(d, &discards, next, next) {
>> +        QTAILQ_REMOVE(&discards, d, next);
>>
>>          /* Discard is optional, ignore the return value */
>>          if (ret >= 0) {
>
> I think this is not enough.
>
> If you don't hold s->lock here, concurrent requests could (re-)allocate
> the clusters to be discarded and then you'd discard new data instead of
> the old one.

I came to the same thought.
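For completeness: splicing the queue onto a local head does cure the
use-after-free in the toy model above, because a reentrant call then
finds the queue empty; but, as you say, it cannot help with cluster
reuse in qcow2 itself, where s->lock has to stay held until the queue
is drained. A sketch (again on the toy model, not qcow2):

/* Splice variant of process_discards() from the model above: move
 * everything to a local head first, so a reentrant call finds the
 * shared queue empty and we never hold a pointer that someone else
 * may free. */
static void process_discards_spliced(int depth)
{
    TAILQ_HEAD(, Region) local;
    Region *d;

    TAILQ_INIT(&local);

    /* Note: elements are moved one by one on purpose. Plain struct
     * assignment of the head (as in my patch above) appears to leave
     * the first element's tqe_prev pointing into the old head, so a
     * later TAILQ_REMOVE would write through a stale pointer. */
    while ((d = TAILQ_FIRST(&discards)) != NULL) {
        TAILQ_REMOVE(&discards, d, next);
        TAILQ_INSERT_TAIL(&local, d, next);
    }

    while ((d = TAILQ_FIRST(&local)) != NULL) {
        TAILQ_REMOVE(&local, d, next);
        if (depth == 0) {
            process_discards_spliced(1);   /* reentrant call: queue is empty */
        }
        printf("freeing region %d at depth %d\n", d->id, depth);
        free(d);
    }
}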
> So I believe that qcow2_process_discards() must always be called before
> the image can be unlocked after adding something to the discard queue.
>
>> or
>>
>> 2. calling qcow2_remove_persistent_dirty_bitmap without taking the
>> lock, like this:
>>
>> --- a/block/qcow2-bitmap.c
>> +++ b/block/qcow2-bitmap.c
>> @@ -1359,8 +1359,8 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
>>  {
>>      int ret;
>>      BDRVQcow2State *s = bs->opaque;
>> -    Qcow2Bitmap *bm;
>> -    Qcow2BitmapList *bm_list;
>> +    Qcow2Bitmap *bm = NULL;
>> +    Qcow2BitmapList *bm_list = NULL;
>>
>>      if (s->nb_bitmaps == 0) {
>>          /* Absence of the bitmap is not an error: see explanation above
>> @@ -1368,15 +1368,17 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
>>          return;
>>      }
>>
>> +    qemu_co_mutex_lock(&s->lock);
>> +
>>      bm_list = bitmap_list_load(bs, s->bitmap_directory_offset,
>>                                 s->bitmap_directory_size, errp);
>>      if (bm_list == NULL) {
>> -        return;
>> +        goto out;
>>      }
>>
>>      bm = find_bitmap_by_name(bm_list, name);
>>      if (bm == NULL) {
>> -        goto fail;
>> +        goto out;
>>      }
>>
>>      QSIMPLEQ_REMOVE(bm_list, bm, Qcow2Bitmap, entry);
>>
>> @@ -1384,12 +1386,14 @@ void qcow2_remove_persistent_dirty_bitmap(BlockDriverState *bs,
>>      ret = update_ext_header_and_dir(bs, bm_list);
>>      if (ret < 0) {
>>          error_setg_errno(errp, -ret, "Failed to update bitmap extension");
>> -        goto fail;
>> +        goto out;
>>      }
>>
>>      free_bitmap_clusters(bs, &bm->table);
>>
>> -fail:
>> +out:
>> +    qemu_co_mutex_unlock(&s->lock);
>> +
>>      bitmap_free(bm);
>>      bitmap_list_free(bm_list);
>>  }
>>
>> And in this case, I'm afraid that locking is missing in some other
>> bitmap-related qcow2 code :(
>
> Then we should probably add locking to all of it. I don't know enough
> about the bitmap code to tell whether it's the full solution, and maybe
> not all of the code actually needs it, but the bitmap management
> functions are a slow path anyway, so just taking s->lock is certainly a
> good idea.

OK, I'll make a patch.

>> or
>>
>> 3. Just backport John's upstream fix 0a6c86d024c52b, which acquires the
>> AioContext around the bdrv_remove_persistent_dirty_bitmap() call? Is
>> that enough, or do we still need locking inside qcow2?
>
> It protects only against other threads accessing the same data
> structures concurrently, but when you yield, not only is the AioContext
> lock dropped, but the same thread can also start a concurrent
> operation, so this is not enough.
>
> Kevin

Thanks a lot for the quick response!

--
Best regards,
Vladimir