On 17.07.19 11:57, Michael S. Tsirkin wrote:
> On Wed, Jul 17, 2019 at 10:42:55AM +0200, David Hildenbrand wrote:
>> We are using the wrong functions to set/clear bits, effectively touching
>> multiple bits, writing out of range of the bitmap, resulting in memory
>> corruptions. We have to use set_bit()/clear_bit() instead.
>>
>> Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
>> inflating the balloon. QEMU crashes. This never could have worked
>> properly - especially, also pages would have been discarded when the
>> first sub-page would be inflated (the whole bitmap would be set).
>>
>> While testing I realized, that on hugetlbfs it is pretty much impossible
>> to discard a page - the guest just frees the 4k sub-pages in random order
>> most of the time. I was only able to discard a hugepage a handful of
>> times - so I hope that now works correctly.
>>
>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>> host page size")
>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>> with inflates & deflates")
>> Cc: qemu-sta...@nongnu.org #v4.0.0
>> Cc: Stefan Hajnoczi <stefa...@redhat.com>
>> Cc: David Gibson <da...@gibson.dropbear.id.au>
>> Cc: Michael S. Tsirkin <m...@redhat.com>
>> Cc: Igor Mammedov <imamm...@redhat.com>
>> Signed-off-by: David Hildenbrand <da...@redhat.com>
>> ---
>>  hw/virtio/virtio-balloon.c | 10 ++++------
>>  1 file changed, 4 insertions(+), 6 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index e85d1c0d5c..669067d661 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>>          balloon->pbp->base = host_page_base;
>>      }
>>
>> -    bitmap_set(balloon->pbp->bitmap,
>> -               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> -               subpages);
>> +    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> +            balloon->pbp->bitmap);
>>
>>      if (bitmap_full(balloon->pbp->bitmap, subpages)) {
>>          /* We've accumulated a full host page, we can actually discard
>> @@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>           * for a guest to do this in practice, but handle it anyway,
>>           * since getting it wrong could mean discarding memory the
>>           * guest is still using. */
>> -        bitmap_clear(balloon->pbp->bitmap,
>> -                     (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> -                     subpages);
>> +        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
>> +                  balloon->pbp->bitmap);
>>
>>          if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
>>              g_free(balloon->pbp);
>
> I also started to wonder about this:
>
>     if (!balloon->pbp) {
>         /* Starting on a new host page */
>         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
>         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
>         balloon->pbp->rb = rb;
>         balloon->pbp->base = host_page_base;
>     }
>
> Is keeping a pointer to a ram block like this safe? what if the ramblock
> gets removed?
>

David added

    if (balloon->pbp && (rb != balloon->pbp->rb ) ...

So in case the rb changes (IOW replaced - delete old one, new one added),
we reset the data. After a ram block was deleted, there will be no more
deflation requests coming in for it. This should be fine I guess.

However, there is another possible issue: Resets. If the balloon was
inflated and we reboot, the old balloon->pbp will remain intact. The
guest will continue using all memory until the virtio-balloon guest driver
comes up. If the stars align, it could happen that new inflation
requests by the guest will result in a discard of a big chunk, although
the guest is already re-using some parts of it.

We would have to reset balloon->pbp during virtio_balloon_device_reset().

-- 

Thanks,

David / dhildenb
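
To illustrate the reset problem described above, here is a minimal standalone
model. It is only a sketch and not QEMU code: PartialPage, Balloon,
balloon_inflate(), balloon_reset(), simulate_reboot() and the value of
SUBPAGES are all hypothetical stand-ins, and a plain bool array replaces the
real bitmap. It only shows how stale partial-page state left over from before
a reboot could complete a host page and trigger a discard.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumption: 8 balloon-sized sub-pages per host page (illustrative only). */
#define SUBPAGES 8

/* Hypothetical, simplified stand-in for PartiallyBalloonedPage. */
typedef struct {
    uint64_t base;            /* offset of the host page being tracked */
    bool inflated[SUBPAGES];  /* one flag per sub-page */
} PartialPage;

typedef struct {
    PartialPage *pbp;  /* NULL when no host page is partially ballooned */
    int discards;      /* host pages this model would have discarded */
} Balloon;

/* Mark one sub-page as inflated; "discard" once the host page is complete. */
static void balloon_inflate(Balloon *b, uint64_t base, unsigned subpage)
{
    assert(subpage < SUBPAGES);

    if (b->pbp && b->pbp->base != base) {
        /* Different host page: drop the old partial state. */
        free(b->pbp);
        b->pbp = NULL;
    }
    if (!b->pbp) {
        b->pbp = calloc(1, sizeof(*b->pbp));
        if (!b->pbp) {
            abort();
        }
        b->pbp->base = base;
    }

    b->pbp->inflated[subpage] = true;

    for (unsigned i = 0; i < SUBPAGES; i++) {
        if (!b->pbp->inflated[i]) {
            return;  /* host page not complete yet */
        }
    }

    b->discards++;
    free(b->pbp);
    b->pbp = NULL;
}

/* What a device reset would have to do in this model: forget partial state. */
static void balloon_reset(Balloon *b)
{
    free(b->pbp);
    b->pbp = NULL;
}

/* Returns how many host pages get discarded after a simulated reboot. */
int simulate_reboot(bool reset_on_reboot)
{
    Balloon b = { .pbp = NULL, .discards = 0 };

    /* Before the reboot: all but the last sub-page are inflated. */
    for (unsigned i = 0; i < SUBPAGES - 1; i++) {
        balloon_inflate(&b, 0, i);
    }

    /* Reboot: the guest starts using all of its memory again. */
    if (reset_on_reboot) {
        balloon_reset(&b);
    }

    /* The new guest driver inflates only the last sub-page. */
    balloon_inflate(&b, 0, SUBPAGES - 1);

    int discards = b.discards;
    balloon_reset(&b);  /* free any leftover state */
    return discards;
}
```

In this model, simulate_reboot(false) reports one discard even though the
rebooted guest only inflated a single sub-page, while simulate_reboot(true)
reports none. The real fix would of course have to live in the device reset
path and use the existing pbp handling.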