On Tue, Aug 27, 2024 at 01:28:02PM -0400, Stefan Hajnoczi wrote: > On Tue, 27 Aug 2024 at 13:24, David Hildenbrand <da...@redhat.com> wrote: > > > > On 27.08.24 18:52, Stefan Hajnoczi wrote: > > > On Tue, 27 Aug 2024 at 04:38, David Hildenbrand <da...@redhat.com> wrote: > > >> > > >> As reported by Peter, we might be leaking memory when removing the > > >> highest RAMBlock (in the weird ram_addr_t space), and adding a new one. > > >> > > >> We will fail to realize that we already allocated bitmaps for more > > >> dirty memory blocks, and effectively discard the pointers to them. > > >> > > >> Fix it by getting rid of last_ram_page() and simply storing the number > > >> of dirty memory blocks that have been allocated. We'll store the number > > >> of blocks along with the actual pointer to keep it simple. > > >> > > >> Looks like this leak was introduced as we switched from using a single > > >> bitmap_zero_extend() to allocating multiple bitmaps: > > >> bitmap_zero_extend() relies on g_renew() which should have taken care of > > >> this. > > >> > > >> Resolves: > > >> https://lkml.kernel.org/r/CAFEAcA-k7a+VObGAfCFNygQNfCKL=AfX6A4kScq=vssk0pe...@mail.gmail.com > > >> Reported-by: Peter Maydell <peter.mayd...@linaro.org> > > >> Fixes: 5b82b703b69a ("memory: RCU ram_list.dirty_memory[] for safe RAM > > >> hotplug") > > >> Cc: qemu-sta...@nongnu.org > > >> Cc: Stefan Hajnoczi <stefa...@redhat.com> > > >> Cc: Paolo Bonzini <pbonz...@redhat.com> > > >> Cc: Peter Xu <pet...@redhat.com> > > >> Cc: "Philippe Mathieu-Daudé" <phi...@linaro.org> > > >> Signed-off-by: David Hildenbrand <da...@redhat.com> > > >> --- > > >> include/exec/ramlist.h | 1 + > > >> system/physmem.c | 44 ++++++++++++++---------------------------- > > >> 2 files changed, 16 insertions(+), 29 deletions(-) > > >> > > >> diff --git a/include/exec/ramlist.h b/include/exec/ramlist.h > > >> index 2ad2a81acc..f2a965f293 100644 > > >> --- a/include/exec/ramlist.h > > >> +++ b/include/exec/ramlist.h > > >> @@ -41,6 +41,7 @@ typedef struct RAMBlockNotifier RAMBlockNotifier; > > >> #define DIRTY_MEMORY_BLOCK_SIZE ((ram_addr_t)256 * 1024 * 8) > > >> typedef struct { > > >> struct rcu_head rcu; > > >> + unsigned int num_blocks; > > > > > > The maximum amount of memory supported by unsigned int is: > > > (2 ^ 32 - 1) * 4KB * DIRTY_MEMORY_BLOCK_SIZE > > > = ~32 exabytes > > > > > > > True, should we simply use ram_addr_t ? > > Sounds good to me. In practice scalability bottlenecks are likely with > those memory sizes and it will be necessary to change how guest memory > is organized anyway. But it doesn't hurt to make this counter > future-proof.
IMHO it'll be nice to only use ram_addr_t when a variable is describing the ramblock address space (with an offset, or a length there). In this case it is a pure counter for how many bitmap chunks we allocated, so maybe "unsigned long" or "uint64_t" would suite more? Though I'd think "unsigned int" is good enough per the calculation Stefan provided. Reviewed-by: Peter Xu <pet...@redhat.com> Thanks, -- Peter Xu