On Thu, 27 Jun 2024 at 14:40, Akihiko Odaki <akihiko.od...@daynix.com> wrote:
>
> A memory region does not use their own reference counters, but instead
> piggybacks on another QOM object, "owner" (unless the owner is not the
> memory region itself). When creating a subregion, a new reference to the
> owner of the container must be created. However, if the subregion is
> owned by the same QOM object, this result in a self-reference, and make
> the owner immortal. Avoid such a self-reference.
>
> Signed-off-by: Akihiko Odaki <akihiko.od...@daynix.com>
> ---
>  system/memory.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/system/memory.c b/system/memory.c
> index 74cd73ebc78b..949f5016a68d 100644
> --- a/system/memory.c
> +++ b/system/memory.c
> @@ -2638,7 +2638,10 @@ static void 
> memory_region_update_container_subregions(MemoryRegion *subregion)
>
>      memory_region_transaction_begin();
>
> -    memory_region_ref(subregion);
> +    if (mr->owner != subregion->owner) {
> +        memory_region_ref(subregion);
> +    }
> +
>      QTAILQ_FOREACH(other, &mr->subregions, subregions_link) {
>          if (subregion->priority >= other->priority) {
>              QTAILQ_INSERT_BEFORE(other, subregion, subregions_link);
> @@ -2696,7 +2699,11 @@ void memory_region_del_subregion(MemoryRegion *mr,
>          assert(alias->mapped_via_alias >= 0);
>      }
>      QTAILQ_REMOVE(&mr->subregions, subregion, subregions_link);
> -    memory_region_unref(subregion);
> +
> +    if (mr->owner != subregion->owner) {
> +        memory_region_unref(subregion);
> +    }
> +
>      memory_region_update_pending |= mr->enabled && subregion->enabled;
>      memory_region_transaction_commit();
>  }

I was having another look at leaks this week, and I tried
this patch to see how many of the leaks I was seeing it
fixed. I found however that for arm it results in an
assertion when the device-introspection-test exercises
the "imx7.analog" device. By-hand repro:

$ ./build/asan/qemu-system-aarch64 -display none -machine none -accel
qtest -monitor stdio
==712838==WARNING: ASan doesn't fully support makecontext/swapcontext
functions and may produce false positives in some cases!
QEMU 9.0.92 monitor - type 'help' for more information
(qemu) device_add imx7.analog,help
qemu-system-aarch64: ../../system/memory.c:1777: void
memory_region_finalize(Object *): Assertion `!mr->container' failed.
Aborted (core dumped)

It may be well be that this is a preexisting bug that's only
exposed by this refcount change causing us to actually try
to dispose of the memory regions.

I think that what's happening here is that the device
object has multiple MemoryRegions, each of which is a child
QOM property. One of these MRs is a "container MR", and the
other three are actual-content MRs which the device put into
the container when it created them. When we deref the device,
we go through all the child QOM properties unparenting and
unreffing them. However, there's no particular ordering
here, and it happens that we try to unref one of the
actual-content MRs first. That MR is still inside the
container MR, so we hit the assert. If we had happened to
unref the container MR first then memory_region_finalize()
would have removed all the subregions from it, avoiding
the problem.

I'm not sure what the best fix would be here -- that assert
is there as a guard that the region isn't visible in
any address space, so maybe it needs to be made a bit
cleverer about the condition it checks? e.g. in this
example although mr->container is not NULL,
mr->container->container is NULL. Or we could check
whether the mr->container->owner is the same as the
mr->owner and allow a non-NULL mr->container in that case.
I don't know this subsystem well enough so I'm just
making random stabs here, though.

thanks
-- PMM

Reply via email to