On Wed, Oct 23, 2024 at 06:03:36PM -0300, Fabiano Rosas wrote:
> Right, but consider that it's easy enough for someone to look for a
> global object to write in the code, find the wrong one and just use
> it. It would then be up to reviewers to catch the mistake.
> 
> Look at this:
> 
>   bool migration_is_idle(void)
>   {
>       MigrationState *s = current_migration;
>       ...
>   }

This is actually something I overlooked (and just fixed it up before this
email..).. so yah I think it's too trivial indeed.

>   
>   bool migration_is_active(void)
>   {
>       MigrationState *s = global_migration;
>       ...
>   }
> 
> Of course we'll get this wrong at some point.
> 
> Also, if in the future someone decides to call migration_is_idle() from
> outside migration/, we'd be not only adding the burden to change the
> variable, but also the functional change at shutdown time.
> 
> If we're to carry on with this idea, we'll need to play with some
> headers to properly isolate these two usages. Something like:
> 
> migration.h:
>   bool migration_is_active_internal(MigrationState *s);
>   void set_global_obj(MigrationState *obj);
> 
> migration.c:
>   bool migration_is_active_internal(MigrationState *s)
>   {
>   }
> 
>   void migration_object_init(void)
>   {
>       set_global_object(MIGRATION_OBJ(object_new(TYPE_MIGRATION)));
>   }
> 
> exports.h:
>   #include migration.h
>   bool migration_is_active(void);
> 
> exports.c:
>   static MigrationState *global_migration;
>   
>   void set_global_obj(MigrationState *obj)
>   {
>       global_migration = obj;
>   }
>   
>   bool migration_is_active(void)
>   {
>       return migration_is_active_internal(global_migration);
>   }
> 
> That way, the internal code never calls the exported functions with
> global, always with current.

Yes if so we'll need this if to make it clean.

Now I'm thinking whether we can make it easier, by using a mutex to protect
current_migration accesses, but only when outside migration/ (so migration
thread's refcount will make sure the migration code has safe access to the
variable).  Tricky, but maybe working.

The 1st thing we may want to fix is, we never clear current_migration, but
QEMU does free the object after all refcounts released.. it means when
accessed after freed it's UAF and it'll be harder to debug (comparing to
NULL deref).  So the 1st thing is to clear current_migration properly,
probably.

The only possible place to reset current_migration is in finalize() (as
proposed in this series), because it can be either main / migration thread
that does the last unref and invoke finalize(), there's no function we can
clear it manually besides the finalize().

But then this leads me to the QOM unit test crash, just noticing that QEMU
has device-introspect-test which can dynamically create a migration object
via qom-list-properties QMP command.

We'll need to think about how to declare a class that can only be initiated
once.  Things like INTERFACE_INTERNAL can potentially hide a class (per we
talked just now), but you were right that could make qom-set harder to be
applicable to migration some day.  The other approach can be
INTERFACE_SINGLETON so it should still apply to qom-set /
qom-list-properties / ... but it can't be created more than once.  Either
of them needs more thoughts as prior work to allow mutex to protect
current_migration.

I'll think about it tomorrow to see what I'll propose next.

-- 
Peter Xu


Reply via email to