Peter Xu <pet...@redhat.com> writes:

> On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote:
>> We've been up until now cleaning up any file descriptors that have
>> been passed into QEMU and never duplicated[1,2]. A file descriptor
>> without duplicates indicates that no part of QEMU has made use of
>> it. This approach is starting to show some cracks now that we're
>> starting to consume fds from the migration code:
>> 
>> - Doing cleanup every time the last monitor connection closes works to
>>   reap unused fds, but also has the side effect of forcing the
>>   management layer to pass the file descriptors again in case of a
>>   disconnect/re-connect, if that happened to be the only monitor
>>   connection.
>> 
>>   Another side effect is that removing an fd with qmp_remove_fd() is
>>   effectively delayed until the last monitor connection closes.
>> 
>>   The reliance on mon_refcount is also problematic because it's racy.
>> 
>> - Checking runstate_is_running() skips the cleanup unless the VM is
>>   running and avoids premature cleanup of the fds, but also has the
>>   side effect of blocking the legitimate removal of an fd via
>>   qmp_remove_fd() if the VM happens to be in another state.
>> 
>>   This affects qmp_remove_fd() and qmp_query_fdsets() in particular
>>   because requesting a removal at a bad time (guest stopped) might
>>   cause an fd to never be removed, or to be removed at a much later
>>   point in time, causing the query command to continue showing the
>>   supposedly removed fd/fdset.
>> 
>> Note that file descriptors that *have* been duplicated are owned by
>> the code that uses them and will be removed after qemu_close() is
>> called. Therefore we've decided that the best course of action to
>> avoid the undesired side-effects is to stop managing non-duplicated
>> file descriptors.
>> 
>> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect")
>> 2- ebe52b592d ("monitor: Prevent removing fd from set during init")
>> 
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>>  monitor/fds.c              | 15 ++++++++-------
>>  monitor/hmp.c              |  2 --
>>  monitor/monitor-internal.h |  1 -
>>  monitor/qmp.c              |  2 --
>>  4 files changed, 8 insertions(+), 12 deletions(-)
>> 
>> diff --git a/monitor/fds.c b/monitor/fds.c
>> index 101e21720d..f7b98814fa 100644
>> --- a/monitor/fds.c
>> +++ b/monitor/fds.c
>> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, 
>> Error **errp)
>>  
>>  static void monitor_fdset_free(MonFdset *mon_fdset)
>>  {
>> +    /*
>> +     * Only remove an empty fdset. The fds are owned by the user and
>> +     * should have been removed with qmp_remove_fd(). The dup_fds are
>> +     * owned by QEMU and should have been removed with qemu_close().
>> +     */
>>      if (QLIST_EMPTY(&mon_fdset->fds) && QLIST_EMPTY(&mon_fdset->dup_fds)) {
>>          QLIST_REMOVE(mon_fdset, next);
>>          g_free(mon_fdset);
>> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset)
>>      MonFdsetFd *mon_fdset_fd_next;
>>  
>>      QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, 
>> mon_fdset_fd_next) {
>> -        if ((mon_fdset_fd->removed ||
>> -                (QLIST_EMPTY(&mon_fdset->dup_fds) && mon_refcount == 0)) &&
>> -                runstate_is_running()) {
>> +        if (mon_fdset_fd->removed) {
>
> I don't know the code well, but I like it.
>
>>              monitor_fdset_fd_free(mon_fdset_fd);
>>          }
>>      }
>> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void)
>>  
>>      QEMU_LOCK_GUARD(&mon_fdsets_lock);
>>      QLIST_FOREACH_SAFE(mon_fdset, &mon_fdsets, next, mon_fdset_next) {
>> -        monitor_fdset_cleanup(mon_fdset);
>> +        monitor_fdset_free(mon_fdset);
>>      }
>>  }
>>  
>> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd)
>>              if (mon_fdset_fd_dup->fd == dup_fd) {
>>                  QLIST_REMOVE(mon_fdset_fd_dup, next);
>>                  g_free(mon_fdset_fd_dup);
>> -                if (QLIST_EMPTY(&mon_fdset->dup_fds)) {
>> -                    monitor_fdset_cleanup(mon_fdset);
>> -                }
>> +                monitor_fdset_free(mon_fdset);
>
> This and above changes are not crystal clear to me where the _cleanup()
> does extra check "removed" and clean those fds.
>
> I think it'll make it easier for me to understand if this one and the next
> are squashed together.  But maybe it's simply because I didn't fully
> understand.

monitor_fdsets_cleanup() was doing three things previously:

1- Remove the removed=true fds. This is weird, but ok.

2- Remove fds from an fdset that has an empty dup_fds list, but only if
the guest is not running and the monitor is closed. This is problematic.

3- Remove/free fdsets that have become empty due to the above
removals. This is ok.

This patch covers (2), because that is the only change that has a
complex reasoning behind it and we need to document that without
conflating it with the rest of the changes.

Since (3) is still a reasonable thing to do, this patch merely keeps it
around, but using the helpers introduced in the previous patch.

The next patch removed the need for (1), making the removal immediate
instead of delayed. It has it's own much less complex reasoning, which
is: "we don't need to wait to remove the fd".

I hope this clarifies a bit. I would prefer if we kept this and the next
patch separate to avoid having a single patch that does too many
things. I hope to be as explicit as possible with the reason behind
these changes to avoid putting future people in the situation that we
are in now, i.e. having to guess at the reasons behind this code.

If you still think we should squash or if you have more questions, let
me know.

>>                  return;
>>              }
>>          }
>> diff --git a/monitor/hmp.c b/monitor/hmp.c
>> index 69c1b7e98a..460e8832f6 100644
>> --- a/monitor/hmp.c
>> +++ b/monitor/hmp.c
>> @@ -1437,11 +1437,9 @@ static void monitor_event(void *opaque, QEMUChrEvent 
>> event)
>>              monitor_resume(mon);
>>          }
>>          qemu_mutex_unlock(&mon->mon_lock);
>> -        mon_refcount++;
>>          break;
>>  
>>      case CHR_EVENT_CLOSED:
>> -        mon_refcount--;
>>          monitor_fdsets_cleanup();
>>          break;
>>  
>> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h
>> index 252de85681..cb628f681d 100644
>> --- a/monitor/monitor-internal.h
>> +++ b/monitor/monitor-internal.h
>> @@ -168,7 +168,6 @@ extern bool qmp_dispatcher_co_shutdown;
>>  extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
>>  extern QemuMutex monitor_lock;
>>  extern MonitorList mon_list;
>> -extern int mon_refcount;
>>  
>>  extern HMPCommand hmp_cmds[];
>>  
>> diff --git a/monitor/qmp.c b/monitor/qmp.c
>> index a239945e8d..5e538f34c0 100644
>> --- a/monitor/qmp.c
>> +++ b/monitor/qmp.c
>> @@ -466,7 +466,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent 
>> event)
>>          data = qmp_greeting(mon);
>>          qmp_send_response(mon, data);
>>          qobject_unref(data);
>> -        mon_refcount++;
>>          break;
>>      case CHR_EVENT_CLOSED:
>>          /*
>> @@ -479,7 +478,6 @@ static void monitor_qmp_event(void *opaque, QEMUChrEvent 
>> event)
>>          json_message_parser_destroy(&mon->parser);
>>          json_message_parser_init(&mon->parser, handle_qmp_command,
>>                                   mon, NULL);
>> -        mon_refcount--;
>>          monitor_fdsets_cleanup();
>>          break;
>>      case CHR_EVENT_BREAK:
>
> I like this too when mon_refcount can be dropped.  It turns out I like this
> patch and the next a lot, and I hope nothing will break.
>
> Aside, you seem to have forgot removal of the "int mon_refcount" in
> monitor.c.

Yes, I'll fix that. Thanks.

Reply via email to