Markus Armbruster <arm...@redhat.com> writes:

> Peter, Fabiano, I'd like to hear your opinion on the issue discussed
> below.
>
> Avihai Horon <avih...@nvidia.com> writes:
>
>> On 02/05/2024 13:22, Joao Martins wrote:
>>> External email: Use caution opening links or attachments
>>>
>>>
>>> On 01/05/2024 13:28, Avihai Horon wrote:
>>>> On 01/05/2024 14:50, Joao Martins wrote:
>>>>> External email: Use caution opening links or attachments
>>>>>
>>>>>
>>>>> On 30/04/2024 06:16, Avihai Horon wrote:
>>>>>> Emit VFIO device migration state change QAPI event when a VFIO device
>>>>>> changes its migration state. This can be used by management applications
>>>>>> to get updates on the current state of the VFIO device for their own
>>>>>> purposes.
>>>>>>
>>>>>> A new per VFIO device capability, "migration-events", is added so events
>>>>>> can be enabled only for the required devices. It is disabled by default.
>>>>>>
>>>>>> Signed-off-by: Avihai Horon<avih...@nvidia.com>
>>>>>> ---
>>>>>>    include/hw/vfio/vfio-common.h |  1 +
>>>>>>    hw/vfio/migration.c           | 44 +++++++++++++++++++++++++++++++++++
>>>>>>    hw/vfio/pci.c                 |  2 ++
>>>>>>    3 files changed, 47 insertions(+)
>>>>>>
>>>>>> diff --git a/include/hw/vfio/vfio-common.h 
>>>>>> b/include/hw/vfio/vfio-common.h
>>>>>> index b9da6c08ef..3ec5f2425e 100644
>>>>>> --- a/include/hw/vfio/vfio-common.h
>>>>>> +++ b/include/hw/vfio/vfio-common.h
>>>>>> @@ -115,6 +115,7 @@ typedef struct VFIODevice {
>>>>>>        bool no_mmap;
>>>>>>        bool ram_block_discard_allowed;
>>>>>>        OnOffAuto enable_migration;
>>>>>> +    bool migration_events;
>>>>>>        VFIODeviceOps *ops;
>>>>>>        unsigned int num_irqs;
>>>>>>        unsigned int num_regions;
>>>>>> diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c
>>>>>> index 06ae40969b..6bbccf6545 100644
>>>>>> --- a/hw/vfio/migration.c
>>>>>> +++ b/hw/vfio/migration.c
>>>>>> @@ -24,6 +24,7 @@
>>>>>>    #include "migration/register.h"
>>>>>>    #include "migration/blocker.h"
>>>>>>    #include "qapi/error.h"
>>>>>> +#include "qapi/qapi-events-vfio.h"
>>>>>>    #include "exec/ramlist.h"
>>>>>>    #include "exec/ram_addr.h"
>>>>>>    #include "pci.h"
>>>>>> @@ -80,6 +81,46 @@ static const char *mig_state_to_str(enum
>>>>>> vfio_device_mig_state state)
>>>>>>        }
>>>>>>    }
>>>>>>
>>>>>> +static VFIODeviceMigState
>>>>>> +mig_state_to_qapi_state(enum vfio_device_mig_state state)
>>>>>> +{
>>>>>> +    switch (state) {
>>>>>> +    case VFIO_DEVICE_STATE_STOP:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_STOP;
>>>>>> +    case VFIO_DEVICE_STATE_RUNNING:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_RUNNING;
>>>>>> +    case VFIO_DEVICE_STATE_STOP_COPY:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_STOP_COPY;
>>>>>> +    case VFIO_DEVICE_STATE_RESUMING:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_RESUMING;
>>>>>> +    case VFIO_DEVICE_STATE_RUNNING_P2P:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_RUNNING_P2P;
>>>>>> +    case VFIO_DEVICE_STATE_PRE_COPY:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_PRE_COPY;
>>>>>> +    case VFIO_DEVICE_STATE_PRE_COPY_P2P:
>>>>>> +        return QAPI_VFIO_DEVICE_MIG_STATE_PRE_COPY_P2P;
>>>>>> +    default:
>>>>>> +        g_assert_not_reached();
>>>>>> +    }
>>>>>> +}
>>>>>> +
>>>>>> +static void vfio_migration_send_state_change_event(VFIODevice *vbasedev)
>>>>>> +{
>>>>>> +    VFIOMigration *migration = vbasedev->migration;
>>>>>> +    const char *id;
>>>>>> +    Object *obj;
>>>>>> +
>>>>>> +    if (!vbasedev->migration_events) {
>>>>>> +        return;
>>>>>> +    }
>>>>>> +
>>>>> Shouldn't this leap frog migrate_events() capability instead of 
>>>>> introducing its
>>>>> vfio equivalent i.e.
>>>>>
>>>>>           if (!migrate_events()) {
>>>>>               return;
>>>>>           }
>>>>>
>>>>> ?
>>>>
>>>> I used a per VFIO device cap so the events can be fine tuned for each 
>>>> device
>>>> (maybe one device should send events while the other not).
>>>> This gives the most flexibility and I don't think it complicates the
>>>> configuration (one downside, though, is that it can't be enabled/disabled
>>>> dynamically during runtime).
>>>>
>>> Right.
>>>
>>>> I don't think events add much overhead, so if you prefer a global cap, I 
>>>> can
>>>> change it.
>>>> However, I'm not sure overloading the existing migrate_events() is valid?
>>>>
>>> migration 'events' capability just means we will have some migration events
>>> emited via QAPI monitor for: 1) general global status and 2) for each 
>>> migration
>>> pass (both with different event names=.
>>
>> Yes, it's already overloaded.
>>
>> In migration QAPI it says: "@events: generate events for each migration 
>> state change (since 2.4)".
>> This only refers to the MIGRATION event AFAIU.
>>
>> Later on (in QEMU 2.6), MIGRATION_PASS event was added and the events cap 
>> was overloaded for the first time (without changing @events description).
>>
>> Now we want to add yet another use for events capability, the VFIO migration 
>> state change events.
>>
>> I think what bothers me is the @events description, which is not accurate.
>> Maybe it should be changed to "@events: generate migration related events 
>> (since 2.4)"? However, I'm not sure if it's OK to do this.
>>
>>>   So the suggestion was just what feels a
>>> natural extension of that (...)
>>>
>>>>> Applications that don't understand the event string (migration related or 
>>>>> not)
>>>>> will just discard it (AIUI)
>>>
>>> (...) specially because of this as all these events have a different name.
>>>
>>> But overloading might not make sense for others IDK ... it was just a 
>>> suggestion
>>> :) not a strong preference
>>
>> Yes, I get your rationale.
>> I don't have a strong opinion either, so maybe let's see what other people 
>> think.

I don't see the need to tie this to the migration events
capability. Although there's overlap in the terms used, this seems more
like exposing a device feature via QEMU, then a migration event
per-se. The state changes also happen during moments unrelated to
migration (cover letter mentions start/stopping guest), so I assume we'd
like to keep those even if the management layer doesn't want to see
migration events.

Reply via email to