Peter Xu <pet...@redhat.com> writes:

> On Mon, Feb 19, 2024 at 04:44:57PM -0300, Fabiano Rosas wrote:
>> The QMP command query_migrate might see incorrect throughput numbers
>> if it runs after we've set the migration completion status but before
>> migration_calculate_complete() has updated s->total_time and s->mbps.
>> 
>> The migration status would show COMPLETED, but the throughput value
>> would be the one from the last iteration and not the one from the
>> whole migration. This will usually be a larger value due to the time
>> period being smaller (one iteration).
>> 
>> Move migration_calculate_complete() earlier so that the status
>> MIGRATION_STATUS_COMPLETED is only emitted after the final counters
>> update.
>> 
>> Signed-off-by: Fabiano Rosas <faro...@suse.de>
>> ---
>> CI run: https://gitlab.com/farosas/qemu/-/pipelines/1182405776
>> ---
>>  migration/migration.c | 10 ++++++----
>>  1 file changed, 6 insertions(+), 4 deletions(-)
>> 
>> diff --git a/migration/migration.c b/migration/migration.c
>> index ab21de2cad..7486d59da0 100644
>> --- a/migration/migration.c
>> +++ b/migration/migration.c
>> @@ -102,6 +102,7 @@ static int migration_maybe_pause(MigrationState *s,
>>                                   int new_state);
>>  static void migrate_fd_cancel(MigrationState *s);
>>  static bool close_return_path_on_source(MigrationState *s);
>> +static void migration_calculate_complete(MigrationState *s);
>>  
>>  static void migration_downtime_start(MigrationState *s)
>>  {
>> @@ -2746,6 +2747,7 @@ static void migration_completion(MigrationState *s)
>>          migrate_set_state(&s->state, MIGRATION_STATUS_ACTIVE,
>>                            MIGRATION_STATUS_COLO);
>>      } else {
>> +        migration_calculate_complete(s);
>>          migrate_set_state(&s->state, current_active_state,
>>                            MIGRATION_STATUS_COMPLETED);
>>      }
>> @@ -2784,6 +2786,7 @@ static void bg_migration_completion(MigrationState *s)
>>          goto fail;
>>      }
>>  
>> +    migration_calculate_complete(s);
>>      migrate_set_state(&s->state, current_active_state,
>>                        MIGRATION_STATUS_COMPLETED);
>>      return;
>> @@ -2993,12 +2996,15 @@ static void 
>> migration_calculate_complete(MigrationState *s)
>>      int64_t end_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
>>      int64_t transfer_time;
>>  
>> +    /* QMP could read from these concurrently */
>> +    bql_lock();
>>      migration_downtime_end(s);
>>      s->total_time = end_time - s->start_time;
>>      transfer_time = s->total_time - s->setup_time;
>>      if (transfer_time) {
>>          s->mbps = ((double) bytes * 8.0) / transfer_time / 1000;
>>      }
>> +    bql_unlock();
>
> The lock is not needed?
>
> AFAIU that was needed because of things like runstate_set() rather than
> setting of these fields.
>

Don't we need to keep the total_time and mbps update atomic? Otherwise
query-migrate might see (say) total_time=0 and mbps=<correct value> or
total_time=<correct value> and mbps=<previous value>.

Also, what orders s->mbps update before the s->state update? I'd say we
should probably hold the lock around the whole total_time,mbps,state
update.

I'm not entirely sure, what do you think?

> See migration_update_counters() where it also updates mbps without holding
> a lock.

Here it might be less important since it's the middle of the migration,
there will proabably be more than one query-migrate which would see the
correct values.

>
> Other than that it looks all good.
>
>>  }
>>  
>>  static void update_iteration_initial_status(MigrationState *s)
>> @@ -3145,7 +3151,6 @@ static void migration_iteration_finish(MigrationState 
>> *s)
>>      bql_lock();
>>      switch (s->state) {
>>      case MIGRATION_STATUS_COMPLETED:
>> -        migration_calculate_complete(s);
>>          runstate_set(RUN_STATE_POSTMIGRATE);
>>          break;
>>      case MIGRATION_STATUS_COLO:
>> @@ -3189,9 +3194,6 @@ static void 
>> bg_migration_iteration_finish(MigrationState *s)
>>      bql_lock();
>>      switch (s->state) {
>>      case MIGRATION_STATUS_COMPLETED:
>> -        migration_calculate_complete(s);
>> -        break;
>> -
>>      case MIGRATION_STATUS_ACTIVE:
>>      case MIGRATION_STATUS_FAILED:
>>      case MIGRATION_STATUS_CANCELLED:
>> -- 
>> 2.35.3
>> 

Reply via email to