On 06/15/2018 08:33 AM, Vladimir Sementsov-Ogievskiy wrote:
> 15.06.2018 15:06, Dr. David Alan Gilbert wrote:
>> * Vladimir Sementsov-Ogievskiy (vsement...@virtuozzo.com) wrote:
>>> Invalidate cache before source start in case of failed migration.
>>>
>>> Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@virtuozzo.com>
>> Why doesn't the code at the bottom of migration_completion,
>> fail_invalidate: and the code in migrate_fd_cancel handle this?
>>
>> What case did you see it in that those didn't handle?
>> (Also I guess it probably should set s->block_inactive = false)
>
> on source I see:
>
> 81392@1529065750.766289:migrate_set_state new state 7
> 81392@1529065750.766330:migration_thread_file_err
> 81392@1529065750.766332:migration_thread_after_loop
>
> so, we are leaving loop on
> if (qemu_file_get_error(s->to_dst_file)) {
> migrate_set_state(&s->state, current_active_state,
> MIGRATION_STATUS_FAILED);
> trace_migration_thread_file_err();
> break;
> }
>
> and skip migration_completion()
>
OK, so you're saying that in migration_thread;
`
thr_error = migration_detect_error(s);
if (thr_error == MIG_THR_ERR_FATAL) {
/* Stop migration */
break;
}
`
is triggering, so then we jump right away to
`
trace_migration_thread_after_loop();
migration_iteration_finish(s);
`
and in so doing, we fail before we get a chance for
migration_iteration_run to call migration_completion, which was
otherwise the mechanism that invalidated the images.
Seems like a legitimate concern that early failures versus late failures
trigger different error pathways.
David?
--js
>
>>
>> Dave
>>
>>> ---
>>>
>>> migration/migration.c | 9 ++++++++-
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/migration/migration.c b/migration/migration.c
>>> index 1e99ec9b7e..8f39e0dc02 100644
>>> --- a/migration/migration.c
>>> +++ b/migration/migration.c
>>> @@ -2806,7 +2806,14 @@ static void
>>> migration_iteration_finish(MigrationState *s)
>>> case MIGRATION_STATUS_FAILED:
>>> case MIGRATION_STATUS_CANCELLED:
>>> if (s->vm_was_running) {
>>> - vm_start();
>>> + Error *local_err = NULL;
>>> + bdrv_invalidate_cache_all(&local_err);
>>> + if (local_err) {
>>> + error_reportf_err(local_err, "Can't invalidate disks
>>> before "
>>> + "source vm start");
>>> + } else {
>>> + vm_start();
>>> + }
>>> } else {
>>> if (runstate_check(RUN_STATE_FINISH_MIGRATE)) {
>>> runstate_set(RUN_STATE_POSTMIGRATE);
>>> --
>>> 2.11.1
>>>
>> --
>> Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
>
>