Roman Khapov <rkha...@yandex-team.ru> writes:

Hi Roman,

> This is resending of series 20240215082659.1378342-1-rkha...@yandex-team.ru,
> where patch subjects numbers were broken in patch 2/2.
>
> Sometimes, when migration fails, it is hard to find out
> the cause of the problems: you have to grep qemu logs.
> At the same time, there is MIGRATION event, which looks like
> suitable place to hold such error descriptions.

query-migrate after the event is received should be enough for giving
you the failure reason. We have that in error-desc. See commit
c94143e587 ("migration: Display error in query-migrate irrelevant of
status").

>
> To handle situation like this (maybe one day it will be useful
> for other MIGRATION statuses to have additional 'reason' strings),

I find it unlikely. There's no "reason" for making progress except
that's how things work. Only the exceptional (i.e. failure) statuses
would have a reason. Today that's FAILED only, maybe also
POSTCOPY_PAUSED.

> the general optional field 'reason' can be added.
>
> The series proposes next changes:
>
> 1. Adding optional 'reason' field of type str into
>    qapi/migration.json MIGRATION event
>
> 2. Passing some error description as reason for every place, which
>    sets migration state to MIGRATION_STATUS_FAILED
>
> After the series, MIGRATION event will looks like this:
> {"execute": "qmp_capabilities"}
> {"return": {}}
> {"event": "MIGRATION", "data": {"status": "setup"}}
> {"event": "MIGRATION", "data": {"status": "failed", "reason": "Failed to 
> connect to '/tmp/sock.sock': No such file or directory"}}
>
> Roman Khapov (2):
>   qapi/migration.json: add reason to MIGRATION event
>   migration: add error reason for failed MIGRATION events
>
>  migration/colo.c      |   6 +-
>  migration/migration.c | 128 ++++++++++++++++++++++++++++--------------
>  migration/migration.h |   5 +-
>  migration/multifd.c   |  10 ++--
>  migration/savevm.c    |  24 ++++----
>  qapi/migration.json   |   3 +-
>  6 files changed, 112 insertions(+), 64 deletions(-)

Please remember to run make check:

380/383 qemu:qtest+qtest-x86_64 / qtest-x86_64/migration-test ERROR
104.77s killed by signal 6 SIGABRT
―――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――――
stderr: Broken pipe ../tests/qtest/libqtest.c:204: kill_qemu() detected
QEMU death from signal 11 (Segmentation fault) (core dumped)


Most likely one of the new error_setg has broken postcopy recovery. Some
of those paths are not intended to trigger cleanup.

Reply via email to