Hailiang Zhang <zhang.zhanghaili...@huawei.com> writes:

> On 2015/12/19 18:02, Markus Armbruster wrote:
>> Copying qemu-block because this seems related to generalising block jobs
>> to background jobs.
>>
>
> Er, this event just used to help users to know what happened to VM with
> COLO FT on. If users get this event, they can make further check what's
> wrong, and decide which side should take over the work.
>
>> zhanghailiang <zhang.zhanghaili...@huawei.com> writes:
>>
>>> If some errors happen during VM's COLO FT stage, it's important to
>>> notify the users of this event. Together with 'colo_lost_heartbeat',
>>> users can intervene in COLO's failover work immediately.
>>> If users don't want to get involved in COLO's failover verdict,
>>> it is still necessary to notify users that we exited COLO mode.
>>>
>>> Cc: Markus Armbruster <arm...@redhat.com>
>>> Cc: Michael Roth <mdr...@linux.vnet.ibm.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com>
>>> Signed-off-by: Li Zhijian <lizhij...@cn.fujitsu.com>
>>> ---
>>> v11:
>>> - Fix several typos found by Eric
>>>
>>> Signed-off-by: zhanghailiang <zhang.zhanghaili...@huawei.com>
>>> ---
>>>  docs/qmp-events.txt | 17 +++++++++++++++++
>>>  migration/colo.c    | 11 +++++++++++
>>>  qapi-schema.json    | 16 ++++++++++++++++
>>>  qapi/event.json     | 17 +++++++++++++++++
>>>  4 files changed, 61 insertions(+)
>>>
>>> diff --git a/docs/qmp-events.txt b/docs/qmp-events.txt
>>> index d2f1ce4..19f68fc 100644
>>> --- a/docs/qmp-events.txt
>>> +++ b/docs/qmp-events.txt
>>> @@ -184,6 +184,23 @@ Example:
>>> Note: The "ready to complete" status is always reset by a BLOCK_JOB_ERROR
>>> event.
>>>
>>> +COLO_EXIT
>>> +---------
>>> +
>>> +Emitted when VM finishes COLO mode due to some errors happening or
>>> +at the request of users.
>>
>> How would the event's recipient distinguish between "due to error" and
>> "at the user's request"?
>>
>
> If they get this event with 'reason' is 'request', it is 'at the
> user's request',
> Or, it will be 'due to error' (The key for 'reason' will be 'error',
> and we have an optional error message which may help to figure out
> what happened.)

For what it's worth, block jobs use separate events BLOCK_JOB_CANCELLED
and BLOCK_JOB_ERROR.

>>> +
>>> +Data:
>>> +
>>> + - "mode": COLO mode, primary or secondary side (json-string)
>>> + - "reason": the exit reason, internal error or external request. (json-string)
>>> + - "error": error message (json-string, operation)
>>> +
>>> +Example:
>>> +
>>> +{"timestamp": {"seconds": 2032141960, "microseconds": 417172},
>>> + "event": "COLO_EXIT", "data": {"mode": "primary", "reason": "request" } }
>>> +
>>
>> Pardon my ignorance again... Does "VM finishes COLO mode" means have
>> some kind of COLO background job, and it just finished for whatever
>> reason?
>>
>
> As above, what i have said.
>
>> If yes, this COLO job could be an instance of the general background job
>> concept we're trying to grow from the existing block job concept.
>>
>> I'm not asking you to rebase your work onto the background job
>> infrastructure, not least for the simple reason that it doesn't exist,
>> yet. But I think it would be fruitful to compare your COLO job
>> management QMP interface with the one we have for block jobs. Not only
>> may that avoid unnecessary inconsistency, it could also help shape the
>> general background job interface.
>>
>
> Interesting, i'm not quite familiar with this block background job
> infrastructure.
> If we consider COLO FT as a background job, we can certainly use it. I
> will have a look at it. Thanks!

Let's avoid unnecessary differences between COLO and block job
interfaces. Later on, we can hopefully make them both use a common
background job infrastructure, and the smaller their differences are,
the easier that'll be.

[...]
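
To make the 'reason' discussion above concrete, here is a rough sketch of how an event recipient might tell "at the user's request" apart from "due to error" with the single-event design proposed in this patch. It only parses an already-received event; the function name classify_colo_exit and the error text in the second sample are made up for illustration and are not part of the patch or of QEMU.

```python
import json


def classify_colo_exit(event):
    """Return (mode, reason, error) for a COLO_EXIT event, else None.

    Hypothetical helper: the field names follow the proposed
    docs/qmp-events.txt text quoted above; "error" is optional.
    """
    if event.get("event") != "COLO_EXIT":
        return None
    data = event.get("data", {})
    return data.get("mode"), data.get("reason"), data.get("error")


# The example event from the patch, verbatim apart from whitespace.
requested = json.loads(
    '{"timestamp": {"seconds": 2032141960, "microseconds": 417172},'
    ' "event": "COLO_EXIT",'
    ' "data": {"mode": "primary", "reason": "request"}}'
)

# An assumed error-case event; the message string is invented.
failed = {
    "event": "COLO_EXIT",
    "data": {"mode": "secondary", "reason": "error",
             "error": "hypothetical failure message"},
}

for ev in (requested, failed):
    mode, reason, error = classify_colo_exit(ev)
    if reason == "request":
        print(f"{mode}: COLO exited at the user's request")
    else:
        print(f"{mode}: COLO exited due to error: {error}")
```

With separate events, as block jobs do with BLOCK_JOB_CANCELLED and BLOCK_JOB_ERROR, the recipient would dispatch on the event name instead of inspecting a 'reason' field.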