Peter Xu <pet...@redhat.com> writes:

> Firstly, the "Paused" state was added in the wrong place before. The state
> machine section was describing PostcopyState, rather than MigrationStatus.
> Drop the Paused state descriptions.
>
> Then in the postcopy recover session, add more information on the state
> machine for MigrationStatus in the lines.  Add the new RECOVER_SETUP phase.
>
> Signed-off-by: Peter Xu <pet...@redhat.com>

Reviewed-by: Fabiano Rosas <faro...@suse.de>

> ---
>  docs/devel/migration/postcopy.rst | 31 ++++++++++++++++---------------
>  1 file changed, 16 insertions(+), 15 deletions(-)
>
> diff --git a/docs/devel/migration/postcopy.rst 
> b/docs/devel/migration/postcopy.rst
> index 6c51e96d79..a15594e11f 100644
> --- a/docs/devel/migration/postcopy.rst
> +++ b/docs/devel/migration/postcopy.rst
> @@ -99,17 +99,6 @@ ADVISE->DISCARD->LISTEN->RUNNING->END
>      (although it can't do the cleanup it would do as it
>      finishes a normal migration).
>  
> - - Paused
> -
> -    Postcopy can run into a paused state (normally on both sides when
> -    happens), where all threads will be temporarily halted mostly due to
> -    network errors.  When reaching paused state, migration will make sure
> -    the qemu binary on both sides maintain the data without corrupting
> -    the VM.  To continue the migration, the admin needs to fix the
> -    migration channel using the QMP command 'migrate-recover' on the
> -    destination node, then resume the migration using QMP command 'migrate'
> -    again on source node, with resume=true flag set.
> -
>   - End
>  
>      The listen thread can now quit, and perform the cleanup of migration
> @@ -221,7 +210,8 @@ paused postcopy migration.
>  
>  The recovery phase normally contains a few steps:
>  
> -  - When network issue occurs, both QEMU will go into PAUSED state
> +  - When network issue occurs, both QEMU will go into **POSTCOPY_PAUSED**
> +    migration state.
>  
>    - When the network is recovered (or a new network is provided), the admin
>      can setup the new channel for migration using QMP command
> @@ -229,9 +219,20 @@ The recovery phase normally contains a few steps:
>  
>    - On source host, the admin can continue the interrupted postcopy
>      migration using QMP command 'migrate' with resume=true flag set.
> -
> -  - After the connection is re-established, QEMU will continue the postcopy
> -    migration on both sides.
> +    Source QEMU will go into **POSTCOPY_RECOVER_SETUP** state trying to
> +    re-establish the channels.
> +
> +  - When both sides of QEMU successfully reconnects using a new or fixed up

s/reconnects/reconnect

I can touch it up when queueing

> +    channel, they will go into **POSTCOPY_RECOVER** state, some handshake
> +    procedure will be needed to properly synchronize the VM states between
> +    the two QEMUs to continue the postcopy migration.  For example, there
> +    can be pages sent right during the window when the network is
> +    interrupted, then the handshake will guarantee pages lost in-flight
> +    will be resent again.
> +
> +  - After a proper handshake synchronization, QEMU will continue the
> +    postcopy migration on both sides and go back to **POSTCOPY_ACTIVE**
> +    state.  Postcopy migration will continue.
>  
>  During a paused postcopy migration, the VM can logically still continue
>  running, and it will not be impacted from any page access to pages that

Reply via email to