On 08/17/2011 09:48 PM, Alon Levy wrote:
On Wed, Aug 17, 2011 at 10:19:27AM +0200, David Jaša wrote:
On 17.8.2011 09:47, Yonit Halperin wrote:
On 08/17/2011 01:54 AM, Marc-André Lureau wrote:
Hi
I am also unfamiliar with the migration code, in particular the qemu
-> qemu part. It seems to me that no spice transmission occurs and
only guest memory is transferred. Is that correct? How is the state of
the channels restored? Perhaps it doesn't need any state transmission,
and the connection of a client to the target is enough to revive the
channels after migration.
Hi,
You are right. No Spice transmission occurs. When we supported seamless
migration we did transfer spice info. This was done via the client: each
source server channel sent opaque migration data to the corresponding
channel in the client. The client channel passed this data to the dest
server and the dest server restored its state from this data.
We don't do it anymore since the current migration notifiers, afaik,
don't allow us to hold the target vm stopped till we make sure the
target server has restored its state. We can't prevent the target vm
from starting => we have a synchronization problem. That is why
switch_host makes the client reconnect from scratch to the target.
Why can't we keep VM stopped? It seems to me that this is precisely the
root cause of https://bugzilla.redhat.com/show_bug.cgi?id=730645 , so
until we somehow solve it, our migration UX will be horrible.
I'm not sure I understand the question. The source VM gets stopped
because it is part of the migration process, irrespective of spice.
There are two parts to a migration: the live part, where the source VM
is running and the dest is stopped, and the closing/finishing part (not
sure what the catchy name for it is), where both VMs are stopped (this
is required to send the last pages from the source guest); when it is
done, the destination VM is started.
So back to your question, the root cause of the regression is that we
used to have a secondary channel through the client (the primary
channel is src qemu -> dst qemu), i.e. src spice-server -> client ->
target spice-server. That channel went away during the upstreaming
process that Gerd led. So now we are left only with the existing src
qemu -> dst qemu channel.
Hi,
This is not accurate. We didn't have such special channels. We used our
channels as is, e.g., the src display channel sends migration_data to
the client, and the client sends it through the dest display channel to
the dest server, which restores the display channel state using this
data.
We did have a special channel between the src spice server and the
target spice server. That channel was used to transfer ticketing and
other authentication information. Today we bypass the use of this
channel with the monitor command 'client_migrate_info'.
Afaik, the real missing link for seamless migration is the ability to
hold the target vm stopped till we make sure the target spice server
restored its state (from the data it got from the client). And as I
stated above, if we can't prevent the target vm from starting => we can
have a synchronization problem.
I'm not familiar enough with the current migration code in qemu-devel to
know if it is possible and acceptable to hold the target vm when
notifying about migration completion.
Maybe it's possible to fix that, but it will take more thought, and I
think what Yonit is proposing will bring a large part of it, namely
reducing the reconnect latency. The screen resize is another matter and
can be solved purely in the client. The only thing that is left is that
we are losing the glzdict and cache state.
I think we are also losing the playback/record state, and all the surfaces.
David
Thanks a lot Yonit for your clear mail, it helps a lot.
:)
----- Original Message -----
qemu
=====
ui/spice-core::migration_state_notifier should handle MIG_STATE_ACTIVE
(for migration start) and MIG_STATE_ERROR/CANCELLED by calling
spice_server_migrate_start and spice_server_migrate_end,
respectively.
These callbacks are currently declared in spice-experimental.h.
Contrary to Christophe, I don't think we should be afraid of using
those functions, which have not been supported or used for quite some
time, afaik.
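The dispatch described above could be sketched roughly as follows. Only
the spice_server_migrate_start/end names come from spice-experimental.h;
the MIG_STATE_* values, the counters, and the stub bodies are
illustrative stand-ins for qemu's real plumbing:

```c
#include <assert.h>

/* Illustrative migration states; qemu defines the real MIG_STATE_* values. */
enum mig_state {
    MIG_STATE_ACTIVE,
    MIG_STATE_COMPLETED,
    MIG_STATE_ERROR,
    MIG_STATE_CANCELLED
};

/* Stand-ins for the spice-experimental.h entry points. */
static int migrate_started, migrate_ended_ok, migrate_ended_aborted;
static void spice_server_migrate_start(void) { migrate_started++; }
static void spice_server_migrate_end(int completed)
{
    if (completed) migrate_ended_ok++; else migrate_ended_aborted++;
}

/* Proposed dispatch for ui/spice-core::migration_state_notifier. */
static void migration_state_notifier(enum mig_state state)
{
    switch (state) {
    case MIG_STATE_ACTIVE:      /* migration has started */
        spice_server_migrate_start();
        break;
    case MIG_STATE_COMPLETED:   /* guest now runs on the target */
        spice_server_migrate_end(1);
        break;
    case MIG_STATE_ERROR:
    case MIG_STATE_CANCELLED:   /* migration aborted, stay on the source */
        spice_server_migrate_end(0);
        break;
    }
}
```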
spice-server
=============
(A) Migration source side
* reds::spice_server_migrate_start:
send SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
We can't use SPICE_MSG_MAIN_MIGRATE_BEGIN since it doesn't
include the certificate information we need. But we can change it
to be identical to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
For the same reason, I guess we can break messages, as long as proper
version/caps checks are performed before client & server receive them.
* reds::spice_server_migrate_end(completed)
- if (completed) => send SPICE_MSG_MIGRATE (flags=0) to all
connected channels (via Channel->migrate).
- if (!completed) => send SPICE_MSG_MAIN_MIGRATE_CANCEL
flags=0 == No NEED_FLUSH or DATA_TRANSFER. ok
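The source-side completion logic in (A) amounts to something like the
sketch below. The message names mirror the ones in the mail, but the
numeric ids, the channel struct, and the "send" representation are made
up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative ids; the real ones live in spice-protocol. */
enum { MSG_NONE = -1, MSG_MIGRATE = 0, MSG_MAIN_MIGRATE_CANCEL = 1 };
enum { MIGRATE_FLAGS_NONE = 0 }; /* no NEED_FLUSH, no DATA_TRANSFER */

struct channel {
    int last_msg;   /* last message "sent" on this channel */
    int last_flags;
};

/* Sketch of reds::spice_server_migrate_end(completed): on success,
 * send SPICE_MSG_MIGRATE (flags=0) to every connected channel (via
 * Channel->migrate); on abort, send the cancel on the main channel. */
static void migrate_end(struct channel *channels, size_t n_channels,
                        struct channel *main_channel, int completed)
{
    if (completed) {
        for (size_t i = 0; i < n_channels; i++) {
            channels[i].last_msg = MSG_MIGRATE;
            channels[i].last_flags = MIGRATE_FLAGS_NONE;
        }
    } else {
        main_channel->last_msg = MSG_MAIN_MIGRATE_CANCEL;
    }
}
```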
(B) Migration target side
reds identifies it is a migration target when the client connects with
a connection id != 0.
When linking to migrated channels, a special treatment is required
(and not the support that is currently coded, since that is for
seamless migration).
For example:
- For the main channel, (1) a network test is not required, and (2)
there is no need for SPICE_MSG_MAIN_INIT, but rather
SPICE_MSG_MAIN_MULTI_MEDIA_TIME and SPICE_MSG_MAIN_MOUSE_MODE. This way
we will also save all the agent work we perform when initializing the
main channel in the client.
- For the display channel, we mustn't call
display_channel_wait_for_init immediately upon link, but we should
expect the init message to arrive later (for setting cache and
dictionary sizes).
- For the playback channel: we still need to send the current playback
status, as opposed to seamless migration.
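The target-side detection in (B) boils down to checking the connection
id in the link message. The struct here is an illustrative stand-in,
not the real spice-protocol link header:

```c
#include <assert.h>

/* Illustrative link-message header; the real layout is defined in
 * spice-protocol. */
struct link_message {
    unsigned int connection_id; /* 0 on a fresh connect; the source's
                                   session id when migrating */
};

/* reds-side check: a nonzero connection id marks a migration target,
 * so the fresh-connect path (network test, SPICE_MSG_MAIN_INIT,
 * display_channel_wait_for_init) should be skipped. */
static int is_migration_target(const struct link_message *link)
{
    return link->connection_id != 0;
}
```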
It looks to me like you would like to revive the seamless migration.
see the above explanation about seamless.
Wouldn't it be simpler to just leave connection id == 0 for now, and
do regular connection? Wouldn't that also work like "switch-host"?
Since we want to execute the linking to the target before the logical
target spice session starts, it is problematic not to let the target
server know it is a migration target: one of the problems is that upon
connection, the server display channel expects an INIT message from the
client. If a timeout occurs, it disconnects. This is not desirable when
this is only the initial connection (the one triggered by
migrate_start), and the actual communication will start only when
migration ends.
Besides, it will also save us time and spare us other artifacts that
are a result of executing a fresh connection, e.g., we can avoid the
network bandwidth test. Though I guess the bandwidth can change from
one host to another... But performing the network test upon linking is
more complicated since the client doesn't listen yet on the
socket... Maybe we can neglect this for now and assume the same
bandwidth for the new host?
Spice client
============
(A) SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
client connects to the target, but still stays connected to the
source host. It doesn't listen to the target sockets.
The link message to the target contains the connection_id of the
connection to the source (this allows the target server to identify
itself as a migration target).
For this part we can use most of the code in the class Migrate in
red_client.cpp
(B) SPICE_MSG_MIGRATE
We can use the code in red_channel::handle_migrate to switch the
channels and start listening to the target.
The difference is that we should implement the virtual method
RedChannel::on_migrate differently.
(1) Each channel must reset all the dynamic data that depends on
the server. For example: the display channel
needs to destroy all the surfaces and reset the caches and
dictionary; the playback and record channels need to stop
the current session, if there is an active one, etc.
(2) Each channel should send to the server the initialization
information it normally sends in RedChannel::on_connect.
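Steps (1) and (2) of (B) could be dispatched as below. The client
itself is C++ and RedChannel::on_migrate is the real virtual method;
this C sketch uses function pointers as stand-ins for the virtuals, and
all names besides on_migrate are hypothetical:

```c
#include <assert.h>

/* Stand-in for the client's RedChannel; the function pointers play
 * the role of the C++ virtual methods. */
struct red_channel {
    void (*reset_dynamic_state)(struct red_channel *); /* step (1) */
    void (*send_init_info)(struct red_channel *);      /* step (2) */
};

/* Sketch of the proposed RedChannel::on_migrate: first drop all
 * server-dependent state (surfaces, caches, glz dictionary, active
 * playback/record sessions), then re-send what on_connect normally
 * sends. */
static void on_migrate(struct red_channel *channel)
{
    channel->reset_dynamic_state(channel);
    channel->send_init_info(channel);
}
```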
(C) SPICE_MSG_MAIN_MIGRATE_CANCEL
disconnects all the new channels. This code is already implemented
in spice-client.
spice-protocol(?)/Backward compatibility
=========================================
should we bump the spice protocol version, or use capabilities? (if we
change the SPICE_MSG_MAIN_MIGRATE_BEGIN structure, there is no
question).
New Spice-Server with old client will send only
SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST, and only when migration completes
(same as today).
New client with old Spice-server will disconnect the source and will
connect the target upon receiving SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
(same as today).
Preferably, I would introduce SPICE_MSG_MAIN_MIGRATE_BEGIN2 etc. and
deprecate the older messages. From what I understand, we now prefer
using caps rather than bumping the protocol version.
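A caps-based check would be the usual bitmask test over the capability
words exchanged at link time. The capability name and bit index below
are purely hypothetical; only the fallback behavior follows the mail:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bit for the new migration messages. */
#define MAIN_CAP_MIGRATE_SWITCH_HOST2 4

/* Generic test over the 32-bit capability words; the server would send
 * the new messages only if the client announced the bit, and fall back
 * to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST (current behavior) otherwise. */
static int caps_has(const unsigned int *caps, size_t num_words, int cap)
{
    size_t word = (size_t)cap / 32;
    return word < num_words && (caps[word] & (1u << (cap % 32))) != 0;
}
```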
o.k. But when do we actually change the protocol version?
cheers
_______________________________________________
Spice-devel mailing list
Spice-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/spice-devel
--
David Jaša