On 08/17/2011 09:48 PM, Alon Levy wrote:
On Wed, Aug 17, 2011 at 10:19:27AM +0200, David Jaša wrote:
On 17.8.2011 09:47, Yonit Halperin wrote:
On 08/17/2011 01:54 AM, Marc-André Lureau wrote:
Hi
I am also unfamiliar with the migration code, in particular the qemu
-> qemu part. It seems to me that no spice transmission occurs and
only guest memory is transferred. Is that correct? How is the state of
the channels restored? Perhaps it doesn't need any state transmission,
and the connection of a client to the target is enough to revive the
channels after migration.
Hi,
You are right. No Spice transmission occurs. When we supported seamless
migration we did transfer spice info. This was done via the client: each
source server channel sent opaque migration data to the corresponding
channel in the client. The client channel passed this data to the dest
server and the dest server restored its state from this data.
We don't do it anymore since the current migration notifiers, afaik,
don't allow us to hold the target vm stopped till we make sure the
target server has restored its state. We can't prevent the target vm
from starting => we have a synchronization problem. That is why
switch_host makes the client reconnect from scratch to the target.
Why can't we keep VM stopped? It seems to me that this is precisely the
root cause of https://bugzilla.redhat.com/show_bug.cgi?id=730645 , so
until we somehow solve it, our migration UX will be horrible.
I'm not sure I understand the question. The source VM gets stopped
because it is part of the migration process, irrespective of spice.
There are two parts to a migration: the live part, where the source VM
is running and the dest is stopped, and the closing/finishing part (not
sure what the catchy name for it is), where both VMs are stopped (this
is required to send the last pages from the source guest); when it is
done, the destination VM is started.
So back to your question, the root cause of the regression is that we
used to have a secondary channel through the client (the primary
channel is src qemu -> dst qemu), i.e. src spice-server -> client ->
target spice-server. That channel went away during the upstreaming
process that Gerd led. So now we are left only with the existing src
qemu -> dst qemu channel.
Hi,
This is not accurate. We didn't have such special channels. We used our
channels as is, e.g., the src display channel sends migration_data to
the client, and the client sends it through the dest display channel to
the dest server, which restores the display channel state using this
data.
We did have a special channel between the src spice server and the
target spice server. That channel was used to transfer ticketing and
other authentication information. Today we bypass the use of this
channel with the monitor command 'client_migrate_info'.
Afaik, the real missing link for seamless migration is the ability to
hold the target vm stopped till we make sure the target spice server
restored its state (from the data it got from the client). And as I
stated above, if we can't prevent the target vm from starting => we can
have a synchronization problem.
I'm not familiar enough with the current migration code in qemu-devel to
know if it is possible and acceptable to hold the target vm when
notifying about migration completion.
Maybe it's possible to fix that, but it will take more thought, and I
think what Yonit is proposing will bring a large part of it, namely
reducing the reconnect latency. The screen resize is another matter and
can be solved purely in the client. The only thing that is left is that
we are losing the glzdict and cache state.
I think we are also losing the playback/record state, and all the surfaces.
David
Thanks a lot Yonit for your clear mail, it helps a lot.
:)
----- Original Message -----
qemu
=====
ui/spice-core::migration_state_notifier should handle MIG_STATE_ACTIVE
(for migration start) and MIG_STATE_ERROR/CANCELLED by calling
spice_server_migrate_start and spice_server_migrate_end,
respectively.
These callbacks are currently declared in spice-experimental.h.
Contrary to Christophe, I don't think we should be afraid of using
those functions, which have not been supported or used for quite some
time, afaik.
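The dispatch described above could be sketched roughly as follows. Only
the spice_server_migrate_start/end names come from spice-experimental.h;
the MIG_STATE_* values, the counters, and the stub bodies are
illustrative stand-ins for qemu's real plumbing:

```c
#include <assert.h>

/* Illustrative migration states; qemu defines the real MIG_STATE_* values. */
enum mig_state {
    MIG_STATE_ACTIVE,
    MIG_STATE_COMPLETED,
    MIG_STATE_ERROR,
    MIG_STATE_CANCELLED
};

/* Stand-ins for the spice-experimental.h entry points. */
static int migrate_started, migrate_ended_ok, migrate_ended_aborted;
static void spice_server_migrate_start(void) { migrate_started++; }
static void spice_server_migrate_end(int completed)
{
    if (completed) migrate_ended_ok++; else migrate_ended_aborted++;
}

/* Proposed dispatch for ui/spice-core::migration_state_notifier. */
static void migration_state_notifier(enum mig_state state)
{
    switch (state) {
    case MIG_STATE_ACTIVE:      /* migration has started */
        spice_server_migrate_start();
        break;
    case MIG_STATE_COMPLETED:   /* guest now runs on the target */
        spice_server_migrate_end(1);
        break;
    case MIG_STATE_ERROR:
    case MIG_STATE_CANCELLED:   /* migration aborted, stay on the source */
        spice_server_migrate_end(0);
        break;
    }
}
```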
spice-server
=============
(A) Migration source side
* reds::spice_server_migrate_start:
send SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
We can't use SPICE_MSG_MAIN_MIGRATE_BEGIN since it doesn't
include the certificate information we need. But we can change it
to be identical to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST.
For the same reason, I guess we can break messages, as long as proper
version/caps checks are performed before client & server receive them.
* reds::spice_server_migrate_end(completed)
- if (completed) => send SPICE_MSG_MIGRATE (flags=0) to all
connected channels (via Channel->migrate).
- if (!completed) => send SPICE_MSG_MAIN_MIGRATE_CANCEL
flags=0 == No NEED_FLUSH or DATA_TRANSFER. ok
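The source-side completion logic in (A) amounts to something like the
sketch below. The message names mirror the ones in the mail, but the
numeric ids, the channel struct, and the "send" representation are made
up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative ids; the real ones live in spice-protocol. */
enum { MSG_NONE = -1, MSG_MIGRATE = 0, MSG_MAIN_MIGRATE_CANCEL = 1 };
enum { MIGRATE_FLAGS_NONE = 0 }; /* no NEED_FLUSH, no DATA_TRANSFER */

struct channel {
    int last_msg;   /* last message "sent" on this channel */
    int last_flags;
};

/* Sketch of reds::spice_server_migrate_end(completed): on success,
 * send SPICE_MSG_MIGRATE (flags=0) to every connected channel (via
 * Channel->migrate); on abort, send the cancel on the main channel. */
static void migrate_end(struct channel *channels, size_t n_channels,
                        struct channel *main_channel, int completed)
{
    if (completed) {
        for (size_t i = 0; i < n_channels; i++) {
            channels[i].last_msg = MSG_MIGRATE;
            channels[i].last_flags = MIGRATE_FLAGS_NONE;
        }
    } else {
        main_channel->last_msg = MSG_MAIN_MIGRATE_CANCEL;
    }
}
```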
(B) Migration target side
reds identifies it is a migration target when the client connects with
a connection id != 0.
When linking to migrated channels, a special treatment is required
(and not the support that is currently coded, since that is for
seamless migration).
For example:
- For the main channel, (1) a network test is not required, and (2)
there is no need for SPICE_MSG_MAIN_INIT, but rather
SPICE_MSG_MAIN_MULTI_MEDIA_TIME and SPICE_MSG_MAIN_MOUSE_MODE. This way
we will also save all the agent work we perform when initializing the
main channel in the client.
- For the display channel, we mustn't call
display_channel_wait_for_init immediately upon link, but we should
expect the init message to arrive later (for setting cache and
dictionary sizes).
- For the playback channel: we still need to send the current playback
status, as opposed to seamless migration.
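The target-side detection in (B) boils down to checking the connection
id in the link message. The struct here is an illustrative stand-in,
not the real spice-protocol link header:

```c
#include <assert.h>

/* Illustrative link-message header; the real layout is defined in
 * spice-protocol. */
struct link_message {
    unsigned int connection_id; /* 0 on a fresh connect; the source's
                                   session id when migrating */
};

/* reds-side check: a nonzero connection id marks a migration target,
 * so the fresh-connect path (network test, SPICE_MSG_MAIN_INIT,
 * display_channel_wait_for_init) should be skipped. */
static int is_migration_target(const struct link_message *link)
{
    return link->connection_id != 0;
}
```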
It looks to me like you would like to revive the seamless migration.
see the above explanation about seamless.
Wouldn't it be simpler to just leave connection id == 0 for now, and
do regular connection? Wouldn't that also work like "switch-host"?
Since we want to execute the linking to the target before the logical
target spice session starts, it is problematic not to let the target
server know it is a migration target: one of the problems is that upon
connection, the server display channel expects an INIT message from the
client. If a timeout occurs, it disconnects. This is not desirable when
this is only the initial connection (the one triggered by
migrate_start), and the actual communication will start only when
migration ends.
Besides, it will also save us time and spare us other artifacts that
are a result of executing a fresh connection, e.g., we can avoid the
network bandwidth test. Though I guess the bandwidth can change from
one host to another... But performing the network test upon linking is
more complicated since the client doesn't listen yet on the
socket... Maybe we can neglect this for now and assume the same
bandwidth for the new host?
Spice client
============
(A) SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
client connects to the target, but still stays connected to the
source host. It doesn't listen to the target sockets.
The link message to the target contains the connection_id of the
connection to the source (this allows the target server to identify
itself as a migration target).
For this part we can use most of the code in the class Migrate in
red_client.cpp
(B) SPICE_MSG_MIGRATE
We can use the code in red_channel::handle_migrate to switch the
channels and start listening to the target.
The difference is that we should implement the virtual method
RedChannel::on_migrate differently.
(1) Each channel must reset all the dynamic data that depends on
the server. For example: the display channel
needs to destroy all the surfaces and reset the caches and
dictionary; the playback and record channels need to stop
the current session, if there is an active one, etc.
(2) Each channel should send to the server the initialization
information it normally sends in RedChannel::on_connect.
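Steps (1) and (2) of (B) could be dispatched as below. The client
itself is C++ and RedChannel::on_migrate is the real virtual method;
this C sketch uses function pointers as stand-ins for the virtuals, and
all names besides on_migrate are hypothetical:

```c
#include <assert.h>

/* Stand-in for the client's RedChannel; the function pointers play
 * the role of the C++ virtual methods. */
struct red_channel {
    void (*reset_dynamic_state)(struct red_channel *); /* step (1) */
    void (*send_init_info)(struct red_channel *);      /* step (2) */
};

/* Sketch of the proposed RedChannel::on_migrate: first drop all
 * server-dependent state (surfaces, caches, glz dictionary, active
 * playback/record sessions), then re-send what on_connect normally
 * sends. */
static void on_migrate(struct red_channel *channel)
{
    channel->reset_dynamic_state(channel);
    channel->send_init_info(channel);
}
```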
(C) SPICE_MSG_MAIN_MIGRATE_CANCEL
disconnects all the new channels. This code is already implemented
in spice-client.
spice-protocol(?)/Backward compatibility
=========================================
should we bump the spice protocol version, or use capabilities? (if we
change the SPICE_MSG_MAIN_MIGRATE_BEGIN structure, there is no
question).
New Spice-Server with old client will send only
SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST, and only when migration completes
(same as today).
New client with old Spice-server will disconnect the source and will
connect the target upon receiving SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST
(same as today).
Preferably, I would introduce SPICE_MSG_MAIN_MIGRATE_BEGIN2 etc. and
deprecate the older messages. From what I understand, we now prefer
using caps rather than bumping the protocol version.
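A caps-based check would be the usual bitmask test over the capability
words exchanged at link time. The capability name and bit index below
are purely hypothetical; only the fallback behavior follows the mail:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical capability bit for the new migration messages. */
#define MAIN_CAP_MIGRATE_SWITCH_HOST2 4

/* Generic test over the 32-bit capability words; the server would send
 * the new messages only if the client announced the bit, and fall back
 * to SPICE_MSG_MAIN_MIGRATE_SWITCH_HOST (current behavior) otherwise. */
static int caps_has(const unsigned int *caps, size_t num_words, int cap)
{
    size_t word = (size_t)cap / 32;
    return word < num_words && (caps[word] & (1u << (cap % 32))) != 0;
}
```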
o.k. But when do we actually change the protocol version?
cheers
_______________________________________________
Spice-devel mailing list
Spice-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/spice-devel
--
David Jaša