On Sun, Jun 10, 2012 at 04:32:54PM +0200, Hans de Goede wrote:
> Hi,
>
> On 06/10/2012 11:05 AM, Yonit Halperin wrote:
> > Hi,
> >
> > As the qemu team rejected integrating spice connection migration into the
> > qemu migration process, we are left with a solution that involves libvirt,
> > and passing data from the src to the target via the client. Before I
> > continue with the implementation I'd like to hear your comments on the
> > details.
> >
> > Here is a reminder of the problems we face:
> > (1) Loss of data: we would like the client to continue the connection from
> > the same point at which the vm was stopped. For example, we want any
> > usb/smartcard devices to stay attached, and we don't want to lose any data
> > that was sent from the client to the vm, or partial data that was read
> > from a device but hadn't reached its destination before migration.
> >
> > (2) The qemu process on the src side can be closed by libvirt as soon as
> > the migration state changes to "completed". Thus, we can't reliably pass
> > any data between the src server and the client after migration has
> > completed.
> >
> > These problems can be addressed by the following:
> > Add a qmp event for spice migration completion. libvirt will need to wait
> > not only for qemu migration completion, but also for this qmp event,
> > before it closes the src qemu.
> > Spice needs to know whether libvirt supports this, in order to decide
> > which migration approach to take (semi or seamless). For this we will add
> > a new parameter to the spice configuration on the qemu command line
> > (e.g., seamless-migration=on); if it is set by libvirt, we can assume
> > libvirt will wait for spice migration.
> > After qemu migration is completed, the src server will pass migration data
> > to the target via the client(s). When the clients disconnect from the src
> > and switch completely to the target, we send the new qmp event.
> >
> >
> > migration data transfer
> > =======================
> > Our historical MSG_MIGRATE pathway provides support for sending all
> > pending outgoing data from the client to the server, and vice versa,
> > before we fill in the migration data. Each channel defines its own
> > migration data.
> > (1) MSG_MIGRATE is the last message sent from the src server channel to
> > the client before MIGRATE_DATA.
> > (2) If the message's flags have MIGRATE_NEED_FLUSH, the client writes all
> > its outgoing data and then sends FLUSH to the server.
> > (3) The client channel then waits for the MIGRATE_DATA message, and does
> > nothing besides that.
> > (4) When it receives the message, it switches to the target completely and
> > passes it the migration data.
> >
> > (1) server channel --> MSG_MIGRATE ...in-flight messages... --> client
> > (2) client channel --> MSGC_FLUSH_MARK ...in-flight messages... --> server
> > (3) server channel --> MSG_MIGRATE_DATA --> client
> > (4) client channel --> MSGC_MIGRATE_DATA --> target server
> >
> > Obligatory migration data:
> > --------------------------
> > (1) agent/spicevmc/smartcard write buffer, i.e., data that reached the
> > server after savevm and thus was not written to the device.
> > Currently, spicevmc and smartcard do not have a write buffer, but since
> > data can reach the server after savevm, they should have one. I'm not
> > sure they should attempt to write to the guest even today if it is
> > stopped. The agent code can also write to the guest even if it is
> > stopped; I think that is a bug.
> > (2) agent/smartcard partial data that has been read from the device but
> > wasn't sent to the client because its reading hasn't completed.
> > Currently we don't have such data for spicevmc, because we push any
> > amount of data we read to the client. In the future we might want to
> > control the rate and the size of the data we send/receive, and then we
> > will have an outgoing buffer.
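To make (1) and (2) above a bit more concrete, the per-channel migration data
could look roughly like the sketch below (the struct and field names are
invented, this is not existing spice-server code); it would travel as the
payload of MSG_MIGRATE_DATA from the src and be echoed back to the target in
MSGC_MIGRATE_DATA:

    #include <stdint.h>

    /* Sketch only: invented layout for the spicevmc/agent/smartcard channel
     * migration data.  A fixed header is followed by the raw bytes of the
     * two buffers. */
    typedef struct SpiceVmcMigrateData {
        uint32_t version;             /* lets src/target agree on the layout */

        /* (1) write buffer: data that reached the server after savevm and
         *     was therefore never written to the guest device */
        uint32_t write_buf_size;
        uint32_t write_buf_offset;    /* offset of the bytes within the blob */

        /* (2) partial read: data already read from the device but not yet
         *     pushed to the client because the read hasn't completed */
        uint32_t partial_read_size;
        uint32_t partial_read_offset;
    } SpiceVmcMigrateData;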
> I'm still not a big fan of the concept of server data going through the
> client; it means the server will need to seriously sanity-check what it
> receives, to avoid opening it up to new attacks.
>
> I'm wondering why not do the following:
>
> 1) spicevmc device gets a savevm call: tell spice-server to send a message
>    to the client telling it to stop sending more data to *this* server.
> 2) client sends an ack in response to the stop-sending-data message.
> 3) server waits for the ack.
> 4) savevm continues only after the ack, which means all data which was in
>    flight has been received.
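A minimal sketch of what that ordering could look like on the server side
(types and helper names are invented; whether qemu would accept blocking like
this around savevm is a separate question, see below):

    /* Sketch only: invented types and helpers illustrating the proposed
     * stop/ack handshake.  The point is the ordering: savevm does not
     * proceed until the client has acked that it stopped sending data,
     * so nothing arrives afterwards that would need a write buffer. */
    typedef struct ChannelState {
        int stop_acked;                   /* set when the client's ack arrives */
    } ChannelState;

    void send_stop_sending_data(ChannelState *channel);  /* invented helper */
    void process_client_input(ChannelState *channel);    /* invented helper */

    void spicevmc_pre_savevm(ChannelState *channel)
    {
        /* 1) ask the client to stop sending data for this channel */
        send_stop_sending_data(channel);

        /* 2) + 3) keep handling client input until the ack arrives; everything
         *         that was already in flight gets written to the guest */
        while (!channel->stop_acked) {
            process_client_input(channel);
        }

        /* 4) return; savevm can continue with no pending client data */
    }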
Have you seen the qemu-devel thread Yonit referred to in the beginning? Let me
quote:

"Spice is *not* getting a hook in migration where it gets to add arbitrary
amounts of downtime to the migration traffic. That's a terrible idea."

It didn't continue any better.
http://lists.nongnu.org/archive/html/qemu-devel/2012-06/msg00559.html

>
> No more reason for obligatory data 1.
> And, as you already point out, 2 is not an issue atm.
>
> So there is no more obligatory reason to have server *state* pass through
> the client, which I still believe is just asking for security
> vulnerabilities.
>
> > Optional migration data:
> > ------------------------
> > - primary surface lossy region(*), or its extents
> > If we don't send it to the client, and jpeg is enabled, we will need to
> > resend the primary surface after migration, or set the lossy region to the
> > whole surface, and then each non-opaque rendering operation that involves
> > the surface will require resending parts of it losslessly.
>
> So this needs to be sent to the client, but not back to the server?
>
> > - list of off-screen surface ids that have been sent to the client, and
> > their lossy regions.
> > By keeping this data we will avoid resending, on demand, surfaces that
> > already exist on the client side.
>
> The client already knows which off-screen surface ids it has received, so it
> can just send these to the new server without having to receive them from
> the old one first.
>
> > - bitmap cache - list of bitmap ids + some internal cache information for
> > each bitmap.
>
> idem.
>
> > - active video streams: ids, destination box, etc.
>
> idem.
>
> > - session bandwidth (low/high): we don't want to perform the main channel
> > net test after the migration is completed, because it can take time (we
> > can't do it during the migration because the main loop is not available).
> > So we assume the bandwidth classification will stay the same. Once we have
> > dynamic monitoring of bandwidth, we can drop this.
>
> This I can live with being sent through the client, but then not as opaque
> data; have a special command for it instead. This could be useful in
> non-migration cases too, if the client somehow already knows the channel
> characteristics.
>
> > Though the above data is optional, part of it is important for avoiding a
> > slow start of the connection to the target (e.g., sending the primary
> > lossy region, in order to avoid resending parts of it).
> >
> > In addition, if we wish to keep the client channels' state the same, and
> > not require them (1) to send initialization data to the server, and (2) to
> > reset part of their state, we should also migrate other server state
> > details, such as:
> > - the serial of the last message sent from the display channel
> > - the main channel agent data tokens state
> > - the size of the images cache (this is usually set by the client upon a
> > new connection).
> > Including such information in the migration data will allow us to keep the
> > migration logic in the server. The alternative would be for the client to
> > reset part of its state after migration, either on its own initiative or
> > by specific messages sent from the server (which may require a new set of
> > messages).
> >
> > (*) lossy region = the region on the surface that contains bitmaps that
> > were compressed using jpeg
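To illustrate roughly what the optional data above would amount to, a possible
display/main channel layout could look like this (again purely invented names;
the variable-size parts such as the surface, cache and stream lists would
follow the fixed header as arrays):

    #include <stdint.h>

    /* Sketch only: invented layout for the optional display/main channel
     * migration data listed above. */
    typedef struct DisplayChannelMigrateData {
        uint32_t version;

        /* extents of the primary surface lossy (jpeg-compressed) region */
        int32_t  lossy_left, lossy_top, lossy_right, lossy_bottom;

        uint32_t surfaces_count;      /* off-screen surface ids + lossy regions */
        uint32_t bitmap_cache_count;  /* bitmap ids + internal cache info */
        uint32_t streams_count;       /* active streams: id, destination box */

        uint64_t last_message_serial; /* serial of the last display message */
        uint32_t agent_tokens;        /* main channel agent data tokens state */
        uint32_t bandwidth_class;     /* low/high, from the src net test */
        uint32_t images_cache_size;   /* normally set by the client on connect */
    } DisplayChannelMigrateData;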
> > Transparency of migration data:
> > -------------------------------
> > I think that the migration data shouldn't be part of the spice protocol,
> > and that it should be opaque to the client, for the following reasons:
>
> As said before, I think that migration data should not be sent through the
> spice protocol *at all*!
>
> > (a) The client is only a mediator, and it has nothing to do with the data
> > content.
> > (b) If the migration data of each channel is part of the spice protocol,
> > every minor change to the migration data of one channel will require a new
> > message and capability, and will make supporting backward compatibility in
> > migration more cumbersome, as it will involve the client as well.
> > Moreover, if the client supports only migration data of ver x, and the src
> > and target both support ver x+1, we will suffer from data loss.
> > (c) As for security issues, I don't think this should raise a problem,
> > since the client is trusted by both the src and the target.
>
> The client is trusted to access the *vm*, not the *host*, and this allows
> attacks on spice-server, which is running on the *host*.
>
> Regards,
>
> Hans

_______________________________________________
Spice-devel mailing list
Spice-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/spice-devel