On (Tue) 16 Jun 2015 [11:26:14], Dr. David Alan Gilbert (git) wrote:
> From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
>
> Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>

Reviewed-by: Amit Shah <amit.s...@redhat.com>

A few minor comments:

> ---
>  docs/migration.txt | 167 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 167 insertions(+)
>
> diff --git a/docs/migration.txt b/docs/migration.txt
> index f6df4be..b4b93d1 100644
> --- a/docs/migration.txt
> +++ b/docs/migration.txt
> @@ -291,3 +291,170 @@ save/send this state when we are in the middle of a pio operation
>  (that is what ide_drive_pio_state_needed() checks). If DRQ_STAT is
>  not enabled, the values on that fields are garbage and don't need to
>  be sent.
> +
> += Return path =
> +
> +In most migration scenarios there is only a single data path that runs
> +from the source VM to the destination, typically along a single fd (although
> +possibly with another fd or similar for some fast way of throwing pages across).
> +
> +However, some uses need two way communication; in particular the Postcopy destination
> +needs to be able to request pages on demand from the source.
> +
> +For these scenarios there is a 'return path' from the destination to the source;
> +qemu_file_get_return_path(QEMUFile* fwdpath) gives the QEMUFile* for the return
> +path.
> +
> +  Source side
> +     Forward path - written by migration thread
> +     Return path  - opened by main thread, read by return-path thread
> +
> +  Destination side
> +     Forward path - read by main thread
> +     Return path  - opened by main thread, written by main thread AND postcopy
> +                    thread (protected by rp_mutex)
> +
> += Postcopy =
> +'Postcopy' migration is a way to deal with migrations that refuse to converge; (or take too long to converge)
> +its plus side is that there is an upper bound on the amount of migration traffic
> +and time it takes, the down side is that during the postcopy phase, a failure of
> +*either* side or the network connection causes the guest to be lost.
> +
> +In postcopy the destination CPUs are started before all the memory has been
> +transferred, and accesses to pages that are yet to be transferred cause
> +a fault that's translated by QEMU into a request to the source QEMU.
> +
> +Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
> +doesn't finish in a given time the switch is made to postcopy.
> +
> +=== Enabling postcopy ===
> +
> +To enable postcopy (prior to the start of migration):

How about this instead:

"To enable postcopy, issue this command on the monitor prior to the
start of migration:"

Otherwise, it is ambiguous whether there is some way to enable this
after a precopy migration has started.

> +
> +migrate_set_capability x-postcopy-ram on
> +
> +The migration will still start in precopy mode, however issuing:

"A future migration will then start in precopy mode. However,
issuing:"

?

> +
> +migrate_start_postcopy
> +
> +will now cause the transition from precopy to postcopy.
> +It can be issued immediately after migration is started or any
> +time later on. Issuing it after the end of a migration is harmless.
> +
> +=== Postcopy device transfer ===
> +
> +Loading of device data may cause the device emulation to access guest RAM
> +that may trigger faults that have to be resolved by the source, as such
> +the migration stream has to be able to respond with page data *during* the
> +device load, and hence the device data has to be read from the stream completely
> +before the device load begins to free the stream up. This is achieved by
> +'packaging' the device data into a blob that's read in one go.
> +
> +Source behaviour
> +
> +Until postcopy is entered the migration stream is identical to normal
> +precopy, except for the addition of a 'postcopy advise' command at
> +the beginning, to tell the destination that postcopy might happen.
> +When postcopy starts the source sends the page discard data and then
> +forms the 'package' containing:
> +
> +   Command: 'postcopy listen'
> +   The device state
> +      A series of sections, identical to the precopy streams device state stream
> +      containing everything except postcopiable devices (i.e. RAM)
> +   Command: 'postcopy run'
> +
> +The 'package' is sent as the data part of a Command: 'CMD_PACKAGED', and the
> +contents are formatted in the same way as the main migration stream.
> +
> +Destination behaviour
> +
> +Initially the destination looks the same as precopy, with a single thread
> +reading the migration stream; the 'postcopy advise' and 'discard' commands
> +are processed to change the way RAM is managed, but don't affect the stream
> +processing.
> +
> +------------------------------------------------------------------------------
> +                        1      2     3     4 5                      6  7
> +main -----DISCARD-CMD_PACKAGED ( LISTEN  DEVICE     DEVICE  DEVICE RUN )
> +thread                             |       |
> +                                   |     (page request)
> +                                   |        \___
> +                                   v            \
> +listen thread:                     --- page -- page -- page -- page -- page --
> +
> +                                   a   b        c
> +------------------------------------------------------------------------------
> +
> +On receipt of CMD_PACKAGED (1)
> +   All the data associated with the package - the ( ... ) section in the
> +diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
> +recurses into qemu_loadvm_state_main to process the contents of the package (2)
> +which contains commands (3,6) and devices (4...)
> +
> +On receipt of 'postcopy listen' - 3 -(i.e. the 1st command in the package)
> +a new thread (a) is started that takes over servicing the migration stream,
> +while the main thread carries on loading the package. It loads normal
> +background page data (b) but if during a device load a fault happens (5) the
> +returned page (c) is loaded by the listen thread allowing the main threads
> +device load to carry on.
> +
> +The last thing in the CMD_PACKAGED is a 'RUN' command (6) letting the destination
> +CPUs start running.
> +At the end of the CMD_PACKAGED (7) the main thread returns to normal running behaviour
> +and is no longer used by migration, while the listen thread carries
> +on servicing page data until the end of migration.
> +
> +=== Postcopy states ===
> +
> +Postcopy moves through a series of states (see postcopy_state) from
> +ADVISE->LISTEN->RUNNING->END
> +
> +  Advise: Set at the start of migration if postcopy is enabled, even
> +          if it hasn't had the start command; here the destination
> +          checks that its OS has the support needed for postcopy, and performs
> +          setup to ensure the RAM mappings are suitable for later postcopy.
> +          (Triggered by reception of POSTCOPY_ADVISE command)

Adding:

"This gives the destination a chance to fail early if postcopy is not
possible."

?

> +
> +  Listen: The first command in the package, POSTCOPY_LISTEN, switches
> +          the destination state to Listen, and starts a new thread
> +          (the 'listen thread') which takes over the job of receiving
> +          pages off the migration stream, while the main thread carries
> +          on processing the blob. With this thread able to process page
> +          reception, the destination now 'sensitises' the RAM to detect
> +          any access to missing pages (on Linux using the 'userfault'
> +          system).
> +
> +  Running: POSTCOPY_RUN causes the destination to synchronise all
> +          state and start the CPUs and IO devices running. The main
> +          thread now finishes processing the migration package and
> +          now carries on as it would for normal precopy migration
> +          (although it can't do the cleanup it would do as it
> +          finishes a normal migration).

indentation went off a bit

		Amit
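
For readers following the "Postcopy states" section quoted above, the
ADVISE->LISTEN->RUNNING->END progression could be modelled roughly as
below. This is only a minimal sketch and not code from the series: the
type and function names (PcState, PcCmd, pc_apply), the initial
PC_STATE_NONE state, and the PC_CMD_END event are all hypothetical. Only
the order of the states and the commands that trigger Advise, Listen and
Running are taken from the quoted text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names; only the state order comes from the quoted doc. */
typedef enum {
    PC_STATE_NONE,      /* assumption: nothing received yet */
    PC_STATE_ADVISE,
    PC_STATE_LISTEN,
    PC_STATE_RUNNING,
    PC_STATE_END
} PcState;

typedef enum {
    PC_CMD_ADVISE,      /* 'postcopy advise' at the start of the stream */
    PC_CMD_LISTEN,      /* first command inside the package */
    PC_CMD_RUN,         /* last command inside the package */
    PC_CMD_END          /* assumption: some end-of-migration event */
} PcCmd;

/*
 * Apply one command to the current state.  Returns true and advances
 * *state when the command is valid in that state; returns false and
 * leaves *state untouched otherwise.
 */
bool pc_apply(PcState *state, PcCmd cmd)
{
    assert(state != NULL);

    switch (cmd) {
    case PC_CMD_ADVISE:
        if (*state != PC_STATE_NONE) {
            return false;
        }
        *state = PC_STATE_ADVISE;
        return true;
    case PC_CMD_LISTEN:
        if (*state != PC_STATE_ADVISE) {
            return false;
        }
        *state = PC_STATE_LISTEN;
        return true;
    case PC_CMD_RUN:
        if (*state != PC_STATE_LISTEN) {
            return false;
        }
        *state = PC_STATE_RUNNING;
        return true;
    case PC_CMD_END:
        if (*state != PC_STATE_RUNNING) {
            return false;
        }
        *state = PC_STATE_END;
        return true;
    }
    return false;
}
```

In this sketch a 'postcopy run' that arrives before 'postcopy listen' is
rejected rather than silently accepted, which mirrors the point about
giving the destination a chance to fail early.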