On Thu, Feb 14, 2013 at 02:29:09PM -0500, Michael R. Hines wrote:
> Orit (and Anthony if you're not busy),
>
> I forgot to respond to this very important comment:
>
> On 02/13/2013 03:46 AM, Orit Wasserman wrote:
> > Are you still using TCP for transferring device state? If so, you can
> > call the TCP functions from the migration RDMA code as a first step,
> > but I would prefer it to use RDMA too.
>
> This is the crux of the problem of using RDMA for migration: currently,
> all of the QEMU migration control logic and device state goes through the
> QEMUFile implementation. RDMA, however, is by nature a zero-copy solution
> and is incompatible with QEMUFile.

RDMA might be overkill for device state, but you could reuse the same
connection, using send operations.
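To make the send-based idea concrete, here is a rough sketch of posting the
serialized device-state buffer as a two-sided send on the queue pair already
used for the RAM transfer. This is only an illustration: the function name,
the assumption that the qp/pd from the RAM path are at hand, and the omitted
completion handling are mine, not something from the submitted patch.

#include <infiniband/verbs.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Sketch: carry device state over the existing RDMA connection with a
 * two-sided SEND instead of opening a separate TCP socket.  The peer is
 * assumed to have posted a matching receive buffer.
 */
static int send_device_state(struct ibv_pd *pd, struct ibv_qp *qp,
                             void *state_buf, size_t len)
{
    /* Only this one staging buffer has to be registered (pinned), not the
     * internal state of every migrated device. */
    struct ibv_mr *mr = ibv_reg_mr(pd, state_buf, len, 0);
    if (!mr) {
        return -1;
    }

    struct ibv_sge sge = {
        .addr   = (uintptr_t)state_buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };
    struct ibv_send_wr wr = {
        .wr_id      = 1,
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_SEND,        /* send, not RDMA write: no rkey needed */
        .send_flags = IBV_SEND_SIGNALED,  /* request a completion to poll for */
    };
    struct ibv_send_wr *bad_wr = NULL;

    int ret = ibv_post_send(qp, &wr, &bad_wr);
    /* A real implementation would poll the CQ and deregister the MR here. */
    return ret;
}

The point is that the device-state path stays on the same connection and only
the small staging buffer ever gets pinned.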
> Using RDMA for transferring device state is not recommended: setting up an
> RDMA transfer requires registering the memory locations on both sides with
> the RDMA hardware. This is not ideal, because it would require pinning the
> memory holding the device state and then issuing the RDMA transfer for
> *each* type of device, which would require changing the control path of
> every type of migrated device in QEMU.
>
> Currently, the patch we submitted bypasses QEMUFile. It just issues the
> RDMA transfer for the memory that was dirtied and then continues along
> with the rest of the migration call path normally.
>
> In an ideal world, we would prefer a hybrid approach, something like:
>
> Begin Migration Iteration Round:
> 1. stop VCPU
> 2. start iterative pass over memory
> 3. send control signals (if any) / device state to QEMUFile
> 4. when a dirty memory page is found, do:
>    a) instruct the QEMUFile to block
>    b) issue the RDMA transfer
>    c) instruct the QEMUFile to unblock
> 5. resume VCPU
>
> This would require a "smarter" QEMUFile implementation that understands
> when to block and for how long.
>
> Comments?
>
> - Michael
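A "smarter" QEMUFile along those lines could start as a thin wrapper that
just tracks whether zero-copy writes are in flight. Rough sketch below; the
names (RDMAFile, rdma_file_block, rdma_write_page, rdma_file_unblock) are
invented for illustration and are not the existing QEMUFile API:

#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct RDMAFile {
    void *qemu_file;      /* underlying QEMUFile for control + device state */
    void *rdma_context;   /* queue pair, protection domain, registered RAM  */
    bool  blocked;        /* true while RDMA writes are outstanding         */
} RDMAFile;

/* Step 4a: flush anything buffered in the stream and stop feeding it, so
 * control bytes cannot interleave with the zero-copy page writes. */
static void rdma_file_block(RDMAFile *f)
{
    /* flush buffered control/device-state bytes here */
    f->blocked = true;
}

/* Step 4b: issue the zero-copy transfer for one dirty page. */
static int rdma_write_page(RDMAFile *f, uint64_t guest_addr, size_t len)
{
    if (!f->blocked) {
        return -1;   /* refuse to mix RDMA writes with stream traffic */
    }
    /* post an RDMA WRITE work request covering [guest_addr, guest_addr+len) */
    return 0;
}

/* Step 4c: wait for all completions, then let the byte stream resume. */
static void rdma_file_unblock(RDMAFile *f)
{
    /* poll the completion queue until every outstanding write has completed */
    f->blocked = false;
}

The property that matters is that the stream is quiescent while step 4b is
outstanding, so the destination never sees control data racing with the page
writes.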