On Wed, 27 Feb 2019 01:35:01 +0530 Kirti Wankhede <kwankh...@nvidia.com> wrote:
> Alex,
> 
> On 2/22/2019 3:53 AM, Alex Williamson wrote:
> > On Wed, 20 Feb 2019 02:53:16 +0530
> > Kirti Wankhede <kwankh...@nvidia.com> wrote:
> > 
> >> - Defined MIGRATION region type and sub-type.
> >> - Used 2 bits to define VFIO device states.
> >>   Bit 0 => 0/1 => _STOPPED/_RUNNING
> >>   Bit 1 => 0/1 => _RESUMING/_SAVING
> >>   Combination of these bits defines VFIO device's state during migration
> >>   _RUNNING => Normal VFIO device running state.
> >>   _STOPPED => VFIO device stopped.
> >>   _SAVING | _RUNNING => vCPUs are running, VFIO device is running but
> >>                         starts saving state of device, i.e. pre-copy state
> >>   _SAVING | _STOPPED => vCPUs are stopped, VFIO device should be stopped,
> >>                         and save device state, i.e. stop-and-copy state
> >>   _RESUMING => VFIO device resuming state.
> > 
> > Shouldn't we have a non-_RESUMING/_SAVING run state? If these are
> > indicating direction of flow, maybe we need two bits:
> > 
> > 00b - None, normal runtime
> > 01b - Saving
> > 10b - Resuming
> > 11b - Invalid/reserved (maybe a Failed state indicator)
> > 
> 
> There have to be 2 more states:
> _SAVING | _RUNNING => SAVING while device is RUNNING - pre-copy phase
> _SAVING | _STOPPED => SAVING while device is STOPPED - stop-and-copy phase
> 
> So the 2 bits used in this patch are:
> 00b - _RESUMING
> 01b - _RUNNING - Normal Running
> 10b - _SAVING | _STOPPED - stop-and-copy phase
> 11b - _SAVING | _RUNNING - pre-copy phase

Not sure if the "use two bits" approach is the most elegant here --
rightmost bit unset can mean either _RESUMING or _STOPPED... What about
simply making these four distinct states?

#define _RESUMING       0
#define _RUNNING        1 // or the other way around?
#define _SAVING_STOPPED 2
#define _SAVING_RUNNING 3

You could still check whether you're currently SAVING by ANDing with 10b.

(...)

> >> + *
> >> + * data_offset: (read/write)
> >> + *     User application should read data_offset in the migration region,
> >> + *     from where the user application should read data during _SAVING
> >> + *     state.
> >> + *     User application would write data_offset in the migration region,
> >> + *     from where the user application had written data during _RESUMING
> >> + *     state.
> > 
> > Why wouldn't data_offset simply be fixed? Do we really want to support
> > the user writing arbitrary offsets in the migration region? Each
> > chunk can simply start at the kernel provided data_offset.
> > 
> 
> data_offset differs based on whether the region is trapped or mmapped. In
> case the region is mmapped, QEMU writes data to the mmapped region and
> then writes the data_offset field, indicating data is present in the
> mmapped buffer. Read below for more detailed steps.
> 
> >> + *
> >> + * data_size: (write only)
> >> + *     User application should write the size of the data copied in the
> >> + *     migration region during _RESUMING state.
> > 
> > How much does the user read on _SAVING then?
> > 
> 
> pending_bytes tells how many bytes should be read on _SAVING.
> 
> >> + *
> >> + * start_pfn: (write only)
> >> + *     Start address pfn to get bitmap of dirty pages from vendor driver
> >> + *     during _SAVING state.
> >> + *
> >> + * page_size: (write only)
> >> + *     User application should write the page_size of pfn.
> >> + *
> >> + * total_pfns: (write only)
> >> + *     Total pfn count from start_pfn for which dirty bitmap is requested.
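
To check my understanding of the flow, here's a rough sketch of what the
userspace side of one _SAVING iteration and one dirty bitmap request
would look like, assuming a trapped (non-mmapped) region. The MIG_*
field offsets and the helper names are made up for illustration (the
real layout would come from struct vfio_device_migration_info in this
patch), and error handling is omitted for brevity:

#include <stdint.h>
#include <unistd.h>

/* Hypothetical offsets of the fields within the migration region;
 * the actual values would be defined by the proposed uAPI struct. */
#define MIG_PENDING_BYTES 0x08
#define MIG_DATA_OFFSET   0x10
#define MIG_START_PFN     0x20
#define MIG_PAGE_SIZE     0x28
#define MIG_TOTAL_PFNS    0x30

/* One _SAVING iteration: ask how much data the vendor driver has
 * pending, where in the region it starts, and read it from there. */
static ssize_t save_one_chunk(int device_fd, uint64_t region_base,
                              void *buf, size_t buf_size)
{
    uint64_t pending = 0, data_offset = 0;
    size_t chunk;

    pread(device_fd, &pending, sizeof(pending),
          region_base + MIG_PENDING_BYTES);
    if (!pending)
        return 0;

    pread(device_fd, &data_offset, sizeof(data_offset),
          region_base + MIG_DATA_OFFSET);

    chunk = pending < buf_size ? pending : buf_size;
    return pread(device_fd, buf, chunk, region_base + data_offset);
}

/* Requesting the dirty bitmap: all three fields are write only, so
 * presumably writing total_pfns is what triggers the vendor driver. */
static void request_dirty_bitmap(int device_fd, uint64_t region_base,
                                 uint64_t start_pfn, uint64_t page_size,
                                 uint64_t total_pfns)
{
    pwrite(device_fd, &start_pfn, sizeof(start_pfn),
           region_base + MIG_START_PFN);
    pwrite(device_fd, &page_size, sizeof(page_size),
           region_base + MIG_PAGE_SIZE);
    pwrite(device_fd, &total_pfns, sizeof(total_pfns),
           region_base + MIG_TOTAL_PFNS);
}

If that matches what you have in mind, it might be worth spelling the
sequence out explicitly in the comment block.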
> > 
> > So effectively the area that begins at data_offset is dual purpose
> > (triple purpose vs Yan's, config, memory, and dirty bitmap)

I'm wondering whether splitting it up would combine the best of the two
approaches: Just having an optional dirty bitmap region would make it
more obvious if dirty page tracking is not available.
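
For instance, the type/subtype region capability that device-specific
regions already use would let userspace probe for it cleanly. A sketch
of what that could look like, where the MIGRATION type and DIRTY_BITMAP
subtype values are invented for illustration (the capability chain walk
itself is existing uAPI):

#include <stdbool.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/vfio.h>

#define REGION_TYPE_MIGRATION       3 /* hypothetical */
#define REGION_SUBTYPE_DIRTY_BITMAP 2 /* hypothetical */

static bool region_is_dirty_bitmap(int device_fd, __u32 index)
{
    char buf[4096];
    struct vfio_region_info *info = (void *)buf;
    struct vfio_info_cap_header *hdr;

    memset(buf, 0, sizeof(buf));
    info->argsz = sizeof(buf);
    info->index = index;
    if (ioctl(device_fd, VFIO_DEVICE_GET_REGION_INFO, info) < 0 ||
        !(info->flags & VFIO_REGION_INFO_FLAG_CAPS))
        return false;

    /* Walk the capability chain looking for the type/subtype cap. */
    for (hdr = (void *)(buf + info->cap_offset); ;
         hdr = (void *)(buf + hdr->next)) {
        if (hdr->id == VFIO_REGION_INFO_CAP_TYPE) {
            struct vfio_region_info_cap_type *cap = (void *)hdr;

            return cap->type == REGION_TYPE_MIGRATION &&
                   cap->subtype == REGION_SUBTYPE_DIRTY_BITMAP;
        }
        if (!hdr->next)
            return false;
    }
}

A device that does not implement dirty page tracking simply would not
expose such a region, so userspace could fail or fall back at setup
time instead of finding out mid-migration.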