Re: [Qemu-devel] [PATCH v4 43/47] Host page!=target page: Cleanup bitmaps

David Gibson Tue, 27 Jan 2015 21:39:50 -0800

On Tue, Jan 27, 2015 at 10:04:31AM +0000, Dr. David Alan Gilbert wrote:
> * David Gibson (da...@gibson.dropbear.id.au) wrote:
> > On Wed, Dec 17, 2014 at 06:21:34PM +0000, Dr. David Alan Gilbert wrote:
> > > * David Gibson (da...@gibson.dropbear.id.au) wrote:
> > > > On Fri, Oct 03, 2014 at 06:47:49PM +0100, Dr. David Alan Gilbert (git) 
> > > > wrote:
> > > > > From: "Dr. David Alan Gilbert" <dgilb...@redhat.com>
> > > > > 
> > > > > Prior to the start of postcopy, ensure that everything that will
> > > > > be transferred later is a whole host-page in size.
> > > > > 
> > > > > This is accomplished by discarding partially transferred host pages
> > > > > and marking any that are partially dirty as fully dirty.
> > > > > 
> > > > > Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> > > > > ---
> > > > >  arch_init.c | 112 
> > > > > +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
> > > > >  1 file changed, 111 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/arch_init.c b/arch_init.c
> > > > > index 1fe4fab..aac250c 100644
> > > > > --- a/arch_init.c
> > > > > +++ b/arch_init.c
> > > > > @@ -1024,7 +1024,6 @@ static uint32_t get_32bits_map(unsigned long 
> > > > > *map, int64_t start)
> > > > >   * A helper to put 32 bits into a bit map; trivial for 
> > > > > HOST_LONG_BITS=32
> > > > >   * messier for 64; the bitmaps are actually long's that are 32 or 
> > > > > 64bit
> > > > >   */
> > > > > -__attribute__ (( unused )) /* Until later in patch series */
> > > > >  static void put_32bits_map(unsigned long *map, int64_t start,
> > > > >                             uint32_t v)
> > > > >  {
> > > > > @@ -1153,15 +1152,126 @@ static int 
> > > > > pc_each_ram_discard(MigrationState *ms)
> > > > >  }
> > > > >  
> > > > >  /*
> > > > > + * Utility for the outgoing postcopy code.
> > > > > + *
> > > > > + * Discard any partially sent host-page size chunks, mark any 
> > > > > partially
> > > > > + * dirty host-page size chunks as all dirty.
> > > > > + *
> > > > > + * Returns: 0 on success
> > > > > + */
> > > > > +static int postcopy_chunk_hostpages(MigrationState *ms)
> > > > > +{
> > > > > +    struct RAMBlock *block;
> > > > > +    unsigned int host_bits = sysconf(_SC_PAGESIZE) / 
> > > > > TARGET_PAGE_SIZE;
> > > > > +    uint32_t host_mask;
> > > > > +
> > > > > +    /* Should be a power of 2 */
> > > > > +    assert(host_bits && !(host_bits & (host_bits - 1)));
> > > > > +    /*
> > > > > +     * If the host_bits isn't a division of 32 (the minimum long 
> > > > > size)
> > > > > +     * then the code gets a lot more complex; disallow for now
> > > > > +     * (I'm not aware of a system where it's true anyway)
> > > > > +     */
> > > > > +    assert((32 % host_bits) == 0);
> > > > 
> > > > This assert makes the first one redundant.
> > > 
> > > True I guess, removed the power of 2 check.
> > > 
> > > <snip>
> > > 
> > > > > +/*
> > > > >   * Transmit the set of pages to be discarded after precopy to the 
> > > > > target
> > > > >   * these are pages that have been sent previously but have been 
> > > > > dirtied
> > > > >   * Hopefully this is pretty sparse
> > > > >   */
> > > > >  int ram_postcopy_send_discard_bitmap(MigrationState *ms)
> > > > >  {
> > > > > +    int ret;
> > > > > +
> > > > >      /* This should be our last sync, the src is now paused */
> > > > >      migration_bitmap_sync();
> > > > >  
> > > > > +    /* Deal with TPS != HPS */
> > > > > +    ret = postcopy_chunk_hostpages(ms);
> > > > > +    if (ret) {
> > > > > +        return ret;
> > > > > +    }
> > > > 
> > > > This really seems like a bogus thing to be doing on the outgoing
> > > > migration side.  Doesn't the host page size constraint come from the
> > > > destination (due to the need to atomically instate pages).  Source
> > > > host page size == destination host page size doesn't seem like it
> > > > should be an inherent constraint
> > > 
> > > It's not an inherent constraint; it just makes life messier. I had
> > > some code to deal with it but it complicates things even more, and
> > > I've not got anything to test that rare case with; if someone is
> > > desperate for it then it can be added.
> > 
> > So, I'm all for deferring implementation improvements that we don't
> > need for the time being.
> > 
> > What worries me though, is having the source have to make assumptions
> > about how the migration stream will be processed on the destination
> > that aren't somehow baked into the protocol itself.  i.e. I think we
> > should really try to avoid the possibility of migration streams that
> > are structurally sound, and look like they should be valid, but
> > aren't, because of subtle constraints in the order and manner in which
> > the destination needs to process the individual chunks.
> 
> Agreed; see below.
> 
> > > > and it's not clear why you can't do
> > > > this rounding out to host page sized chunks on the receive end.
> > > 
> > > The source keeps track of which pages still need sending, and so
> > > has to update that list when it tells the destination to perform
> > > a discard.
> > 
> > Ah.
> > 
> > > If the destination discards more than the source told it to (for
> > > example because it has bigger host-pages) the source would need
> > > to update it's map of the pages that still need sending.
> > 
> > I'm beginning to wonder if what we really need is for early in the
> > migration process the destination to tell the host what granularity of
> > updates it can handle (based on its page size).
> > 
> > Perhaps the short summary is that I don't think we need to actually
> > handle the case of different source and dest host page sizes.  BUT,
> > if that does happen the migration process should be able to detect
> > that that's what's gone wrong and print a meaningful error, rather
> > than having the destination blow up part way through and deep the code
> > because chunk constraints necessary for the dest host page size
> > haven't been met by the source.
> 
> Right; I cut the problem in the opposite direction and made the source
> send the destination its page sizes in the 'advise' message and
> the destination validates it and spits a:
>    Postcopy needs matching host page sizes (s=%d d=%d)"
> 
> (That's in 21/47 Add wrappers and handlers....)
> It's just a little easier to do it that way rather than having to
> make the source wait for the destination.


I guess that's ok.  It's still a bit ugly, because it means if we were
ever to support different source and dest host page sizes, it would
necessarily require a protocol change (unless I've missed something).
But at least the errors would be obvious.

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

pgpQhpYzJiNCn.pgp
Description: PGP signature

Re: [Qemu-devel] [PATCH v4 43/47] Host page!=target page: Cleanup bitmaps

Reply via email to