On 06/16/2015 04:51 PM, Stefano Stabellini wrote:
> On Tue, 16 Jun 2015, Wei Liu wrote:
>> On Wed, Jun 10, 2015 at 01:55:13PM +0100, George Dunlap wrote:
>>> This reverts commit c1d322e6048796296555dd36fdd102d7fa2f50bf.
>>>
>>> The original commit fixes a bug when assigning a large number of
>>> devices which require option roms to a guest.  (One known
>>> configuration that needs extra memory is having more than 4 emulated
>>> NICs assigned.  Three or fewer NICs seems to work without this
>>> functionality.)
>>>
>>> However, by unilaterally increasing maxmem, it introduces two
>>> problems.
>>>
>>> First, now libxl's calculation of the required maxmem during migration
>>> is broken -- any guest which exercised this functionality will fail on
>>> migration.  (Guests which have the default number of devices are not
>>> affected.)
>>>
>>> Secondly, it makes it impossible for a higher-level toolstack or
>>> administer to predict how much memory a VM will actually use, making
>>> it much more difficult to effectively use all of the memory on a
>>> machine.
>>>
>>> The right solution to the original problem is to figure out a way for
>>> qemu to take pages from the existing pool of guest memory, rather than
>>> allocating more pages.
>>>
>>> That fix will take more time to develop than we have until the feature
>>> freeze.  In the mean time, the simplest way to fix the migration issue
>>> is to revert this change.  That will re-introduce the original bug,
>>> but it's an unusual corner case; and without migration it isn't fully
>>> functional yet anyway.
>>>
>>> Signed-off-by: George Dunlap <george.dun...@eu.citrix.com>
>>> ---
>>> I do think this is the right approach, but I'm mainly sending this is
>>> mainly to open up discussion.
>>>
>>> CC: Stefano Stabellini <stefano.stabell...@citrix.com>
>>> CC: Wei Liu <wei.l...@citrix.com>
>>> CC: Ian Campbell <ian.campb...@citrix.com>
>>> CC: Andrew Cooper <andrew.coop...@citrix.com>
>>
>> Stefano, Andrew, any comments?
>>
>> If we're to do this we need to do it now.
>>
>> I think reverting this change in QEMU and relevant changes in libxl
>> would be the most viable solution to solve this for this release.
> 
> Reverting this patch doesn't really solve the problem: instead of
> breaking on migration when the VM has more than 3 emulated NICs, the VM
> simply refuses to start in that case. I guess it can be considered a
> small improvement but certainly not a fix.
> 
> Given that the migration issue only happens in an "unusual corner case",
> are we really in a hurry to revert this commit and go back to the
> failure to start, even before we actually figure out what the proper fix
> is?

I'm in a hurry to go back to a world where qemu doesn't unexpectedly
allocate more RAM to a guest.  If the real problem with the original
patch was that it broke migration, we could fix that pretty easily; but
the real problem (to me) with the original patch is that it
unpredicatably allocates more memory to a guest that the toolstack
doesn't know about.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
http://lists.xen.org/xen-devel

Reply via email to