On 14.04.2016 13:47, Dr. David Alan Gilbert wrote:
> * Thomas Huth (th...@redhat.com) wrote:
> 
>> That would mean a regression compared to what we have today. Currently,
>> the ballooning is working OK for 64k guests on a 64k ppc host - rather
>> by chance than on purpose, but it's working. The guest is always sending
>> all the 4k fragments of a 64k page, and QEMU is trying to call madvise()
>> for every one of them, but the kernel is ignoring madvise() on
>> non-64k-aligned addresses, so we end up with a situation where the
>> madvise() frees a whole 64k page which is also declared as free by the
>> guest.
> 
> I wouldn't worry about migrating your fragment map; but I wonder if it
> needs to be that complex - does the guest normally do something more sane
> like do the 4k pages in order and so you've just got to track the last
> page it tried rather than having a full map?

That's maybe a little bit easier and might work for well-known Linux
guests, but IMHO it's even more of a hack than my approach: If the Linux
driver is one day switched to send the pages in the opposite order, or
if somebody tries to run a less well-known (i.e. non-Linux) guest, this
will not work at all anymore.
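
To illustrate the point about ordering, here is a minimal sketch of an
order-independent fragment map. It is not the code from the patch under
discussion: the names (frag_tracker, frag_tracker_add) are hypothetical,
a 64k host page with 4k balloon fragments is assumed, and only a single
host page is tracked at a time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_SIZE       4096u    /* virtio-balloon fragment size */
#define HOST_PAGE_SIZE  65536u   /* assumed 64k host page size */
#define FRAGS_PER_PAGE  (HOST_PAGE_SIZE / FRAG_SIZE)

/* Hypothetical tracker for one partially ballooned host page. */
struct frag_tracker {
    uint64_t base;    /* host-page-aligned address being collected */
    uint32_t seen;    /* one bit per 4k fragment already received */
    bool     active;  /* true while a page is partially collected */
};

/*
 * Record one 4k fragment, in whatever order the guest sends them.
 * Returns true once every fragment of the host page has been seen;
 * *page_out then holds the page base, which a caller could pass to
 * madvise(). A fragment of a different host page resets the state.
 */
static bool frag_tracker_add(struct frag_tracker *t, uint64_t addr,
                             uint64_t *page_out)
{
    uint64_t base = addr & ~(uint64_t)(HOST_PAGE_SIZE - 1);
    unsigned idx = (unsigned)((addr & (HOST_PAGE_SIZE - 1)) / FRAG_SIZE);

    assert(addr % FRAG_SIZE == 0);

    if (!t->active || t->base != base) {
        /* New host page: drop any partial state and start over. */
        t->base = base;
        t->seen = 0;
        t->active = true;
    }
    t->seen |= 1u << idx;

    if (t->seen == (1u << FRAGS_PER_PAGE) - 1) {
        t->active = false;
        *page_out = base;
        return true;
    }
    return false;
}
```

With a bitmap like this, ascending, descending or shuffled fragment
order all complete the page, whereas a "last fragment seen" check only
works for the one order it was written for.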

> A side question is whether the behaviour that's seen by
> virtio_balloon_handle_output is always actually the full 64k page; it
> calls balloon_page once for each message/element - but if all of those
> elements add back up to the full page, perhaps it makes more sense to
> reassemble it there?

That might work for 64k page size guests ... but for 4k guests, I think
you'll have a harder time reassembling a page there than with my
current approach. Or do you have a clever algorithm in mind that
could do the job well there?

 Thomas

