On 14.04.2016 13:47, Dr. David Alan Gilbert wrote:
> * Thomas Huth (th...@redhat.com) wrote:
>
>> That would mean a regression compared to what we have today. Currently,
>> the ballooning is working OK for 64k guests on a 64k ppc host - rather
>> by chance than on purpose, but it's working. The guest is always sending
>> all the 4k fragments of a 64k page, and QEMU is trying to call madvise()
>> for every one of them, but the kernel is ignoring madvise() on
>> non-64k-aligned addresses, so we end up with a situation where the
>> madvise() frees a whole 64k page which is also declared as free by the
>> guest.
>
> I wouldn't worry about migrating your fragment map; but I wonder if it
> needs to be that complex - does the guest normally do something more sane,
> like freeing the 4k pages in order, so you just have to track the last
> page it tried rather than keeping a full map?
That may be a little easier, and it might work for well-known Linux guests, but IMHO it is even more of a hack than my approach: if the Linux driver is one day changed to send the pages in the opposite order, or if somebody runs a non-well-known (i.e. non-Linux) guest, it does not work at all anymore.

> A side question is whether the behaviour that's seen by
> virtio_balloon_handle_output is always actually the full 64k page; it
> calls balloon_page once for each message/element - but if all of those
> elements add back up to the full page, perhaps it makes more sense to
> reassemble it there?

That might work for guests with a 64k page size ... but for 4k guests, I think you would have a hard time reassembling a page there more easily than with my current approach. Or do you have a clever algorithm in mind that could do the job well there?

 Thomas