Consider the following sequence:
1. Boot fresh VM (say, a boring 1GB VM) => Resident set is small, say 100M
2. Touch all the memory (with a utility or something) => Resident set is ~1G
3. Send QMP "balloon 500" => Resident set is ~500M
4. Now, migrate the VM => Resident set is 1G again
This suggests to me that migration is not accounting for
what memory was ballooned.
I suspect this is because the migration_bitmap does not coordinate
with the list of ballooned-out memory that was madvise()'d.
This affects RDMA as well as TCP on the sender side.
Is there any hard reason why we're not validating migration_bitmap against
the memory that was madvise()'d?
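
To make the question concrete, here is a minimal sketch of the kind of
check I have in mind. It is not QEMU code: the names `to_send`, `ballooned`,
`page_bit_set()`, `page_bit_test()` and `mask_ballooned_pages()` are all
hypothetical, and it assumes the balloon side keeps a per-page bitmap of
what it has handed back to the host, which may not match how the balloon
device actually tracks things.

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper: mark page `pfn` in `bitmap`. */
static void page_bit_set(unsigned long *bitmap, size_t pfn)
{
    bitmap[pfn / BITS_PER_WORD] |= 1UL << (pfn % BITS_PER_WORD);
}

/* Hypothetical helper: return 1 if page `pfn` is marked in `bitmap`. */
static int page_bit_test(const unsigned long *bitmap, size_t pfn)
{
    return (int)((bitmap[pfn / BITS_PER_WORD] >> (pfn % BITS_PER_WORD)) & 1UL);
}

/*
 * Drop every page from `to_send` (standing in for the migration bitmap)
 * that is marked in `ballooned` (standing in for a record of pages the
 * balloon has released). Both bitmaps must cover `nr_pages` pages,
 * rounded up to a whole number of words.
 *
 * Returns the number of pages still queued for sending.
 */
static size_t mask_ballooned_pages(unsigned long *to_send,
                                   const unsigned long *ballooned,
                                   size_t nr_pages)
{
    size_t nr_words = (nr_pages + BITS_PER_WORD - 1) / BITS_PER_WORD;
    size_t remaining = 0;
    size_t i;

    /* Word-at-a-time AND-NOT: keep only pages that are not ballooned. */
    for (i = 0; i < nr_words; i++) {
        to_send[i] &= ~ballooned[i];
    }

    /* Count what is left so the caller can see the effect. */
    for (i = 0; i < nr_pages; i++) {
        remaining += (size_t)page_bit_test(to_send, i);
    }
    return remaining;
}
```

With 8 pages queued and pages 2 and 5 ballooned out, the call leaves 6
pages to send. A real version would presumably also have to clear a page's
`ballooned` bit when the balloon deflates, otherwise pages the guest has
started using again would be skipped.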
- Michael R. Hines