On 11/06/2009 06:09 PM, Anthony Liguori wrote:
>> No, it's an argument against fork() of large programs.
> After putting together a workaround, I'm starting to have my doubts
> about how real of a problem this is.
> You're only write protecting memory, correct?
The kernel write protects qemu memory, and kvm write protects shadow
page table entries. Live migration only does the second.
> So it's equivalent to enabling dirty tracking during live migration.
> In my mind, if the cost associated with hot plug is a fraction of the
> cost of live migration, we're in good shape.
I don't see why. Live migration is pretty expensive, but that doesn't
mean NIC hotplug should be. Deployments where live migration isn't a
concern (for example, performance critical guests that use device
assignment) don't suffer the live migration penalty so they shouldn't
expect to see a gratuitous NIC hotplug penalty that is a fraction of that.
Come to think of it, we probably have some fork() breakage with device
assignment since we can't write protect pages assigned to the iommu.
> It's not likely that a 16GB guest is going to write-fault in its
> entire memory range immediately. In fact, it's likely to be
> amortized over a very long period of time so I have a hard time
> believing this is really an issue in practice.
It depends on the workload. With large pages in both host and guest you
can touch 10M pages/sec without difficulty. Once you write protect them
this drops to maybe 0.3M pages/sec. The right workload will suffer
pretty badly from this.
> Arguably, it's a much bigger problem for live migration.
It is. I once considered switching live migration to shadow pte dirty
bit tracking instead of write protection, but ept doesn't have dirty
bits, so this will only help a minority of deployments by the time it
reaches users.
So vfork() is required, or in light of its man page and glowing
recommendations from the security people, we can mark guest memory as
shared on fork and use plain fork(), like we do for pre-mmu-notifier
kernels.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.