On 11/05/2009 09:58 PM, Anthony Liguori wrote:
Avi Kivity wrote:
Helpers are really bad. On launch, I find them fragile and hard to do
proper error handling with (but that's probably just me). But the
real problem is at runtime: if you have a 16GB guest then you have to
write-protect 4M ptes and then kvm has to tear down or write-protect
(not sure which mmu notifier is called) 4M shadow ptes. Once that's
done, the guest will have to fault its way back; that's at least 4M
exits, around 10 seconds worth of cpu time to execute a couple of
syscalls.
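
The figures above can be checked with a little arithmetic. This is only a sketch: it assumes 4 KiB base pages (the usual x86 page size, not stated in the mail) and takes the "around 10 seconds" estimate at face value. The function names are made up for illustration.

```python
# Back-of-the-envelope check of the numbers quoted in the mail.
# Assumption: 4 KiB base pages; the 10 second figure is the mail's estimate.

PAGE_SIZE = 4096            # bytes per page (assumed)
GUEST_BYTES = 16 * 2**30    # 16GB guest, as in the mail


def pte_count(guest_bytes, page_size=PAGE_SIZE):
    """Number of page table entries needed to map the whole guest."""
    return guest_bytes // page_size


def per_exit_cost_us(total_seconds, exits):
    """Average cost of one exit in microseconds, given the total time."""
    return total_seconds / exits * 1e6


if __name__ == "__main__":
    ptes = pte_count(GUEST_BYTES)
    cost = per_exit_cost_us(10.0, ptes)
    # prints: 4194304 ptes (4M), ~2.38 us per exit
    print(f"{ptes} ptes ({ptes // 2**20}M), ~{cost:.2f} us per exit")
```

So "4M ptes" is 16GB divided by 4 KiB, and 10 seconds spread over 4M exits works out to a little over 2 microseconds per fault.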
If this is such an issue, then it's something that ought to be fixed
in the kernel.
Now here you can say "tremendously difficult" without danger of
exaggeration. I looked at write-protecting the top 512 pgd entries
instead of the 4M pte entries; that's difficult enough. And then when you
fault write access back in, you have to figure out that no sharing is
possibly going on underneath so you can grant write access at the
pgd/pud/pmd level instead of the pte level. There's currently nothing
in Linux that can help with this as sharing is tracked at the page level.
For kvm you have to extend this to mmu notifiers; without npt/ept
there's simply no hope (no correspondence between shadow and host page
tables); with them things are a little easier, though still pretty bad,
as memory won't be aligned the same way.
It's only really applicable to hotplug anyway as you wouldn't have
faulted in the memory when initially launching the guest.
If we're doing something for management systems we have to consider
hotplug. It would be pretty mean to offer libvirt an easy way to set up
bridging only to have them track down bugs later where the guest freezes
for tens of milliseconds and later slows down after a hotplug, then
rewrite their code not to use the helper.
I know this has been discussed before, but isn't this why there are
things like vfork()?
vfork() doesn't work with threads - it requires that the calling process
be halted until exec() is called.
Instead of doing silly things in qemu, if there is concern about
this, then it should be fixed in Linux properly.
Of course there is concern about it, and you don't have to do anything
silly to qemu to avoid it. Just don't call helpers while it's running.
I'd much prefer a small daemon serving taps on a unix-domain socket.
Of course, management should talk to that daemon, not qemu.
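
A daemon like that would presumably hand the open tap to its client as a file descriptor over the unix-domain socket (SCM_RIGHTS ancillary data). Below is a minimal sketch of just the passing step, not the actual design. It assumes Python 3.9+ on a Unix system, and a pipe stands in for the tap so it runs without privileges. The names send_fd, recv_fd and demo are hypothetical.

```python
import os
import socket


def send_fd(sock, fd, label=b"tap"):
    """Daemon side: pass one open fd to the peer via SCM_RIGHTS."""
    socket.send_fds(sock, [label], [fd])


def recv_fd(sock):
    """Client side: receive the label and the fd sent with it."""
    msg, fds, _flags, _addr = socket.recv_fds(sock, 1024, 1)
    return msg, fds[0]


def demo():
    # A socketpair stands in for the daemon's listening unix-domain socket.
    daemon_end, client_end = socket.socketpair(socket.AF_UNIX,
                                               socket.SOCK_STREAM)
    # Assumption: a pipe stands in for the tap device the daemon would open.
    r, w = os.pipe()
    send_fd(daemon_end, w)
    os.close(w)                     # the daemon drops its own copy
    label, fd = recv_fd(client_end)
    os.write(fd, b"hello")          # the client can use the fd directly
    os.close(fd)
    data = os.read(r, 16)
    os.close(r)
    daemon_end.close()
    client_end.close()
    return label, data


if __name__ == "__main__":
    print(demo())
```

The receiver never needs the privileges required to create the device; only the process that opened the fd does, and nothing is forked from the receiving process.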
I'd rather not have a program running with elevated privileges when it's
not needed.
suid helpers are dangerous whenever they are on disk; daemons are
dangerous only when running.
--
Do not meddle in the internals of kernels, for they are subtle and quick to
panic.