On 02/28/2010 11:19 AM, Michael S. Tsirkin wrote:
Both have security implications, so I think it's important that they
be addressed. Otherwise, I'm pretty happy with how things are.
Care suggesting some solutions?
The obvious thing to do would be to use the memory notifier in vhost to
keep track of whenever something remaps the ring's memory region and,
if that happens, issue an ioctl to vhost to change the location of the
ring. You would also need to merge the vhost slot management code with
the KVM slot management code.
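Concretely, the remap handler would boil down to something like this
(just a sketch: the notifier hook itself doesn't exist yet, and I'm
assuming the vhost fd is dev->control as in the current patches;
VHOST_SET_VRING_ADDR is the existing ioctl):

#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>

/* sketch: called when the region backing a ring has been remapped */
static void vhost_ring_remapped(struct vhost_dev *dev, unsigned idx,
                                void *desc, void *avail, void *used)
{
    struct vhost_vring_addr addr = {
        .index           = idx,
        .desc_user_addr  = (uintptr_t)desc,
        .avail_user_addr = (uintptr_t)avail,
        .used_user_addr  = (uintptr_t)used,
    };

    /* tell vhost where the ring lives now */
    if (ioctl(dev->control, VHOST_SET_VRING_ADDR, &addr) < 0) {
        perror("VHOST_SET_VRING_ADDR");
    }
}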
I'm sympathetic to your arguments though. As qemu is today, the above
is definitely the right thing to do. But ram is always ram and ram
always has a fixed (albeit non-linear) mapping within a guest. We can
probably be smarter in qemu.
There are areas of MMIO/ROM address space that *sometimes* end up
behaving like ram, but that's a different case. The one other case to
consider is ram hot add/remove, where ram may be removed or added (but
its location will never change during its lifetime).
Here's what I'll propose, and I'd really like to hear what Paul thinks
about it before we start down this path.
I think we should add a new API that's:
void cpu_ram_add(target_phys_addr_t start, ram_addr_t size);
This API would do two things. It would call qemu_ram_alloc() and
cpu_register_physical_memory(), just as the code does today. It would also
add this region into a new table.
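Roughly like this (just a sketch -- the table layout and names are made
up, not a final design):

/* illustrative only; would live next to the exec.c ram code */
typedef struct RamRegion {
    target_phys_addr_t start;   /* guest physical base */
    ram_addr_t size;
    ram_addr_t ram_addr;        /* offset from qemu_ram_alloc() */
    int refcount;               /* outstanding cpu_ram_map() users */
    bool deleted;               /* hot-removed; free when refcount hits 0 */
    QLIST_ENTRY(RamRegion) next;
} RamRegion;

static QLIST_HEAD(, RamRegion) ram_regions =
    QLIST_HEAD_INITIALIZER(ram_regions);

void cpu_ram_add(target_phys_addr_t start, ram_addr_t size)
{
    RamRegion *r = qemu_mallocz(sizeof(*r));

    r->start = start;
    r->size = size;
    r->ram_addr = qemu_ram_alloc(size);
    cpu_register_physical_memory(start, size, r->ram_addr);
    QLIST_INSERT_HEAD(&ram_regions, r, next);
}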
There would be:
void *cpu_ram_map(target_phys_addr_t start, ram_addr_t *size);
void cpu_ram_unmap(void *mem);
These calls would use this new table to look up ram addresses. These
mappings remain valid for as long as the guest is running. Within the table,
each region would have a reference count. When it comes time to do hot
add/remove, we would wait to remove a region until the reference count
went to zero to avoid unmapping during DMA.
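In sketch form (the lookup is just a walk of the table above):

void *cpu_ram_map(target_phys_addr_t start, ram_addr_t *size)
{
    RamRegion *r;

    QLIST_FOREACH(r, &ram_regions, next) {
        if (start >= r->start && start < r->start + r->size) {
            r->refcount++;
            *size = r->start + r->size - start;   /* contiguous bytes */
            return qemu_get_ram_ptr(r->ram_addr + (start - r->start));
        }
    }
    return NULL;   /* not ram */
}

void cpu_ram_unmap(void *mem)
{
    RamRegion *r;

    QLIST_FOREACH(r, &ram_regions, next) {
        uint8_t *host = qemu_get_ram_ptr(r->ram_addr);

        if ((uint8_t *)mem >= host && (uint8_t *)mem < host + r->size) {
            if (--r->refcount == 0 && r->deleted) {
                /* hot-remove was deferred until the last mapping went
                 * away; tear the region down now */
                QLIST_REMOVE(r, next);
                qemu_free(r);
            }
            return;
        }
    }
}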
cpu_ram_add() never gets called with overlapping regions. We'll modify
cpu_register_physical_memory() to ensure that a ram mapping is never
changed after initial registration.
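For instance, something along these lines in
cpu_register_physical_memory() (sketch; exactly where the check lives
is TBD):

/* sketch: refuse to change an established ram mapping */
static void check_ram_mapping(target_phys_addr_t start, ram_addr_t size,
                              ram_addr_t phys_offset)
{
    RamRegion *r;

    QLIST_FOREACH(r, &ram_regions, next) {
        if (start < r->start + r->size && start + size > r->start &&
            (phys_offset & TARGET_PAGE_MASK) !=
                r->ram_addr + (start - r->start)) {
            hw_error("attempt to change established ram mapping");
        }
    }
}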
vhost no longer needs to bother keeping its dynamic table up to date, so
all of the slot management code can be removed from vhost. KVM still
needs the code to handle rom/ram mappings, but we can take care of that next.
virtio-net's userspace code can do the same thing as vhost and only map
the ring once, which should be a big performance improvement.
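e.g. virtio could map the ring once when the guest programs the queue
address, along these lines (sketch; map_vring() is a made-up helper):

/* sketch: map the vring once at setup instead of per-access */
static void *map_vring(target_phys_addr_t ring_gpa, ram_addr_t ring_len)
{
    ram_addr_t len;
    void *ring = cpu_ram_map(ring_gpa, &len);

    if (!ring || len < ring_len) {
        /* bogus guest address, or the ring straddles regions */
        if (ring) {
            cpu_ram_unmap(ring);
        }
        return NULL;
    }
    return ring;   /* stays valid until cpu_ram_unmap() on reset */
}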
It also introduces a place to do madvise() reset registrations.
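e.g. a reset handler that walks the table (sketch; assuming
MADV_DONTNEED is the behavior we want, registered once via
qemu_register_reset()):

#include <sys/mman.h>

/* sketch: drop the backing pages of every region on system reset */
static void ram_reset_handler(void *opaque)
{
    RamRegion *r;

    QLIST_FOREACH(r, &ram_regions, next) {
        if (madvise(qemu_get_ram_ptr(r->ram_addr), r->size,
                    MADV_DONTNEED) < 0) {
            perror("madvise MADV_DONTNEED");
        }
    }
}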
This is definitely appropriate for target-i386. I suspect it is for
other architectures too.
Regards,
Anthony Liguori
Furthermore, vhost reduces a virtual machine's security. It offers an
impressive performance boost (particularly when dealing with 10gbit+
networking), but for a user who doesn't have such strong networking
performance requirements, I think it's reasonable that they wouldn't
want to make a security trade-off.
It's hard for me to see how it reduces VM security. If it does, it's
not by design and will be fixed.
If you have a bug in vhost-net (would never happen, of course) then it's
a host-kernel exploit, whereas if we have a bug in virtio-net userspace,
it's a local user exploit. We have a pretty robust architecture for
dealing with local user exploits (qemu can run unprivileged, SELinux
enforces mandatory access control), but there is no such line of defense
against a host-kernel exploit.
I'm not saying that we should never put things in the kernel, but
there's definitely a security vs. performance trade-off here.
Regards,
Anthony Liguori
Not sure I get the argument completely. Any kernel service with a bug
might be exploited for privilege escalation. Yes, more kernel code
gives you more attack surface, but given that we already use rich
interfaces such as the ones exposed by kvm, I am not sure by how much.
Also note that vhost-net does not take qemu out of the equation for
everything, just for datapath operations.