On Mon, Dec 11, 2017 at 01:53:40PM +0000, Wang, Wei W wrote:
> On Monday, December 11, 2017 7:12 PM, Stefan Hajnoczi wrote:
> > On Sat, Dec 09, 2017 at 04:23:17PM +0000, Wang, Wei W wrote:
> > > On Friday, December 8, 2017 4:34 PM, Stefan Hajnoczi wrote:
> > > > On Fri, Dec 8, 2017 at 6:43 AM, Wei Wang <wei.w.w...@intel.com> wrote:
> > > > > On 12/08/2017 07:54 AM, Michael S. Tsirkin wrote:
> > > > >> On Thu, Dec 07, 2017 at 06:28:19PM +0000, Stefan Hajnoczi wrote:
> > > > >>> On Thu, Dec 7, 2017 at 5:38 PM, Michael S. Tsirkin
> > > > >>> <m...@redhat.com> wrote:
> > > > >
> > > > > Thanks Stefan and Michael for the sharing and discussion. I
> > > > > think the above 3 and 4 are debatable (e.g. whether it is
> > > > > simpler really depends). 1 and 2 are implementations; I think
> > > > > both approaches could implement the device that way. We
> > > > > originally thought about one device and driver to support all
> > > > > types (we sometimes called it a "transformer" :-) ). That
> > > > > would look interesting from a research point of view, but from
> > > > > a real usage point of view I think it would be better to have
> > > > > them separated, because:
> > > > > - different device types have different driver logic; mixing
> > > > > them together would make the driver look messy. Imagine that a
> > > > > networking driver developer has to go over the block-related
> > > > > code to debug; that also increases the difficulty.
> > > >
> > > > I'm not sure I understand where things get messy, because:
> > > > 1. The vhost-pci device implementation in QEMU relays messages
> > > > but has no device logic, so device-specific messages like
> > > > VHOST_USER_NET_SET_MTU are trivial at this layer.
> > > > 2. vhost-user slaves only handle certain vhost-user protocol
> > > > messages. They handle device-specific messages for their device
> > > > type only. This is like vhost drivers today, where the ioctl()
> > > > function returns an error if the ioctl is not supported by the
> > > > device. It's not messy.
> > > >
> > > > Where are you worried about messy driver logic?
> > >
> > > Probably I didn’t explain well; please let me summarize my thought
> > > a little bit, from the perspective of the control path and the
> > > data path.
> > >
> > > Control path: the vhost-user messages. I would prefer to just have
> > > the interaction between the QEMUs, instead of relaying to the
> > > GuestSlave, because
> > > 1) I think the claimed advantage (easier to debug and develop)
> > > doesn’t seem very convincing
> >
> > You are defining a mapping from the vhost-user protocol to a custom
> > virtio device interface. Every time the vhost-user protocol
> > (feature bits, messages, etc.) is extended, it will be necessary to
> > map the new extension to the virtio device interface.
> >
> > That's non-trivial. Mistakes are possible when designing the
> > mapping. Using the vhost-user protocol as the device interface
> > minimizes the effort and the risk of mistakes, because most
> > messages are relayed 1:1.
> >
> > > 2) some messages can be directly answered by QemuSlave, and some
> > > messages are not useful to give to the GuestSlave (inside the
> > > VM), e.g. the fds and VhostUserMemoryRegion from the
> > > SET_MEM_TABLE msg (the device first maps the master memory and
> > > gives the guest the offset of the mapped gpa in terms of the bar,
> > > i.e., where it sits in the bar; if we gave the raw
> > > VhostUserMemoryRegion to the guest, it wouldn’t be usable).
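[For concreteness, a minimal sketch of the SET_MEM_TABLE translation
described in 2) above. All names here (mem_region, bar_region,
translate_mem_table) are made up for illustration; this is not the
actual vhost-pci code:

  /* Hypothetical sketch: turn VHOST_USER_SET_MEM_TABLE regions into
   * offsets inside the vhost-pci memory BAR. */
  #include <stdint.h>
  #include <sys/mman.h>

  struct mem_region {            /* one VhostUserMemoryRegion entry */
      uint64_t guest_phys_addr;  /* master's GPA */
      uint64_t size;
      uint64_t mmap_offset;      /* offset into the region's fd */
  };

  struct bar_region {            /* what the guest sees instead */
      uint64_t guest_phys_addr;  /* master's GPA, kept for translation */
      uint64_t size;
      uint64_t bar_offset;       /* where the region sits in the bar */
  };

  /* Map each master region behind the device BAR and record its
   * offset. The fds never reach the guest; only bar_offset does. */
  static int translate_mem_table(const struct mem_region *in,
                                 const int *fds, int n,
                                 struct bar_region *out)
  {
      uint64_t bar_offset = 0;

      for (int i = 0; i < n; i++) {
          void *p = mmap(NULL, in[i].size, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fds[i], in[i].mmap_offset);
          if (p == MAP_FAILED) {
              return -1;
          }
          /* In QEMU, p would now be wired up as a subregion of the
           * device's memory BAR at bar_offset. */
          out[i].guest_phys_addr = in[i].guest_phys_addr;
          out[i].size = in[i].size;
          out[i].bar_offset = bar_offset;
          bar_offset += in[i].size;
      }
      return 0;
  }

The message relayed to the guest would then carry bar_offset values
rather than fds, which is exactly the modification being discussed.]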
> > I agree that QEMU has to handle some of the messages, but it
> > should still relay all (possibly modified) messages to the guest.
> >
> > The point of using the vhost-user protocol is not just to use a
> > familiar binary encoding, it's to match the semantics of vhost-user
> > 100%. That way the vhost-user software stack can work either in
> > host userspace or with vhost-pci without significant changes.
> >
> > Using the vhost-user protocol as the device interface doesn't seem
> > any harder than defining a completely new virtio device interface.
> > It has the advantages that I've pointed out:
> >
> > 1. A simple 1:1 mapping for most messages that is easy to maintain
> >    as the vhost-user protocol grows.
> >
> > 2. Compatibility with vhost-user, so slaves can run in host
> >    userspace or in the guest.
> >
> > I don't see why it makes sense to define new device interfaces for
> > each device type and create a software stack that is incompatible
> > with vhost-user.
>
> I think this 1:1 mapping wouldn't be easy:
>
> 1) We will have 2 QEMU-side slaves to achieve this bidirectional
> relaying; that is, the working model will be
> - master to slave: Master->QemuSlave1->GuestSlave; and
> - slave to master: GuestSlave->QemuSlave2->Master
> QemuSlave1 and QemuSlave2 can't be the same piece of code, because
> QemuSlave1 needs to do some setup for some messages, while
> QemuSlave2 is more likely to be a true "relayer" (receive and
> directly pass on).
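[Again for illustration, the master-to-guest direction of the relay
being debated could be a single dispatch in which most messages pass
through 1:1 and only a few are rewritten. The two message constants
below follow the vhost-user specification; everything else is a
hypothetical sketch, not the actual vhost-pci code:

  /* Relay direction master -> guest: forward almost everything,
   * intercept the handful of messages QEMU must modify. */
  enum {
      VHOST_USER_SET_MEM_TABLE  = 5,   /* values per the vhost-user spec */
      VHOST_USER_SET_VRING_KICK = 12,
  };

  struct vu_msg {
      int request;
      /* payload, fds, ... omitted in this sketch */
  };

  static void rewrite_mem_table(struct vu_msg *m)
  {
      /* replace fds/GPAs with offsets into the device BAR */
  }

  static void wire_up_kick(struct vu_msg *m)
  {
      /* an eventfd cannot cross into the guest; hook it to the
       * device's doorbell/interrupt instead */
  }

  static void forward_to_guest(struct vu_msg *m)
  {
      /* push the (possibly modified) message to the guest slave */
  }

  static void relay_from_master(struct vu_msg *m)
  {
      switch (m->request) {
      case VHOST_USER_SET_MEM_TABLE:
          rewrite_mem_table(m);
          break;
      case VHOST_USER_SET_VRING_KICK:
          wire_up_kick(m);
          break;
      default:
          /* everything else, including device-specific messages such
           * as VHOST_USER_NET_SET_MTU, passes through unmodified */
          break;
      }
      forward_to_guest(m);
  }

The slave-to-master direction raised by Wei is mostly a pure
pass-through; whether the two directions can share code is the open
question.]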
I mostly agree with Wei's point here. Some messages cannot be passed
through. QEMU needs to process some messages, so that makes it both a
slave (on the host) and a master (to the guest).

> 2) poor re-usability of the QemuSlave and GuestSlave
>
> We couldn’t reuse much of the QemuSlave handling code for the
> GuestSlave. For example, for the VHOST_USER_SET_MEM_TABLE msg, none
> of the QemuSlave handling code (please see the
> vp_slave_set_mem_table function) will be used by the GuestSlave. On
> the other hand, the GuestSlave needs an implementation to reply back
> to the QEMU device, and that implementation isn't needed by the
> QemuSlave.
>
> If we want to run the same piece of slave code in both QEMU and the
> guest, then we may need an "if (QemuSlave) else" branch in each msg
> handling entry to choose between the QemuSlave and GuestSlave code
> paths.
>
> So, ideally we would like to run (reuse) one slave implementation in
> both QEMU and the guest. In practice, we would still need to handle
> each message case by case, which is no different from maintaining
> two separate slaves for QEMU and the guest, and I'm afraid this
> would be much more complex.

Are you saying QEMU's vhost-pci code cannot be reused by guest slaves?
If so, I agree, and it was not my intention to run the same slave code
in QEMU and the guest. When I referred to reusing the vhost-user
software stack, I meant something else:

1. contrib/libvhost-user/ is a vhost-user slave library. QEMU itself
   does not use it, but external programs may use it to avoid
   reimplementing vhost-user and vrings. Currently this code handles
   the vhost-user protocol over UNIX domain sockets, but it's possible
   to add vfio vhost-pci support. Programs using libvhost-user would
   be able to take advantage of vhost-pci easily (no big changes
   required).

2. DPDK and other codebases that implement custom vhost-user slaves
   are also easy to update for vhost-pci, since the same protocol is
   used. Only the lowest layer of vhost-user slave code needs to be
   touched.
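[As a sketch of the layering that point 2 implies: if the slave stack
is split over a small transport interface, only that bottom layer
changes between a host-userspace slave and a vhost-pci guest slave.
None of these names are the actual libvhost-user API; they are made up
for illustration:

  #include <stddef.h>
  #include <unistd.h>
  #include <sys/types.h>

  /* The transport is the only layer that differs between an AF_UNIX
   * slave (host userspace) and a vhost-pci slave (guest). */
  struct vu_transport {
      ssize_t (*recv)(void *opaque, void *buf, size_t len);
      ssize_t (*send)(void *opaque, const void *buf, size_t len);
      void *opaque;  /* socket fd, or vhost-pci device state */
  };

  /* AF_UNIX transport: plain read()/write() on the socket
   * (fd passing via SCM_RIGHTS omitted for brevity). */
  static ssize_t unix_recv(void *opaque, void *buf, size_t len)
  {
      return read(*(int *)opaque, buf, len);
  }

  static ssize_t unix_send(void *opaque, const void *buf, size_t len)
  {
      return write(*(int *)opaque, buf, len);
  }

  /* A vhost-pci transport would instead move messages through the
   * device (BAR or virtqueue); nothing above this layer changes. */

  /* Shared slave loop: vring setup and all protocol/device-specific
   * message handling sit on top and are transport-agnostic. */
  static void slave_loop(struct vu_transport *t)
  {
      unsigned char msg[4096];  /* vhost-user header + payload */

      while (t->recv(t->opaque, msg, sizeof(msg)) > 0) {
          /* decode the header, dispatch on request type, reply
           * with t->send() */
      }
  }

]

Stefan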