On Thu, May 19, 2016 at 04:20:40PM +0000, Yoni Gilad wrote:
> Hi,
>
> We have encountered a crash in virtio_xmit_pkts (specifically, in the call to
> virtqueue_notify) when running DPDK in a multi-process setup. This is a
> regression in DPDK 16.04.
>
> The culprit seems to be the field vtpci_ops in the virtio_hw structure. This
> field is stored in shared memory, but points to a struct in the primary
> process's address space. If the same struct was loaded in a different address
> in the secondary process, it will lead to a crash or other issues when this
> field is dereferenced there. The referenced virtio_pci_ops struct contains
> function pointers, which can also be different in the secondary process.

That indeed sounds like the culprit. Function pointers are known to be
unfriendly to multiple processes: see the 18.c section of the DPDK programmer's
guide (http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html):

    The use of function pointers between multiple processes running based
    of different compiled binaries is not supported, since the location of
    a given function in one process may be different to its location in a
    second. This prevents the librte_hash library from behaving properly as
    in a multi-threaded instance, since it uses a pointer to the hash
    function internally.

TBH, I missed this bit (multiple processes) while introducing this function
pointer; well, we never tested it before, either.

We could fix/work around it by getting the right function pointer set
dynamically, but that is far from perfect.

	--yliu
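
For illustration only, here is a minimal sketch of what "getting the right function pointer set dynamically" could look like: the shared structure keeps a plain index instead of a pointer, and each process resolves that index against its own table. All names below (shared_hw, pci_ops, bus_type, hw_ops and the two notify functions) are hypothetical stand-ins, not the real virtio PMD definitions, and this is not an actual patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; NOT the real virtio PMD types. */
enum bus_type { BUS_LEGACY = 0, BUS_MODERN = 1, BUS_TYPE_MAX };

struct pci_ops {
	uint16_t (*notify)(uint16_t queue_id);
};

/* Lives in shared memory: holds only a plain index, never a pointer. */
struct shared_hw {
	uint8_t bus_type;
};

/* Dummy callbacks so the sketch compiles on its own. */
static uint16_t legacy_notify(uint16_t q) { return q; }
static uint16_t modern_notify(uint16_t q) { return (uint16_t)(q + 0x100); }

/*
 * Per-process table: it is part of each binary's own image, so the
 * addresses in it are always valid in the process that reads them.
 */
static const struct pci_ops legacy_ops = { .notify = legacy_notify };
static const struct pci_ops modern_ops = { .notify = modern_notify };
static const struct pci_ops *const ops_table[BUS_TYPE_MAX] = {
	[BUS_LEGACY] = &legacy_ops,
	[BUS_MODERN] = &modern_ops,
};

/* Resolve the ops locally; returns NULL if the shared index is out of range. */
static const struct pci_ops *hw_ops(const struct shared_hw *hw)
{
	if (hw->bus_type >= BUS_TYPE_MAX)
		return NULL;
	return ops_table[hw->bus_type];
}
```

A caller in either the primary or a secondary process would then use hw_ops(hw)->notify(...) rather than dereferencing a pointer stored in shared memory. The cost is an extra lookup on every call, which is part of why this is far from perfect.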