Hi!

I'm building a bridge to expose vhost-user devices through VDUSE. The code is still immature, but I'm able to forward packets using dpdk-l2fwd through VDUSE to a VM. I'm now working on exposing virtiofsd, but I've hit an error I'd like to discuss.

VDUSE devices can get all the memory regions the driver is using through the VDUSE_IOTLB_GET_FD ioctl. It returns a file descriptor backing a memory region that can be mapped with mmap, plus an entry describing the mapping it contains:

* Start and last addresses from the driver's POV
* Offset of that range within the mmaped region
* Device permissions over that region

An example entry:

    [start=0xc3000][last=0xe7fff][offset=0xc3000][perm=1]

Now, when I try to map it, the userspace device cannot call mmap with any offset other than 0. So the "straightforward" mmap with size = entry.last - entry.start + 1 and offset = entry.offset does not work. I don't know if this is a limitation of Linux or of VDUSE.

Checking QEMU's subprojects/libvduse/libvduse.c:vduse_iova_add_region(), I see it handles the offset by adding it to the size instead of passing it directly as the mmap offset parameter:

    void *mmap_addr = mmap(0, size + offset, prot, MAP_SHARED, fd, 0);

I can certainly replicate that in the bridge.

Next, I send the VhostUserMemoryRegion to the vhost-user application. The struct has these members:

    struct VhostUserMemoryRegion {
        uint64_t guest_phys_addr;
        uint64_t memory_size;
        uint64_t userspace_addr;
        uint64_t mmap_offset;
    };

So I can send the offset to the vhost-user device. I can see that dpdk-l2fwd uses the same trick of adding the offset to the size of the mapped region [1], at lib/vhost/vhost_user.c:vhost_user_mmap_region():

    mmap_size = region->size + mmap_offset;
    mmap_addr = mmap(NULL, mmap_size, PROT_READ | PROT_WRITE,
                     MAP_SHARED | populate, region->fd, 0);

So mmap is called with offset == 0 and everybody is happy.

Now I'm moving to virtiofsd, and to the vm-memory crate in particular. It performs the mmap without the size += offset trick, at MmapRegionBuilder<B>::build() [2].

I can try to apply the offset + size trick in my bridge, but I don't think it is the right solution. At first glance, the right solution is to mmap with the offset, as the vm-memory crate does.
But having libvduse and DPDK apply the same trick makes me think it is a known limitation or workaround that I don't know about. What is the history of this? Can the VDUSE problem (if any) be solved? Am I missing something?

Thanks!

[1] https://github.com/DPDK/dpdk/blob/e2e546ab5bf5e024986ccb5310ab43982f3bb40c/lib/vhost/vhost_user.c#L1305
[2] https://github.com/rust-vmm/vm-memory/blob/main/src/mmap_unix.rs#L128
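
P.S. To make the two mapping strategies concrete, here is a minimal C sketch of both, assuming a plain temporary file as the backing fd. The struct and function names (mapped_region, map_with_zero_offset, map_with_offset, region_self_check) are made up for illustration and are not part of libvduse, DPDK or vm-memory. On a regular file both variants are expected to succeed and see the same bytes, so this only illustrates the pointer arithmetic; it says nothing about how the VDUSE fd behaves.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical bookkeeping struct, for illustration only. */
struct mapped_region {
    void *mmap_addr;   /* address to hand back to munmap() */
    size_t mmap_size;  /* length to hand back to munmap() */
    uint8_t *data;     /* first byte of the region of interest */
};

/* libvduse/DPDK style: map from file offset 0 and skip `offset` bytes. */
static int map_with_zero_offset(int fd, size_t size, off_t offset, int prot,
                                struct mapped_region *out)
{
    size_t mmap_size = size + (size_t)offset;
    void *addr = mmap(NULL, mmap_size, prot, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED)
        return -1;
    out->mmap_addr = addr;
    out->mmap_size = mmap_size;
    out->data = (uint8_t *)addr + offset;
    return 0;
}

/* "Straightforward" style: pass the offset to mmap itself.
 * The offset must be a multiple of the page size. */
static int map_with_offset(int fd, size_t size, off_t offset, int prot,
                           struct mapped_region *out)
{
    void *addr = mmap(NULL, size, prot, MAP_SHARED, fd, offset);
    if (addr == MAP_FAILED)
        return -1;
    out->mmap_addr = addr;
    out->mmap_size = size;
    out->data = (uint8_t *)addr;
    return 0;
}

/* Map the same range of a temporary file both ways and compare contents.
 * Returns 0 when both mappings expose identical bytes, -1 otherwise. */
static int region_self_check(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t size = 2 * (size_t)page;
    off_t offset = 3 * (off_t)page;  /* page-aligned by construction */

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;
    int fd = fileno(f);
    if (ftruncate(fd, offset + (off_t)size) != 0) {
        fclose(f);
        return -1;
    }

    struct mapped_region a, b;
    if (map_with_zero_offset(fd, size, offset, PROT_READ | PROT_WRITE, &a) != 0) {
        fclose(f);
        return -1;
    }
    if (map_with_offset(fd, size, offset, PROT_READ, &b) != 0) {
        munmap(a.mmap_addr, a.mmap_size);
        fclose(f);
        return -1;
    }

    /* Write through the first mapping, read back through the second. */
    memset(a.data, 0xAB, size);
    int rc = memcmp(a.data, b.data, size) == 0 ? 0 : -1;

    munmap(a.mmap_addr, a.mmap_size);
    munmap(b.mmap_addr, b.mmap_size);
    fclose(f);
    return rc;
}
```

The zero-offset variant reserves offset extra bytes of address space per region and needs both the mmap address and the data pointer to be tracked, which is why the sketch keeps them in separate fields.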