Hi

On Mon, Jun 4, 2018 at 11:37 AM, Gerd Hoffmann <kra...@redhat.com> wrote:
> On Fri, Jun 01, 2018 at 06:27:48PM +0200, Marc-André Lureau wrote:
>> Add to virtio-gpu devices a "vhost-user" property. When set, the
>> associated vhost-user backend is used to handle the virtio rings.
>>
>> For now, a socketpair is created for the backend to share the rendering
>> results with qemu via a simple VHOST_GPU protocol.
>
> Why isn't this a separate device, like vhost-user-input-pci?

Ok, let's have vhost-user-gpu-pci and vhost-user-vga, inheriting from
the existing devices.
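Something along these lines for the PCI proxy, perhaps (rough sketch,
untested -- the type names and the parent header location are tentative,
not the final patch):

/* rough sketch, untested: a vhost-user-gpu-pci proxy inheriting from
 * the existing virtio-gpu-pci type; all names here are tentative */
#include "qemu/osdep.h"
#include "chardev/char-fe.h"
#include "hw/virtio/virtio-pci.h"   /* assuming VirtIOGPUPCI lives here */

#define TYPE_VHOST_USER_GPU_PCI "vhost-user-gpu-pci"

typedef struct VhostUserGPUPCI {
    VirtIOGPUPCI parent_obj;   /* reuse the existing PCI proxy */
    CharBackend chardev;       /* vhost-user socket, set via -chardev */
} VhostUserGPUPCI;

static const TypeInfo vhost_user_gpu_pci_info = {
    .name          = TYPE_VHOST_USER_GPU_PCI,
    .parent        = TYPE_VIRTIO_GPU_PCI,
    .instance_size = sizeof(VhostUserGPUPCI),
};

static void vhost_user_gpu_pci_register_types(void)
{
    type_register_static(&vhost_user_gpu_pci_info);
}

type_init(vhost_user_gpu_pci_register_types)

vhost-user-vga would do the same on top of virtio-vga, and you would
start it with something like -device vhost-user-gpu-pci,chardev=vgpu
(the chardev name is just an example).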
>> +typedef struct VhostGpuUpdate {
>> +    uint32_t scanout_id;
>> +    uint32_t x;
>> +    uint32_t y;
>> +    uint32_t width;
>> +    uint32_t height;
>> +    uint8_t data[];
>> +} QEMU_PACKED VhostGpuUpdate;
>
> Hmm, when designing a new protocol I think we can do better than just
> squeezing the pixels into a tcp stream.  Use shared memory instead?  Due
> to vhost we are limited to linux anyway, so we might even consider stuff
> like dmabufs here.

Well, my goal is not to invent a new spice or wayland protocol :) I
don't care much about 2d performance at this point, more about 3d. Can
we leave 2d improvements for another day?

Besides, what would dmabufs bring us for 2d compared to shmem? (A rough
sketch of a shmem-based update is appended at the end of this mail.)

There already seems to be a lot of overhead in the roundtrip vhost-user
-> qemu -> spice worker -> spice client -> wayland/x11 -> gpu (but this
isn't necessarily so bad at 60fps or less).

Ideally, I would like to bypass qemu & spice for local rendering, but I
don't think wayland supports that kind of nested window composition (at
least, tracing messages, weston --nested doesn't show that kind of
optimization).

FWIW, here are some Unigine Heaven 4.0 benchmarks (probably within +-10%):

qemu-gtk/egl + virtio-gpu:      fps: 2.6  / score: 64
qemu-gtk/egl + vhost-user-gpu:  fps: 12.9 / score: 329
spice + virtio-gpu:             fps: 2.8  / score: 70
spice + vhost-user-gpu:         fps: 12.1 / score: 304

That should give some extra motivation :)

-- 
Marc-André Lureau
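PS: the shmem sketch mentioned above -- untested, and the
VhostGpuUpdateShm layout is hypothetical, not part of this series. The
backend would create a memfd, pass it once over the socketpair with
SCM_RIGHTS, and updates would then only carry an offset into the shared
buffer instead of inline pixels:

/* untested sketch: shared-memory variant of the update message */
#define _GNU_SOURCE
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>    /* memfd_create (glibc >= 2.27) */
#include <sys/socket.h>
#include <unistd.h>

/* hypothetical message, replacing the inline data[] */
typedef struct VhostGpuUpdateShm {
    uint32_t scanout_id;
    uint32_t x, y, width, height;
    uint32_t offset;            /* into the shared scanout buffer */
} __attribute__((packed)) VhostGpuUpdateShm;

/* backend side: allocate the shared scanout buffer once */
static int create_scanout_buffer(size_t size, void **map)
{
    int fd = memfd_create("vhost-gpu-scanout", 0);

    if (fd < 0 || ftruncate(fd, size) < 0) {
        return -1;
    }
    *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    return *map == MAP_FAILED ? -1 : fd;
}

/* pass the fd to qemu over the socketpair with SCM_RIGHTS */
static int send_scanout_fd(int sock, int memfd, uint64_t size)
{
    struct iovec iov = { .iov_base = &size, .iov_len = sizeof(size) };
    char ctrl[CMSG_SPACE(sizeof(int))] = { 0 };
    struct msghdr msg = {
        .msg_iov = &iov, .msg_iovlen = 1,
        .msg_control = ctrl, .msg_controllen = sizeof(ctrl),
    };
    struct cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);

    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &memfd, sizeof(int));

    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}

qemu would mmap() the same fd on reception and blit from
map + update->offset, so only the small header crosses the socket.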