This patchset allows QEMU to poll and check the device's used buffers after sending all SVQ control commands, instead of polling and checking immediately after sending each one. QEMU can therefore send all the SVQ control commands in parallel, which improves performance.
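
To illustrate the difference between the two flows, here is a minimal standalone sketch. It is not QEMU code: `struct fake_dev`, `submit_cmd()`, `poll_used()`, `load_serial()` and `load_batched()` are hypothetical names made up for this example, and the "device" is just an in-memory ring. Only the `cmds_in_flight` counter and the -EINVAL return on a failed poll mirror names mentioned in the changelog.

```c
#include <errno.h>
#include <stddef.h>

#define QUEUE_SIZE 64

/* Hypothetical stand-in for a device with a control virtqueue. */
struct fake_dev {
    int pending[QUEUE_SIZE]; /* commands submitted but not yet acknowledged */
    size_t head;             /* next slot to acknowledge */
    size_t tail;             /* next free slot */
    size_t max_depth;        /* largest number of commands ever in flight */
};

/* Queue one command without waiting for the device. */
static int submit_cmd(struct fake_dev *d, int cmd)
{
    if (d->tail - d->head >= QUEUE_SIZE) {
        return -ENOSPC; /* ring is full */
    }
    d->pending[d->tail % QUEUE_SIZE] = cmd;
    d->tail++;
    if (d->tail - d->head > d->max_depth) {
        d->max_depth = d->tail - d->head;
    }
    return 0;
}

/* Poll for one used buffer; returns how many were consumed (0 or 1). */
static size_t poll_used(struct fake_dev *d)
{
    if (d->head == d->tail) {
        return 0; /* nothing to acknowledge */
    }
    d->head++;
    return 1;
}

/* Old flow: poll immediately after each command. */
static int load_serial(struct fake_dev *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int r = submit_cmd(d, (int)i);
        if (r < 0) {
            return r;
        }
        if (poll_used(d) == 0) {
            return -EINVAL;
        }
    }
    return 0;
}

/* New flow: send every command first, then drain all acknowledgements. */
static int load_batched(struct fake_dev *d, size_t n)
{
    size_t cmds_in_flight = 0;

    for (size_t i = 0; i < n; i++) {
        int r = submit_cmd(d, (int)i);
        if (r < 0) {
            return r;
        }
        cmds_in_flight++;
    }
    while (cmds_in_flight > 0) {
        if (poll_used(d) == 0) {
            return -EINVAL;
        }
        cmds_in_flight--;
    }
    return 0;
}
```

In this sketch the serial flow never has more than one command in flight, while the batched flow has all of them outstanding before the first poll. That overlap is what lets a real device work on earlier commands while later ones are still being queued.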

To build a test environment that sends multiple SVQ control commands, I use vdpa_sim_net to simulate the vdpa device and refactor vhost_vdpa_net_load() to call vhost_vdpa_net_load_mac() 30 times.

The monotonic time to finish vhost_vdpa_net_load() is as follows:

    QEMU                            microseconds
--------------------------------------------------
    not patched                       85.092
--------------------------------------------------
    patched                           79.222

So this saves about (85.092 - 79.222)/30 = 0.2 microseconds per command.

This patchset resolves the GitLab issue at
https://gitlab.com/qemu-project/qemu/-/issues/1578.

v2:
  - recover accidentally deleted rows
  - remove extra newline
  - refactor `need_poll_len` to `cmds_in_flight`
  - return -EINVAL when vhost_svq_poll() returns 0 or the check on
    buffers written by the device fails
  - change the type of `in_cursor`, and refactor the code for updating
    the cursor
  - return directly when vhost_vdpa_net_load_{mac,mq}() returns a
    failure in vhost_vdpa_net_load()

v1: https://lists.nongnu.org/archive/html/qemu-devel/2023-04/msg02668.html

Hawkins Jiawei (2):
  vdpa: rename vhost_vdpa_net_cvq_add()
  vdpa: send CVQ state load commands in parallel

 net/vhost-vdpa.c | 165 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 130 insertions(+), 35 deletions(-)

-- 
2.25.1