On 4/25/23 13:03, Parav Pandit wrote:
[...]
> I briefly looked at your RDMA command descriptor example, which is not
> aligned to 16B. Performance-wise it will be worse than NVMe RDMA fabrics.
Hi,

I'm confused here, could you please give me another hint? Do you mean:

1. The command descriptor I defined in the example is larger than the
command size of NVMe over RDMA, and the extra overhead makes performance
worse than NVMe over RDMA?

2. A command size that is not aligned to 16B hurts the RDMA SEND
operation itself? My colleague Zhuo helped me test the performance of
sending 16/24/32 bytes:
taskset -c 30 ib_send_bw -d mlx5_2 -i 1 -x 3 -s 16 -t 1 xx.xx.xx.xx
taskset -c 30 ib_send_bw -d mlx5_2 -i 1 -x 3 -s 24 -t 1 xx.xx.xx.xx
taskset -c 30 ib_send_bw -d mlx5_2 -i 1 -x 3 -s 32 -t 1 xx.xx.xx.xx
The QPS looks almost the same for all three sizes.
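To make sure we are talking about the same thing, here is a minimal
sketch of what I understand by 16B alignment (the structs below are
hypothetical, with made-up fields, not the exact descriptor from my
example):

#include <stdint.h>

/* A hypothetical 24-byte command descriptor: not a multiple of 16, so
 * descriptors stored back to back straddle 16B boundaries. */
struct cmd_desc_24 {
        uint16_t opcode;
        uint16_t id;        /* request id, echoed back in the completion */
        uint32_t length;    /* payload length in bytes */
        uint64_t addr;      /* buffer address */
        uint64_t key;       /* RDMA remote key */
};

/* The same descriptor padded to 32 bytes, the next multiple of 16, so
 * every entry in an array of descriptors starts on a 16B boundary. */
struct cmd_desc_32 {
        struct cmd_desc_24 cmd;
        uint8_t pad[8];     /* explicit padding, 24 + 8 = 32 bytes */
};

With the padded layout, sizeof(struct cmd_desc_32) is 32, so descriptors
placed back to back in a queue all stay 16B-aligned. Is this the kind of
change you are suggesting, or is alignment not the issue given the
numbers above?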
> For the PCI transport for net, we intend to start the work to improve
> descriptors, i.e. the transport binding for the net device. From our
> research I see that the abstract virtio descriptors are great today,
> but if you want to get the best out of the system (sw, hw, cpu), such
> abstraction is not the best. Carrying the "id" all the way to the
> target and bringing it back is an example of such inefficiency in your
> example.
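If I read the "id" point correctly, the inefficiency you mean looks
roughly like this (again a hypothetical sketch, not the layout from my
example or from any spec):

#include <stdint.h>

/* The initiator chooses an id and carries it to the target ... */
struct req_hdr {
        uint16_t id;        /* picked by the initiator to match replies */
        uint16_t opcode;
};

/* ... and the target must carry the very same id back unchanged. The
 * id costs wire bytes in both directions, and the target stores and
 * copies a field it never uses for its own processing. */
struct comp_hdr {
        uint16_t id;        /* echoed back verbatim */
        uint16_t status;
};

If that is the concern, one alternative might be to derive the match
from the queue slot index instead of an explicit id, but please correct
me if you mean something else.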
--
zhenwei pi