On 2023/4/25 13:03, Parav Pandit wrote:
On 4/24/2023 9:38 AM, zhenwei pi wrote:
From my point of view, there are 3 cases:
1, Host/container scenario. For example, the host kernel connects to a
virtio target block service and maps it as a vdx (virtio-blk) device (used
by a Map-Reduce service which needs a fast/large disk). The host
kernel also connects to a virtio target crypto service and maps it as a
virtio crypto device (used by nginx to accelerate HTTPS). And so on.
+----------+ +----------+ +----------+
|Map-Reduce| | nginx | ... | processes|
+----------+ +----------+ +----------+
------------------------------------------------------------
Host | | |
Kernel +-------+ +-------+ +-------+
| ext4 | | LKCF | | HWRNG |
+-------+ +-------+ +-------+
| | |
+-------+ +-------+ +-------+
| vdx | |vCrypto| | vRNG |
+-------+ +-------+ +-------+
| | |
| +--------+ |
+---------->|TCP/RDMA|<------------+
+--------+
|
+------+
|NIC/IB|
+------+
| +-------------+
+--------------------->|virtio target|
+-------------+
2, Typical virtualization environment. The workloads run in a guest,
and QEMU handles virtio-pci (or MMIO) and forwards requests to the target.
+----------+ +----------+ +----------+
|Map-Reduce| | nginx | ... | processes|
+----------+ +----------+ +----------+
------------------------------------------------------------
Guest | | |
Kernel +-------+ +-------+ +-------+
| ext4 | | LKCF | | HWRNG |
+-------+ +-------+ +-------+
| | |
+-------+ +-------+ +-------+
| vdx | |vCrypto| | vRNG |
+-------+ +-------+ +-------+
| | |
PCI --------------------------------------------------------
|
QEMU +--------------+
|virtio backend|
+--------------+
|
+------+
|NIC/IB|
+------+
| +-------------+
+--------------------->|virtio target|
+-------------+
Example #3 makes it possible to implement the virtio backend over a
fabrics initiator in user space, which is also a good use case.
It can also be done with a non-native virtio backend.
More below.
3, SmartNIC/DPU/vDPA environment. It's possible to convert a virtio-pci
request to a virtio-of request in hardware, and forward the request to
the virtio target directly.
+----------+ +----------+ +----------+
|Map-Reduce| | nginx | ... | processes|
+----------+ +----------+ +----------+
------------------------------------------------------------
Host | | |
Kernel +-------+ +-------+ +-------+
| ext4 | | LKCF | | HWRNG |
+-------+ +-------+ +-------+
| | |
+-------+ +-------+ +-------+
| vdx | |vCrypto| | vRNG |
+-------+ +-------+ +-------+
| | |
PCI --------------------------------------------------------
|
SmartNIC +---------------+
|virtio HW queue|
+---------------+
|
+------+
|NIC/IB|
+------+
| +-------------+
+--------------------->|virtio target|
+-------------+
All 3 seem valid use cases.
Use cases 1 and 2 can be achieved directly without involving any
mediation layer or any other translation layer (for example, virtio to
NFS).
Not for use case 2, at least? It said there is a virtio backend in QEMU. Or
is the only possible way to have virtio-of in the guest?
Thanks
Many block and file protocols outside of virtio already exist which
achieve this. I don't see virtio being any different in supporting this
in a native manner, mainly for the blk, fs, and crypto devices.
Use case #3 brings additional benefits and, at the same time, different
complexity, but sure, #3 is also a valid and common use case in our
experience.
In my experience working with FC, iSCSI, FCoE, NVMe RDMA fabrics, and iSER,
a virtio fabrics needs a lot of work to reach the scale, resiliency,
and lastly the security. (step by step...)
My humble suggestion is: pick one transport instead of all at once;
RDMA, being the most performant, is probably the first candidate to show
the perf gain for use cases #1 and #2 from a remote system.
I briefly looked at your RDMA command descriptor example, which is not
aligned to 16B. Perf-wise it will be poorer than NVMe RDMA fabrics.
For the PCI transport for net, we intend to start work on improving the
descriptors and the transport binding for the net device. From our research I
see that some abstract virtio descriptors are great today, but if you
want to get the best out of the system (sw, hw, cpu), such abstraction is
not the best. Sharing the "id" all the way to the target and bringing it back is
an example of such inefficiency in your example.