> -----Original Message-----
> From: Maxime Coquelin <maxime.coque...@redhat.com>
> Sent: Wednesday, October 26, 2022 12:05 AM
> To: Richardson, Bruce <bruce.richard...@intel.com>
> Cc: Hu, Jiayu <jiayu...@intel.com>; Wang, YuanX <yuanx.w...@intel.com>;
> Xia, Chenbo <chenbo....@intel.com>; dev@dpdk.org; Jiang, Cheng1
> <cheng1.ji...@intel.com>; Ma, WenwuX <wenwux...@intel.com>; He,
> Xingguang <xingguang...@intel.com>; Thomas Monjalon
> <tho...@monjalon.net>; Ilya Maximets <imaxi...@redhat.com>; David
> Marchand <david.march...@redhat.com>
> Subject: Re: [PATCH v5] net/vhost: support asynchronous data path
> 
> 
> 
> On 10/25/22 17:44, Bruce Richardson wrote:
> > On Tue, Oct 25, 2022 at 05:33:31PM +0200, Maxime Coquelin wrote:
> >>
> >>
> >> On 10/25/22 11:15, Hu, Jiayu wrote:
> >
> >>>> I think that for Vhost PMD, the Virtio completions should either be
> >>>> performed by DMA engine or by a dedicated thread.
> >>>
> >>> We cannot depend on DMA engine to do completion, as there is no
> >>> ordering guarantee on the HW. For example, given the DMA engine
> >>> issues two updates on the used ring's index, it is possible that the
> >>> second write completes before the first one.
> >>
> >> I'm not sure for Intel hardware, but other vendors may offer ordering
> >> guarantees, it should be exposed as a capability of the DMA device.
> >> If the DMA device offers this capability, it could be used for Vhost.
> >>
> >
> > While I haven't been following this discussion, this particular
> > comment caught my eye.
> >
> > For jobs submitted via a single dmadev device, the "FENCE" flag is
> > provided as part of the dmadev API[1]. Obviously, if the writes come
> > from different dmadevs, then things are rather more complicated.
>

The cost of "FENCE" is significant, as it requires the HW to stall its pipeline and
wait until all previous jobs have completed. If we ask the DMA engine to update the
used ring's index, every index update requires a "FENCE", which effectively turns
enqueue/dequeue into synchronous operations. "FENCE" would slow down the DMA engine
and quickly make it the bottleneck, and stalling the HW pipeline also wastes DMA
resources. Maintaining the ordering in SW is the more efficient way. So I don't
think relying on "FENCE" is acceptable from a performance and resource utilization
perspective.

Thanks,
Jiayu

> Thanks for the clarification Bruce.
> 
> In the Vhost PMD case, there is a 1:1 mapping between the virtqueue and
> the DMA channel, so we should be fine.
> 
> Regards,
> Maxime
> 
> > /Bruce
> >
> >
> > [1] https://doc.dpdk.org/api/rte__dmadev_8h.html#a3375e7b956b305505073c4ff035afe2f
> >
