Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

Doherty, Declan Thu, 13 Feb 2020 03:51:13 -0800

On 07/02/2020 2:18 PM, Jerin Jacob wrote:

On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <[email protected]> wrote:


Hi Jerin, see below


Hi David,


On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <[email protected]>
wrote:


There is a risk in drafting an API meant for HW before any such HW exists,
because there could be inefficiencies in the metadata and fast-path API for
both models.
For example, in the case of a CPU-based scheme, it will be pure overhead to
emulate the "queue" (the enqueue and dequeue) for the sake of abstraction,
where the CPU works better in the synchronous model, and I have doubt that the
session-based scheme will work for HW or not as both difference  HW needs
to work hand in hand(IOMMU aspects for two PCI device)


[DC] I understand what you are saying about the overhead of emulating the "sw 
queue" but this same model is already used in many of the existing device PMDs.
In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for 
compression, the enqueue/dequeue in the PMD is emulated through an rte_ring 
which is very efficient.
The accelerator API will use the existing device PMDs so keeping the same model 
seems like a sensible approach.
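
To make the trade-off concrete, here is a minimal sketch of a SW device emulating enqueue/dequeue through a ring. It is not DPDK code: the `toy_*` names, the fixed ring size, and the XOR "transform" are all hypothetical stand-ins, and a real PMD would use rte_ring rather than this hand-rolled single-threaded ring. In this sketch the work is done at enqueue time; a real PMD may do it at dequeue time instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_RING_SIZE 8u /* hypothetical; must be a power of two */

/* Hypothetical operation descriptor for a SW "device". */
struct toy_op {
    uint8_t *data;
    size_t len;
    int done;
};

/* Single-producer/single-consumer ring of op pointers (not thread-safe). */
struct toy_ring {
    struct toy_op *slots[TOY_RING_SIZE];
    unsigned int head; /* next slot to write */
    unsigned int tail; /* next slot to read */
};

/* Stand-in transform: XOR every byte. The CPU does this synchronously. */
static void toy_process(struct toy_op *op)
{
    assert(op != NULL);
    for (size_t i = 0; i < op->len; i++)
        op->data[i] ^= 0xFF;
    op->done = 1;
}

/* Process each op on the calling core, then park it on the ring.
 * Returns the number of ops accepted (fewer than n if the ring fills). */
static unsigned int toy_enqueue_burst(struct toy_ring *r,
                                      struct toy_op **ops, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n; i++) {
        if (r->head - r->tail == TOY_RING_SIZE)
            break; /* ring full */
        toy_process(ops[i]);
        r->slots[r->head & (TOY_RING_SIZE - 1)] = ops[i];
        r->head++;
    }
    return i;
}

/* Hand completed ops back to the caller. Returns the number dequeued. */
static unsigned int toy_dequeue_burst(struct toy_ring *r,
                                      struct toy_op **ops, unsigned int n)
{
    unsigned int i;

    for (i = 0; i < n && r->tail != r->head; i++) {
        ops[i] = r->slots[r->tail & (TOY_RING_SIZE - 1)];
        r->tail++;
    }
    return i;
}
```

The ring adds only pointer stores and index updates on top of the transform itself, which is the cost being weighed against a direct synchronous call to `toy_process()`.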


In this release, we added CPU crypto support in cryptodev to support
the synchronous model and avoid that overhead.


From an application's point of view, this abstraction of the underlying device
type is important for usability and maintainability - the application doesn't
need to know the device type as such and therefore doesn't need to make
different API calls.

The enqueue/dequeue type API was also used with QAT in mind. While QAT HW 
doesn't support these xform chains at the moment, it could potentially do so in 
the future.
As a side note, as part of the work of adding the accelerator API, the QAT PMD
will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where
the Crypto is done on QAT HW and the CRC will be done in SW, most likely through
a call to the optimized rte_net_crc library. This will give a consistent API for
the DOCSIS-MAC data-plane pipeline prototype we have developed, which uses both
AESNI-MB and QAT for benchmarks.
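
For reference, the SW CRC step can be illustrated with a plain bit-at-a-time CRC-32, assuming the CRC in question is the 32-bit Ethernet CRC (reflected polynomial 0xEDB88320). This is only a sketch of the calculation; it is not the rte_net_crc implementation, which uses optimized code paths, and the name `toy_crc32_eth` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unoptimized Ethernet CRC-32: initial value 0xFFFFFFFF, reflected
 * polynomial 0xEDB88320, final XOR with 0xFFFFFFFF. */
static uint32_t toy_crc32_eth(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, else zero. */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}
```

With the standard check string "123456789" (9 bytes), this returns 0xCBF43926.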

We will take your feedback on the enqueue/dequeue approach for SW devices into 
consideration though during development.

Finally, I'm unsure what you mean by this line:

         "I have doubt that the session-based scheme will work for HW or not as both 
difference  HW needs to work hand in hand(IOMMU aspects for two PCI device)"

What do you mean by different HW working "hand in hand" and "two PCI device"?
The intention is that 1 HW device (or its PMD) would have to support the accel
xform chain.


I was thinking it would be N PCIe devices that create the chain: each
distinct PCI device performs a fixed function, and they are chained together.

The case we were looking at is more focused on a single discrete
(multi-function) device (from the perspective of the host) providing a number
of transforms (operations) in a single pass, rather than the case of N discrete
hardware devices (from the perspective of the host) chained together to achieve
the same set of transforms.

I do understand the usage of QAT HW and CRC in SW.
So if I understand it correctly, in rte_security we are combining
rte_ethdev and rte_cryptodev. With this spec, we are trying to combine
rte_cryptodev and rte_compressdev. So it looks good to me. My only
remaining concern is the name of this API: "accelerator" is too generic
a name. IMO, like rte_security, we may need to give a more meaningful name
to the use case where cryptodev and compressdev can work together.
