On 07/02/2020 2:18 PM, Jerin Jacob wrote:
On Fri, Feb 7, 2020 at 6:08 PM Coyle, David <david.co...@intel.com> wrote:
Hi Jerin, see below
Hi David,
On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.co...@intel.com>
wrote:
There is a risk in drafting an API that is meant for HW when no such HW
exists yet, because the metadata and fast path API could end up inefficient
for both models.
For example, in the case of a CPU-based scheme, it will be pure overhead to
emulate the "queue" (the enqueue and dequeue) for the sake of abstraction
when the CPU works better in the synchronous model and I have doubt that the
session-based scheme will work for HW or not as both difference HW needs
to work hand in hand(IOMMU aspects for two PCI device)
[DC] I understand what you are saying about the overhead of emulating the "sw
queue" but this same model is already used in many of the existing device PMDs.
In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for
compression, the enqueue/dequeue in the PMD is emulated through an rte_ring
which is very efficient.
The accelerator API will use the existing device PMDs so keeping the same model
seems like a sensible approach.
In this release, we added CPU crypto support in cryptodev to support
the synchronous model to fix the overhead.
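To make the two models being compared concrete, here is a minimal sketch in plain C. It is not DPDK code: every sketch_* name is hypothetical, the ring is a simplified single-producer/single-consumer stand-in for rte_ring, and the "transform" is a placeholder byte flip. It also assumes the SW device does the work at enqueue time and uses the ring only to hold completed ops, which is one possible design rather than a statement about any particular PMD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_RING_SIZE 8u /* must be a power of two */

/* Hypothetical op: one buffer that is processed in place. */
struct sketch_op {
    uint8_t *data;
    size_t len;
    int done;
};

/* Hypothetical single-producer/single-consumer ring of op pointers. */
struct sketch_ring {
    struct sketch_op *slots[SKETCH_RING_SIZE];
    unsigned int head; /* next slot to write */
    unsigned int tail; /* next slot to read */
};

/* Placeholder for the real transform (cipher, CRC, compression, ...). */
static void sketch_process(struct sketch_op *op)
{
    for (size_t i = 0; i < op->len; i++)
        op->data[i] ^= 0xFF;
    op->done = 1;
}

/* Synchronous model: results are ready when the call returns. */
static unsigned int sketch_process_sync(struct sketch_op **ops, unsigned int n)
{
    assert(ops != NULL);
    for (unsigned int i = 0; i < n; i++)
        sketch_process(ops[i]);
    return n;
}

/* Queue model, enqueue side: process inline, then park the op on the ring. */
static unsigned int sketch_enqueue_burst(struct sketch_ring *r,
                                         struct sketch_op **ops, unsigned int n)
{
    unsigned int i;

    assert(r != NULL && ops != NULL);
    for (i = 0; i < n; i++) {
        if (r->head - r->tail == SKETCH_RING_SIZE)
            break; /* ring full: caller must retry the remaining ops */
        sketch_process(ops[i]);
        r->slots[r->head & (SKETCH_RING_SIZE - 1)] = ops[i];
        r->head++;
    }
    return i;
}

/* Queue model, dequeue side: hand completed ops back to the caller. */
static unsigned int sketch_dequeue_burst(struct sketch_ring *r,
                                         struct sketch_op **ops, unsigned int n)
{
    unsigned int i;

    assert(r != NULL && ops != NULL);
    for (i = 0; i < n && r->tail != r->head; i++) {
        ops[i] = r->slots[r->tail & (SKETCH_RING_SIZE - 1)];
        r->tail++;
    }
    return i;
}
```

In this sketch the queue path costs one ring write and one ring read per op on top of what the synchronous path does, plus a second call from the application. That is the overhead under discussion; whether it matters in practice depends on the burst size and on the cost of the transform itself.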
From an application's point of view, this abstraction of the underlying device
type is important for usability and maintainability - the application doesn't
need to know the device type as such and therefore doesn't need to make
different API calls.
The enqueue/dequeue type API was also used with QAT in mind. While QAT HW
doesn't support these xform chains at the moment, it could potentially do so in
the future.
As a side note, as part of the work of adding the accelerator API, the QAT PMD
will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where
the Crypto is done on QAT HW and the CRC will be done in SW, most likely through
a call to the optimized rte_net_crc library. This will give a consistent API for
the DOCSIS-MAC data-plane pipeline prototype we have developed, which uses both
AESNI-MB and QAT for benchmarks.
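For readers unfamiliar with the idea of an xform chain, the following is a rough, self-contained C sketch of a CRC-then-cipher chain applied to one buffer in a single call. It is an illustration only, not the proposed API: all sketch_* names are hypothetical, the CRC is a plain bitwise CRC-32 (the Ethernet FCS polynomial) rather than a call into rte_net_crc, the "cipher" is a placeholder XOR and not the DOCSIS cipher, and the CRC-before-cipher ordering and the little-endian CRC placement are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 with reflected polynomial 0xEDB88320 (Ethernet FCS). */
static uint32_t sketch_crc32(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Hypothetical xform types for this sketch. */
enum sketch_xform_type {
    SKETCH_XFORM_CRC_GENERATE,
    SKETCH_XFORM_CIPHER
};

/* Hypothetical chain element: xforms are applied in list order. */
struct sketch_xform {
    enum sketch_xform_type type;
    uint8_t key; /* used by the placeholder cipher only */
    const struct sketch_xform *next;
};

/*
 * Apply every xform in the chain to buf in one call.
 * Returns the new data length, or 0 if the buffer has no room for the CRC.
 */
static size_t sketch_chain_process(const struct sketch_xform *chain,
                                   uint8_t *buf, size_t len, size_t cap)
{
    assert(buf != NULL);
    for (const struct sketch_xform *x = chain; x != NULL; x = x->next) {
        if (x->type == SKETCH_XFORM_CRC_GENERATE) {
            if (len > cap || cap - len < 4)
                return 0;
            uint32_t crc = sketch_crc32(buf, len);
            /* Append the CRC least-significant byte first (assumption). */
            for (int i = 0; i < 4; i++)
                buf[len + i] = (uint8_t)(crc >> (8 * i));
            len += 4;
        } else {
            /* Placeholder cipher: XOR with a fixed byte, NOT real crypto. */
            for (size_t i = 0; i < len; i++)
                buf[i] ^= x->key;
        }
    }
    return len;
}
```

From the caller's side the chain is a single operation on the buffer, regardless of whether each step runs on HW, in SW, or a mix of the two, which is the consistency described above.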
We will take your feedback on the enqueue/dequeue approach for SW devices into
consideration though during development.
Finally, I'm unsure what you mean by this line:
"I have doubt that the session-based scheme will work for HW or not as both
difference HW needs to work hand in hand(IOMMU aspects for two PCI device)"
What do you mean by different HW working "hand in hand" and "two PCI device"?
The intention is that 1 HW device (or its PMD) would have to support the accel
xform chain.
I was thinking it would be N PCIe devices that create the chain: each
distinct PCI device performs one fixed function, and they are chained together.
The case we were looking at is more focused on a single discrete
(multi-function) device (from the perspective of the host) providing a
number of transforms (operations) in a single pass rather than the case
of N discrete hardware devices (from the perspective of the host)
chained together to achieve the same set of transforms.
I do understand the usage of QAT HW and CRC in SW.
So if I understand it correctly, in rte_security we are combining
rte_ethdev and rte_cryptodev. With this spec, we are trying to
combine rte_cryptodev and rte_compressdev. So it looks good to me. My only
remaining concern is the name of this API: "accelerator" is too generic
a name. IMO, like rte_security, we may need to give a more meaningful name
for the use case where cryptodev and compressdev can work together.