On 06/02/2020 5:13 PM, Jerin Jacob wrote:
On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.co...@intel.com> wrote:
Hi David,
- XGS-PON MAC: Crypto-CRC-BIP
- Order:
- Downstream: CRC, Encrypt, BIP
I understand that if the chain has two operations, it may be possible to
have handcrafted SW code do both operations in one pass.
I understand the spec is agnostic about the number of passes required to
perform the xform, but to understand the SW/HW capability: in the above
case, "CRC, Encrypt, BIP", is it done in one pass in SW, three passes in
SW, or one pass using HW?
[DC] The CRC, Encrypt, BIP is also currently done as 1 pass in AESNI MB
library SW.
However, this could also be performed as a single pass in a HW
accelerator.
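
To make the one-pass versus three-pass question concrete, here is a rough C sketch of the downstream order (CRC, Encrypt, BIP) done both ways. It is only an illustration under loud assumptions: the keystream is a toy placeholder and not the real cipher, the CRC is a generic bitwise CRC-32, the BIP is modelled as an XOR of successive 4-byte words, and the CRC is returned rather than appended to the frame. None of the names come from AESNI MB or DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Results of the illustrative chain. */
struct pon_result {
    uint32_t crc; /* CRC-32 over the plaintext */
    uint32_t bip; /* XOR of 4-byte words over the ciphertext */
};

/* Generic bitwise CRC-32 step (reflected polynomial 0xEDB88320). */
static uint32_t crc32_step(uint32_t crc, uint8_t byte)
{
    crc ^= byte;
    for (int i = 0; i < 8; i++)
        crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    return crc;
}

/* ASSUMPTION: toy keystream standing in for the real cipher. Not secure. */
static uint8_t toy_keystream(uint32_t key, size_t i)
{
    uint32_t x = key ^ ((uint32_t)i * 2654435761u);
    x ^= x >> 16;
    return (uint8_t)x;
}

/* Three passes: each stage walks the whole buffer separately. */
static struct pon_result pon_three_pass(const uint8_t *in, uint8_t *out,
                                        size_t len, uint32_t key)
{
    struct pon_result r = { 0xFFFFFFFFu, 0 };
    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++)            /* pass 1: CRC */
        r.crc = crc32_step(r.crc, in[i]);
    r.crc = ~r.crc;

    for (size_t i = 0; i < len; i++)            /* pass 2: encrypt */
        out[i] = in[i] ^ toy_keystream(key, i);

    for (size_t i = 0; i < len; i++)            /* pass 3: BIP */
        r.bip ^= (uint32_t)out[i] << (8 * (i % 4));
    return r;
}

/* One pass: all three stages are applied while each byte is "hot". */
static struct pon_result pon_one_pass(const uint8_t *in, uint8_t *out,
                                      size_t len, uint32_t key)
{
    struct pon_result r = { 0xFFFFFFFFu, 0 };
    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        uint8_t c;
        r.crc = crc32_step(r.crc, in[i]);           /* CRC on plaintext */
        c = in[i] ^ toy_keystream(key, i);          /* encrypt */
        out[i] = c;
        r.bip ^= (uint32_t)c << (8 * (i % 4));      /* BIP on ciphertext */
    }
    r.crc = ~r.crc;
    return r;
}
```

Both functions produce the same outputs; the only difference is how many times the data is traversed, which is the property the question above is asking about.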
As a specification, cascading the xform chains makes sense.
Do we have any HW that supports chaining more than "two" xforms
in one pass?
i.e. a real chaining function, where two HW blocks work hand in hand for
chaining.
If none, it may be better to abstract it as a synchronous API (no dequeue, no
enqueue) for the CPU use case.
[DC] I'm not aware of any HW that supports this at the moment, but that's not
to say it couldn't in the future - if anyone else has any examples though,
please feel free to share.
Regardless, I don't see why we would introduce a different API for SW devices
and HW devices.
There is a risk in drafting an API that is meant for HW when no such HW
exists, because there could be inefficiencies in the metadata and fast
path API for both models.
For example, in the case of a CPU-based scheme, it is pure overhead to
emulate the "queue" (the enqueue and dequeue) for the sake of
abstraction, since the
CPU works better in the synchronous model. I also doubt whether the
session-based scheme will work for HW, as two different HW blocks
need to work hand in hand (IOMMU aspects for two PCI devices).
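
As a rough illustration of the overhead argument above, the sketch below contrasts a synchronous call with an emulated enqueue/dequeue pair on a CPU-only device. Everything here is hypothetical (struct op, process_sync, sw_enqueue, sw_dequeue, the ring size); it is not the proposed API or any existing DPDK API, only a minimal model of the two styles.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operation descriptor. */
struct op {
    const uint8_t *data;
    size_t len;
    uint32_t result;
    int done;
};

/* Stand-in for the real work: XOR of 4-byte words over the buffer. */
static void do_work(struct op *op)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < op->len; i++)
        acc ^= (uint32_t)op->data[i] << (8 * (i % 4));
    op->result = acc;
    op->done = 1;
}

/* Synchronous model: one call, result is ready on return. */
static int process_sync(struct op *op)
{
    do_work(op);
    return 0;
}

/* Queue model: a CPU device has to keep a ring just to hand ops back. */
#define RING_SIZE 8u /* power of two so the index can be masked */

struct sw_queue {
    struct op *slots[RING_SIZE];
    unsigned head; /* next slot to write */
    unsigned tail; /* next slot to read */
};

static unsigned sw_enqueue(struct sw_queue *q, struct op **ops, unsigned n)
{
    unsigned i;
    for (i = 0; i < n && q->head - q->tail < RING_SIZE; i++) {
        do_work(ops[i]);                       /* the work happens here anyway */
        q->slots[q->head++ & (RING_SIZE - 1u)] = ops[i];
    }
    return i;
}

static unsigned sw_dequeue(struct sw_queue *q, struct op **ops, unsigned n)
{
    unsigned i;
    for (i = 0; i < n && q->tail != q->head; i++)
        ops[i] = q->slots[q->tail++ & (RING_SIZE - 1u)];
    return i;
}
```

In this model the queue adds ring storage, index bookkeeping and a second call per burst without changing the result, which is the "pure overhead" concern for the CPU case.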
We do have some prototypes in hardware which can do operation chaining,
but in the cases we have looked at, it is a single multi-function
accelerator device, which means the orchestration (order, passing of
data etc.) of the chained operations is handled within the device itself.
As a result, we didn't see issues with shared session data or with
moving data along discrete independent stages of a hardware pipeline.
Although if you wanted to offer this type of chained offload, I think we
would need the driver to handle this for the user, rather than the
application needing to understand how the hardware pipeline is interacting.
Having said that, I agree with the need for the use case and an API for the
CPU case. Until we find a HW spec, we need to make the solution CPU
specific and later extend it based on the HW metadata required.
"Accelerator API" sounds like a HW accelerator, and if there is no HW support
then it may not be good. We can change the API to one that works for the use
cases that we know how to handle efficiently.
It would be up to each underlying PMD to decide if/how it supports a particular
accelerator xform chain, but from an application's point of view, the
accelerator API is always the same.
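
One way to picture "each PMD decides, the application sees one API" is a per-driver callback behind a common entry point. The sketch below is only an assumption about how that could look; accel_chain_supported, struct dev_ops and the two example drivers are hypothetical names, not part of the proposed API.

```c
#include <stddef.h>

/* Hypothetical xform kinds for the XGS-PON style chain. */
enum xform_type { XFORM_CRC, XFORM_CIPHER, XFORM_BIP };

/* Hypothetical per-driver ops table. */
struct dev_ops {
    int (*chain_supported)(const enum xform_type *chain, size_t n);
};

struct device {
    const char *name;
    const struct dev_ops *ops;
};

/* Application-facing call: identical for every device. */
static int accel_chain_supported(const struct device *dev,
                                 const enum xform_type *chain, size_t n)
{
    if (dev == NULL || dev->ops == NULL || dev->ops->chain_supported == NULL)
        return 0;
    return dev->ops->chain_supported(chain, n);
}

/* Example SW driver: accepts exactly CRC -> cipher -> BIP. */
static int sw_chain_supported(const enum xform_type *chain, size_t n)
{
    return n == 3 && chain[0] == XFORM_CRC &&
           chain[1] == XFORM_CIPHER && chain[2] == XFORM_BIP;
}

/* Example driver limited to a two-element chain: CRC -> cipher. */
static int two_stage_chain_supported(const enum xform_type *chain, size_t n)
{
    return n == 2 && chain[0] == XFORM_CRC && chain[1] == XFORM_CIPHER;
}

static const struct dev_ops sw_ops = { sw_chain_supported };
static const struct dev_ops two_stage_ops = { two_stage_chain_supported };
```

The application only ever calls accel_chain_supported(); whether a given chain runs in one pass, several passes, or not at all stays a driver decision.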