On Thu, Feb 13, 2020 at 5:14 PM Doherty, Declan <declan.dohe...@intel.com> wrote:
>
> On 06/02/2020 5:13 PM, Jerin Jacob wrote:
> > On Thu, Feb 6, 2020 at 10:01 PM Coyle, David <david.co...@intel.com> wrote:
> >
> > Hi David,
> >
> >>>>>> - XGS-PON MAC: Crypto-CRC-BIP
> >>>>>> - Order:
> >>>>>> - Downstream: CRC, Encrypt, BIP
> >>>>>
> >>>>> I understand if the chain has two operations then it may possible to
> >>>>> have handcrafted SW code to do both operations in one pass.
> >>>>> I understand the spec is agnostic on a number of passes it does
> >>>>> require to enable the xfrom but To understand the SW/HW capability,
> >>>>> In the above case, "CRC, Encrypt, BIP", It is done in one pass in SW
> >>>>> or three passes in SW or one pass using HW?
> >>>>
> >>>> [DC] The CRC, Encrypt, BIP is also currently done as 1 pass in AESNI MB
> >>>> library SW.
> >>>> However, this could also be performed as a single pass in a HW
> >>>> accelerator
> >>>
> >>> As a specification, cascading the xform chains make sense.
> >>> Do we have any HW that does support chaining the xforms more than "two"
> >>> in one pass?
> >>> i.e real chaining function where two blocks of HWs work hand in hand for
> >>> chaining.
> >>> If none, it may be better to abstract as synonymous API(No dequeue, no
> >>> enqueue) for the CPU use case.
> >>
> >> [DC] I'm not aware of any HW that supports this at the moment, but that's
> >> not to say it couldn't in the future - if anyone else has any examples
> >> though, please feel free to share.
> >> Regardless, I don't see why we would introduce a different API for SW
> >> devices and HW devices.
> >
> > There is a risk in drafting API that meant for HW without any HW
> > exists. Because there could be inefficiency on the metadata and fast
> > path API for both models.
> > For example, In the case of CPU based scheme, it will be pure overhead
> > emulate the "queue"(the enqueue and dequeue) for the sake of
> > abstraction where
> > CPU works better in the synchronous model and I have doubt that the
> > session-based scheme will work for HW or not as both difference HW
> > needs to work hand in hand(IOMMU aspects for two PCI device)
>
> We do have some proto-types in hardware which can do operation chaining
> but in the case we have looked at, it is a single accelerator device
> with multi-function which means the orchestration (order, passing of
> data etc) of the chained operations is handled within the device itself,
> meaning that we didn't see issues with shared session data or handling
> moving data along discrete independent stage of a hardware pipeline
> wasn't an issue.
>
> Although if you wanted to offer this type of chained offload, I think we
> would need the driver to handle this for the user, rather than the
> application needing to understand how the hardware pipeline is interacting.

Yes. The application should not need to understand the specifics. The
only question is how to make this generic so that any HW/SW pipeline
can work.

Currently, we have rte_security, which works on ethdev and cryptodev.
This new spec is going to work on rte_cryptodev and rte_compressdev.
If we then need another pipeline that works with rte_cryptodev,
rte_compressdev and ethdev, we would need to invent yet another
library.

I agree with the need for the HW/SW pipeline. As Stephen suggested,
why not look for a general abstraction for HW/SW based pipelines?
Marvell had a similar problem in abstracting various HW/SW pipelines.
Here is a proposal for a generic HW/SW pipeline:

http://mails.dpdk.org/archives/dev/2020-January/156765.html

If the focus is only on a specific case, say "CRC + something else",
it is better to have an API for just that, and better not to call it
the accelerator for the packet processing pipeline, as that has a big
scope.

Just my 2c.

>
> >
> > Having said that, I agree with the need for use case and API for CPU
> > case. Till we find a HW spec, we need to make the solution as CPU
> > specific and latter extend based on HW metadata required.
> > Accelerator API sounds like HW accelerator and there is no HW support
> > then it may not good. We can change the API that works for the use
> > cases that we know how it works efficiently.
> >
> >> It would be up to each underlying PMD to decide if/how it supports a
> >> particular accelerator xform chain, but from an application's point of
> >> view, the accelerator API is always the same
> >>
>
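
To make the synchronous vs. enqueue/dequeue point from the thread concrete,
here is a minimal C sketch. It is not DPDK code: every name in it
(chain_process_sync, chain_enqueue, chain_dequeue, struct chain_op, ...) is
hypothetical, the XOR "cipher" is only a placeholder for a real one, and the
8-bit parity is a toy stand-in for BIP. It only shows that a CPU
implementation can do CRC, encrypt, BIP in one pass in a plain function
call, and what bookkeeping a queue emulation adds around that same call.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result of one chained operation (illustrative only). */
struct chain_result {
    uint32_t crc; /* CRC-32 over the plaintext */
    uint8_t  bip; /* toy 8-bit parity over the ciphertext */
};

/* Synchronous model: CRC, "encrypt", BIP in a single pass, in place.
 * The XOR with one key byte is a placeholder, not a real cipher. */
static struct chain_result
chain_process_sync(uint8_t *buf, size_t len, uint8_t key)
{
    uint32_t crc = 0xFFFFFFFFu;
    uint8_t bip = 0;

    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];                 /* CRC sees the plaintext byte */
        for (int b = 0; b < 8; b++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        buf[i] ^= key;                 /* placeholder encryption */
        bip ^= buf[i];                 /* parity sees the ciphertext byte */
    }
    return (struct chain_result){ .crc = ~crc, .bip = bip };
}

/* Queue-emulated model: the same work behind enqueue/dequeue. */
#define RING_SIZE 4u /* must be a power of two */

struct chain_op {
    uint8_t *buf;
    size_t len;
    uint8_t key;
    struct chain_result res; /* filled in on dequeue */
};

struct chain_ring {
    struct chain_op *slots[RING_SIZE];
    unsigned int head; /* next slot to dequeue */
    unsigned int tail; /* next slot to enqueue */
};

/* Returns 1 if the op was queued, 0 if the ring is full. */
static int chain_enqueue(struct chain_ring *r, struct chain_op *op)
{
    if (r->tail - r->head == RING_SIZE)
        return 0;
    r->slots[r->tail++ % RING_SIZE] = op;
    return 1;
}

/* Processes and returns the oldest queued op, or NULL if the ring is empty. */
static struct chain_op *chain_dequeue(struct chain_ring *r)
{
    if (r->head == r->tail)
        return NULL;
    struct chain_op *op = r->slots[r->head++ % RING_SIZE];
    op->res = chain_process_sync(op->buf, op->len, op->key);
    return op;
}
```

In this sketch both paths produce identical results; the ring, the op
descriptor and the full/empty checks are pure bookkeeping on a CPU. Whether
that cost matters in practice would depend on the real PMD and workload.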