> -----Original Message-----
> From: Thomas Monjalon <tho...@monjalon.net>
> Sent: Wednesday, November 6, 2019 12:19 PM
> To: Ananyev, Konstantin <konstantin.anan...@intel.com>
> Cc: techbo...@dpdk.org; Honnappa Nagarahalli <honnappa.nagaraha...@arm.com>;
> dev@dpdk.org; Zhang, Roy Fan
> <roy.fan.zh...@intel.com>; Doherty, Declan <declan.dohe...@intel.com>;
> akhil.go...@nxp.com; nd <n...@arm.com>
> Subject: Re: [dpdk-techboard] [RFC 0/4] cpu-crypto API choices
>
> 06/11/2019 12:33, Ananyev, Konstantin:
> >
> > > > > > > > Originally both SW and HW crypto PMDs use the rte_crypto_op
> > > > > > > > based API to process the crypto workload asynchronously.
> > > > > > > > This provides uniformity across both PMD types, but also
> > > > > > > > introduces an unnecessary performance penalty for SW PMDs,
> > > > > > > > which have to "simulate" HW async behavior (crypto-ops
> > > > > > > > enqueue/dequeue, HW address computations,
> > > > > > > > storing/dereferencing user-provided data (mbuf) for each
> > > > > > > > crypto-op, etc).
> > > > > > >
> > > > > > > > The aim is to introduce a new optional API for SW
> > > > > > > > crypto-devices to perform crypto processing in a
> > > > > > > > synchronous manner.
> > > > > > > > As summarized by Akhil, we need a synchronous API to perform
> > > > > > > > crypto operations on raw data using SW PMDs, that provides:
> > > > > > > > - no crypto-ops.
> > > > > > > > - avoid using mbufs inside this API, use raw data buffers
> > > > > > > >   instead.
> > > > > > > > - no separate enqueue-dequeue, only a single process() API
> > > > > > > >   for the data path.
> > > > > > > > - input data buffers should be grouped by session,
> > > > > > > >   i.e. each process() call takes one session and a group of
> > > > > > > >   input buffers that belong to that session.
> > > > > > > > - all parameters that are constant across a session should
> > > > > > > >   be stored inside the session itself and reused by all
> > > > > > > >   incoming data buffers.
> > > > > > >
> > > > > > > > While there seems to be no controversy about the need for
> > > > > > > > such functionality, there seems to be no agreement on what
> > > > > > > > would be the best API for it.
> > > > > > > > So I am requesting TB input on that matter.
> > > > > > >
> > > > > > > > Series structure:
> > > > > > > > - patch #1 - introduce basic data structures to be used by
> > > > > > > >   the sync API (no controversy here, I hope ..)
> > > > > > > >   [RFC 1/4] cpu-crypto: Introduce basic data structures
> > > > > > > > - patch #2 - Intel initial approach for new API (via
> > > > > > > >   rte_security)
> > > > > > > >   [RFC 2/4] security: introduce cpu-crypto API
> > > > > > > > - patch #3 - approach that reuses existing rte_cryptodev API
> > > > > > > >   as much as possible
> > > > > > > >   [RFC 3/4] cryptodev: introduce cpu-crypto API
> > > > > > > > - patch #4 - approach via introducing new session data
> > > > > > > >   structure and API
> > > > > > > >   [RFC 4/4] cryptodev: introduce rte_crypto_cpu_sym_session API
> > > > > > >
> > > > > > > > Patches 2,3,4 are mutually exclusive,
> > > > > > > > and we probably have to choose which one to go forward with.
> > > > > > > > I put some explanations in each of the patches; hopefully
> > > > > > > > that will help to understand the pros and cons of each one.
> > > > > > >
> > > > > > > > Akhil strongly supports #3, AFAIK mainly because it allows
> > > > > > > > PMDs to reuse the existing API and minimizes API-level
> > > > > > > > changes.
> > > > > >
> > > > > > > IMO, from application perspective, it should not matter who (CPU or
> > > > > > > an accelerator) does the crypto functionality. It just needs to
> > > > > > > know if the result will be returned synchronously or
> > > > > > > asynchronously.
> > > > >
> > > > > We already have asymmetric and symmetric APIs.
> > > > > > Here you are proposing a third method: symmetric without mbuf for
> > > > > > CPU PMDs
> > > >
> > > > Sorry, for this garbage, I am mixing synchronous/asynchronous and
> > > > symmetric/asymmetric.
> > > >
> > > > > > > My favorite is #4, #2 is less preferable but ok too.
> > > > > > > > #3 seems problematic to me for the reasons I outlined in
> > > > > > > > the #4 patch description.
> > > > > > >
> > > > > > > Please provide your opinion.
> > > > >
> > > > > It means the API is not PMD agnostic, right?
> > >
> > > Probably not...
> > > Because inside DPDK we don't have any other abstraction for SW crypto-libs
> > > except vdev, we do need dev_id to get session initialization point.
> > > After that I believe all operations can be session based.
> > >
> > > > So the question is to know if a synchronous API will be implemented
> > > > only for CPU virtual PMDs?
> > >
> > > I don't expect lookaside devices to benefit from sync mode.
> > > I think performance penalty would be too high.
> >
> > After another thought, if some lookaside PMD would like to support such
> > API - I think it is still possible: dev_id (or just a pointer to the
> > internal dev/queue structure) can be stored inside the session itself.
> > Though I really doubt any lookaside PMD would be interested in such mode.
>
> So what should be the logic in the application?
> How is the PMD/API combo chosen?
Up to the user.
At session creation time the user has to choose which kind of session
to use.
Then in the data-path they can call either the async API (enqueue/dequeue)
or the sync API (process).
I expect users who care about extra perf will choose cpu-crypto mode
when it is available.
Existing apps, and apps that would like to have just one code-path,
would stay with async mode and will be unaffected.
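
To make that a bit more concrete, here is a rough sketch in plain C. All
names (sketch_*) are made up for illustration and are not taken from DPDK
or from any of the RFC patches, and a one-byte XOR stands in for the real
cipher. It only shows the shape: the mode is fixed at session creation,
constant parameters live in the session, and a single process() call takes
one session plus a group of raw buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these are DPDK symbols. */
enum sketch_sess_mode { SKETCH_SESS_ASYNC, SKETCH_SESS_SYNC };

/* Raw data buffer: no mbuf, no crypto-op. */
struct sketch_buf {
	uint8_t *data;
	size_t len;
};

/* Everything that is constant across the session is stored here. */
struct sketch_session {
	enum sketch_sess_mode mode; /* chosen once, at creation time */
	uint8_t key;                /* stand-in for real cipher state */
};

/* Session creation: the user picks sync or async mode here. */
static int
sketch_session_init(struct sketch_session *s, enum sketch_sess_mode mode,
		uint8_t key)
{
	if (s == NULL)
		return -1;
	s->mode = mode;
	s->key = key;
	return 0;
}

/*
 * Synchronous data path: one session, a group of buffers belonging to it.
 * Buffers are transformed in place; status[i] is 0 on success, -1 on error.
 * Returns the number of successfully processed buffers, or -1 if the
 * session was not created for sync mode.
 */
static int
sketch_process(const struct sketch_session *s, struct sketch_buf bufs[],
		int status[], unsigned int n)
{
	unsigned int i, ok = 0;
	size_t j;

	assert(bufs != NULL && status != NULL);
	if (s == NULL || s->mode != SKETCH_SESS_SYNC)
		return -1;

	for (i = 0; i < n; i++) {
		if (bufs[i].data == NULL) {
			status[i] = -1;
			continue;
		}
		for (j = 0; j < bufs[i].len; j++)
			bufs[i].data[j] ^= s->key; /* toy "cipher" */
		status[i] = 0;
		ok++;
	}
	return (int)ok;
}
```

An app that stays with async mode would never call sketch_process() at
all, so its enqueue/dequeue path is untouched.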
> How does it work with the crypto scheduler?
If we want to add cpu-crypto support to the crypto-scheduler PMD,
then changes would be needed anyway, no matter whether we choose #3 or #4.