Hi lads, > > > > You are right, the new API will process the crypto workload, no heavy > > enqueue > > Dequeue operations required. > > > > Cryptodev tends to support multiple crypto devices, including HW and SW. > > The 3-cache line access, iova address computation and assignment, simulation > > of async enqueue/dequeue operations, allocate and free crypto ops, even the > > mbuf linked-list for scatter-gather buffers are too heavy for SW crypto > > PMDs. > > Why cant we have a cryptodev synchronous API which work on plain bufs as your > suggested > API and use the same crypto sym_session creation logic as it was before? It > will perform > same as it is doing in this series.
I tried to summarize our reasons in another mail in that thread. > > > > > To create this new synchronous API in cryptodev cannot avoid the problem > > listed above: first the API shall not serve only to part of the crypto > > (SW) PMDs - > > as you know, it is Cryptodev. The users can expect some PMD only support > > part > > of the overall algorithms, but not the workload processing API. > > Why cant we have an optional data path in cryptodev for synchronous behavior > if the > underlying PMD support it. It depends on the PMD to decide whether it can > have it supported or not. > Only a feature flag will be needed to decide that. > One more option could be a PMD API which the application can directly call if > the > mode is only supported in very few PMDs. This could be a backup if there is a > requirement of deprecation notice etc. > > > > > Another reason is, there is assumption made, first when creating a crypto op > > we have to allocate the memory to hold crypto op + sym op + iv, - we cannot > > simply declare an array of crypto ops in the run-time and discard it when > > processing > > is done. Also we need to fill aad and digest HW address, which is not > > required for > > SW at all. > > We are defining a new API which may have its own parameters and requirements > which > Need to be fulfilled. In case it was a rte_security API, then also you are > defining a new way > Of packet execution and API params. So it would be same. > You can reduce the cache line accesses as you need in the new API. > The session logic need not be changed from crypto session to security session. > Only the data patch need to be altered as per the new API. > > > > > Bottom line: using crypto op will still have 3 cache-line access performance > > problem. > > > > So if we to create the new API in Cryptodev instead of rte_security, we > > need to > > create new crypto op structure only for the SW PMDs, carefully document them > > to not confuse with existing cryptodev APIs, make new device feature flags > > to > > indicate the API is not supported by some PMDs, and again carefully document > > them of these device feature flags. > > The explanation of the new API will also happen in case it is a security API. > Instead you need > to add more explanation for session also which is already there in cryptodev. > > > > > So, to push these changes to rte_security instead the above problem can be > > resolved, > > and the performance improvement because of this change is big for smaller > > packets > > - I attached a performance test app in the patchset. > > I believe there wont be any perf gap in case the optimized new cryptodev API > is used. > > > > > For rte_security, we already have inline-crypto type that works quite close > > to the > > this > > new API, the only difference is that it is processed by the CPU cycles. As > > you may > > have already seen the ipsec-library has wrapped these changes, and > > ipsec-secgw > > has only minimum updates to adopt this change too. So to the end user, if > > they > > use IPSec this patchset can seamlessly enabled with just commandline update > > when > > creating an SA. > > In the IPSec application I do not see the changes wrt the new execution API. > So the data path is not getting handled there. It looks incomplete. The user > experience > to use the new API will definitely be changed. I believe we do support it for libtre_ipsec mode. librte_ipsec hides all processing complexity inside and does call rte_security_process_cpu_crypto_bulk() internally. That's why for librte_ipsec it is literally 2 lines change: --- a/examples/ipsec-secgw/ipsec_process.c +++ b/examples/ipsec-secgw/ipsec_process.c @@ -101,7 +101,8 @@ fill_ipsec_session(struct rte_ipsec_session *ss, struct ipsec_ctx *ctx, } ss->crypto.ses = sa->crypto_session; /* setup session action type */ - } else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL) { + } else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL || + sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) { if (sa->sec_session == NULL) { rc = create_lookaside_session(ctx, sa); if (rc != 0) @@ -227,8 +228,8 @@ ipsec_process(struct ipsec_ctx *ctx, struct ipsec_traffic *trf) /* process packets inline */ else if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO || - sa->type == - RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) { + sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL || + sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) { satp = rte_ipsec_sa_type(ips->sa);