Hi Fan,

> Hi Akhil,
> 
> > > +
> > > +/**
> > > + * Asynchronous operation job descriptor.
> > > + * Used by HW crypto devices direct API call that supports such activity
> > > + **/
> > > +struct rte_crypto_sym_job {
> > > + union {
> > > +         /**
> > > +          * When RTE_CRYPTO_HW_ENQ_FLAG_IS_SGL bit is set in
> > flags,
> > > sgl
> > > +          * field is used as input data. Otherwise data_iova is
> > > +          * used.
> > > +          **/
> > > +         rte_iova_t data_iova;
> > > +         struct rte_crypto_sgl *sgl;
> > > + };
> > > + union {
> > > +         /**
> > > +          * Different than cryptodev ops, all ofs and len fields have
> > > +          * the unit of bytes (including Snow3G/Kasumi/Zuc.
> > > +          **/
> > > +         struct {
> > > +                 uint32_t cipher_ofs;
> > > +                 uint32_t cipher_len;
> > > +         } cipher_only;
> > > +         struct {
> > > +                 uint32_t auth_ofs;
> > > +                 uint32_t auth_len;
> > > +                 rte_iova_t digest_iova;
> > > +         } auth_only;
> > > +         struct {
> > > +                 uint32_t aead_ofs;
> > > +                 uint32_t aead_len;
> > > +                 rte_iova_t tag_iova;
> > > +                 uint8_t *aad;
> > > +                 rte_iova_t aad_iova;
> > > +         } aead;
> > > +         struct {
> > > +                 uint32_t cipher_ofs;
> > > +                 uint32_t cipher_len;
> > > +                 uint32_t auth_ofs;
> > > +                 uint32_t auth_len;
> > > +                 rte_iova_t digest_iova;
> > > +         } chain;
> > > + };
> > > + uint8_t *iv;
> > > + rte_iova_t iv_iova;
> > > +};
> >
> > NACK,
> > Why do you need this structure definitions again when you have similar ones
> > (the ones used in CPU crypto) available for the same purpose.
> > In case of CPU crypto, there were 2 main requirements
> > - synchronous API instead of enq +deq
> > - raw buffers.
> >
> 
> As you may have seen the structure definition is hW centric with the
> IOVA addresses all over. Also as you will from the patch series the operation 
> is
> Per operation basis instead of operating in a burst. The external application
> may sooner know when a specific enqueue is failed.

You may also need to save a virtual address as well. As some hardware are able 
to
Convert virtual to physical addresses on it's own giving a performance 
improvement.

I do not see an issue in using enqueue burst with burst size=1 , but since you 
are doing
Optimizations, none of the hardware can perform well with burst = 1, I think it 
is always
Greater than 1.
> 
> > Now for this patchset, the requirement is
> > - raw buffers
> > - asynchronous APIs
> >
> > The data structure for raw buffers and crypto related offsets are already
> > defined
> > So they should be reused.
> > And I believe with some changes in rte_crypto_op  and rte_crypto_sym_op,
> > We can support raw buffers with the same APIs.
> > Instead of m_src and m_dst, raw buffer data structures can be combined in a
> > Union and some of the fields in the rte_crypto_op can be left NULL in case 
> > of
> > raw buffers.
> >
> 
> This is a good point but we still face too many unnecessary fields to be NULL,
> such as
> digest pointers, I have given a lot thought to this structure. Hopefully it 
> covers
> all vendor's HW symmetric crypto needs and in the same time it well squeeze
> the required HW addresses into 1 cacheline, instead of rte_crypto_op +
> rte_crypto_sym_op 3 cacheline footprint. Another purpose of the structure
> design
> is the structure buffer can be taken from stack and can be used to fill all
> jobs to the PMD HW.

Which fields you think are not useful and should be set as NULL?
Digest pointers you are anyways setting in the new structure.
Your new struct does not support session less as well as security sessions.
It does not take care of asymmetric crypto.
So whenever, a vendor need to support all these, we would end up getting
the rte_crypto_op structure.
IMO, you only need to make m_src and m_dst as union to a raw input/output
buffers. Everything else will be relevant.

Have you done some profiling with using rte_crypto_op instead of this new 
struct?

> 
> >
> > > +/* Struct for user to perform HW specific enqueue/dequeue function calls
> > */
> > > +struct rte_crypto_hw_ops {
> > > + /* Driver written queue pair data pointer, should NOT be alterred by
> > > +  * the user.
> > > +  */
> > > + void *qp;
> > > + /* Function handler to enqueue AEAD job */
> > > + rte_crypto_hw_enq_cb_fn enqueue_aead;
> > > + /* Function handler to enqueue cipher only job */
> > > + rte_crypto_hw_enq_cb_fn enqueue_cipher;
> > > + /* Function handler to enqueue auth only job */
> > > + rte_crypto_hw_enq_cb_fn enqueue_auth;
> > > + /* Function handler to enqueue cipher + hash chaining job */
> > > + rte_crypto_hw_enq_cb_fn enqueue_chain;
> > > + /* Function handler to query processed jobs */
> > > + rte_crypto_hw_query_processed query_processed;
> > > + /* Function handler to dequeue one job and return opaque data
> > stored
> > > */
> > > + rte_crypto_hw_deq_one_cb_fn dequeue_one;
> > > + /* Function handler to dequeue many jobs */
> > > + rte_crypto_hw_deq_many_cb_fn dequeue_many;
> > > + /* Reserved */
> > > + void *reserved[8];
> > > +};
> >
> > Why do we need such callbacks in the library?
> > These should be inside the drivers, or else we do the same for
> > Legacy case as well. The pain of finding the correct enq function for
> > Appropriate crypto operation is already handled by all the drivers
> > And we can reuse that or else we modify it there as well.
> >
> 
> Providing different types of enqueue functions for specific operation type
> could save a lot of branches for the driver to handle. As mentioned this
> data-path API is intended to be used as an advanced feature to provide
> close-to-native perf to external library/applications that are not mbuf
> centric. And I don't agree classifying choosing 1 enqueue function from
> 4 candidates as "pain".

My point is why don't we have it in the Legacy code path as well?
I think it is useful in both the paths. Branching is a pain for the driver.

> 
> > We should not add a lot of data paths for the user, otherwise the
> > APIs will become centric to a particular vendor and it will be very 
> > difficult
> > For the user to migrate from one vendor to another and would defeat the
> > Purpose of DPDK which provide uniform abstraction layer for all the
> > hardware
> > Vendors.
> >
> 
> The purpose of adding data-path for the user is performance for non-mbuf
> data-path centric applications/libraries, in the meantime not creating
> confusion. In this version we aim to provide a more friendly data-path for

I do not see the new path as friendly.
Adding a parallel new datapath with create more confusion for the application
developer. It would be convenient, if we can use the same path with minimal
changes so that people can migrate easily.

> them, and aims to be useful to all vendor's PMDs. If there is any place in
> the API that blocks a PMD please let me know.

As commented above, sessionless, rte_security sessions, asymmetric crypto
Not supported.


> 
> > Adding other vendors to comment.
> >
> > Regards,
> > Akhil

Reply via email to