Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API

Ananyev, Konstantin Tue, 22 Oct 2019 15:21:54 -0700

> > > > Added my comments inline with your draft.
> > > > [snip]..
> > > >
> > > > >
> > > > > Ok, then my suggestion:
> > > > > Let's at least write down all points about crypto-dev approach where 
> > > > > we
> > > > > disagree and then probably try to resolve them one by one....
> > > > > If we fail to make an agreement/progress in next week or so,
> > > > > (and no more reviews from the community)
> > > > > will have bring that subject to TB meeting to decide.
> > > > > Sounds fair to you?
> > > > Agreed
> > > > >
> > > > > List is below.
> > > > > Please add/correct me, if I missed something.
> > > > >
> > > > > Konstantin
> > > >
> > > > Before going into comparison, we should define the requirement as well.
> > >
> > > Good point.
> > >
> > > > What I understood from the patchset,
> > > > "You need a synchronous API to perform crypto operations on raw data 
> > > > using
> > > SW PMDs"
> > > > So,
> > > > - no crypto-ops,
> > > > - no separate enq-deq, only single process API for data path
> > > > - Do not need any value addition to the session parameters.
> > > >   (You would need some parameters from the crypto-op which
> > > >    Are constant per session and since you wont use crypto-op,
> > > >    You need some place to store that)
> > >
> > > Yes, this is correct, I think.
> > >
> > > >
> > > > Now as per your mail, the comparison
> > > > 1. extra input parameters to create/init rte_(cpu)_sym_session.
> > > >
> > > > Will leverage existing 6B gap inside rte_crypto_*_xform between 'algo' 
> > > > and
> > > 'key' fields.
> > > > New fields will be optional and would be used by PMD only when 
> > > > cpu-crypto
> > > session is requested.
> > > > For lksd-crypto session PMD is free to ignore these fields.
> > > > No ABI breakage is required.
> > > >
> > > > [Akhil] Agreed, no issues.
> > > >
> > > > 2. cpu-crypto create/init.
> > > >     a) Our suggestion - introduce new API for that:
> > > >         - rte_crypto_cpu_sym_init() that would init completely opaque
> > > rte_crypto_cpu_sym_session.
> > > >         - struct rte_crypto_cpu_sym_session_ops {(*process)(...); 
> > > > (*clear);
> > > /*whatever else we'll need *'};
> > > >         - rte_crypto_cpu_sym_get_ops(const struct rte_crypto_sym_xform
> > > *xforms)
> > > >           that would return const struct rte_crypto_cpu_sym_session_ops 
> > > > *based
> > > on input xforms.
> > > >         Advantages:
> > > >         1)  totally opaque data structure (no ABI breakages in future), 
> > > > PMD
> > > writer is totally free
> > > >              with it format and contents.
> > > >
> > > > [Akhil] It will have breakage at some point till we don't hit the union 
> > > > size.
> > >
> > > Not sure, what union you are talking about?
> >
> > Union of xforms in rte_security_session_conf
> 
> Hmm, how does it relates here?
> I thought we discussing pure rte_cryptodev_sym_session, no?
> 
> >
> > >
> > > > Rather I don't suspect there will be more parameters added.
> > > > Or do we really care about the ABI breakage when the argument is about
> > > > the correct place to add a piece of code or do we really agree to add 
> > > > code
> > > > anywhere just to avoid that breakage.
> > >
> > > I am talking about maintaining it in future.
> > > if your struct is not seen externally, no chances to introduce ABI 
> > > breakage.
> > >
> > > >
> > > >         2) each session entity is self-contained, user doesn't need to 
> > > > bring along
> > > dev_id etc.
> > > >             dev_id is needed  only at init stage, after that user will 
> > > > use session ops
> > > to perform
> > > >             all operations on that session (process(), clear(), etc.).
> > > >
> > > > [Akhil] There is nothing called as session ops in current DPDK.
> > >
> > > True, but it doesn't mean we can't/shouldn't have it.
> >
> > We can have it if it is not adding complexity for the user. Creating 2 
> > different code
> > Paths for user is not desirable for the stack developers.
> >
> > >
> > > > What you are proposing
> > > > is a new concept which doesn't have any extra benefit, rather it is 
> > > > adding
> > > complexity
> > > > to have two different code paths for session create.
> > > >
> > > >
> > > >         3) User can decide does he wants to store ops[] pointer on a 
> > > > per session
> > > basis,
> > > >             or on a per group of same sessions, or...
> > > >
> > > > [Akhil] Will the user really care which process API should be called 
> > > > from the
> > > PMD.
> > > > Rather it should be driver's responsibility to store that in the 
> > > > session private
> > > data
> > > > which would be opaque to the user. As per my suggestion same process
> > > function can
> > > > be added to multiple sessions or a single session can be managed inside 
> > > > the
> > > PMD.
> > >
> > > In that case we either need to have a function per session (stored 
> > > internally),
> > > or make decision (branches) at run-time.
> > > But as I said in other mail - I am ok to add small shim structure here:
> > > either rte_crypto_cpu_sym_session { void *ses; struct
> > > rte_crypto_cpu_sym_session_ops ops; }
> > > or rte_crypto_cpu_sym_session { void *ses; struct
> > > rte_crypto_cpu_sym_session_ops *ops; }
> > > And merge rte_crypto_cpu_sym_init() and rte_crypto_cpu_sym_get_ops() into
> > > one (init).
> >
> > Again that will be a separate API call from the user perspective which is 
> > not good.
> >
> > >
> > > >
> > > >
> > > >         4) No mandatory mempools for private sessions. User can allocate
> > > memory for cpu-crypto
> > > >             session whenever he likes.
> > > >
> > > > [Akhil] you mean session private data?
> > >
> > > Yes.
> > >
> > > > You would need that memory anyways, user will be
> > > > allocating that already.  You do not need to manage that.
> > >
> > > What I am saying - right now user has no choice but to allocate it via 
> > > mempool.
> > > Which is probably not the best options for all cases.
> > >
> > > >
> > > >         Disadvantages:
> > > >         5) Extra changes in control path
> > > >         6) User has to store session_ops pointer explicitly.
> > > >
> > > > [Akhil] More disadvantages:
> > > > - All supporting PMDs will need to maintain TWO types of session for the
> > > > same crypto processing. Suppose a fix or a new feature(or algo) is 
> > > > added, PMD
> > > owner
> > > > will need to add code in both the session create APIs. Hence more
> > > maintenance and
> > > > error prone.
> > >
> > > I think majority of code for both paths will be common, plus even we'll 
> > > reuse
> > > current sym_session_init() -
> > > changes in PMD session_init() code will be unavoidable.
> > > But yes, it will be new entry in devops, that PMD will have to support.
> > > Ok to add it as 7) to the list.
> > >
> > > > - Stacks which will be using these new APIs also need to maintain two
> > > > code path for the same processing while doing session initialization
> > > > for sync and async
> > >
> > > That's the same as #5 above, I think.
> > >
> > > >
> > > >
> > > >      b) Your suggestion - reuse existing 
> > > > rte_cryptodev_sym_session_init() and
> > > existing rte_cryptodev_sym_session
> > > >       structure.
> > > >         Advantages:
> > > >         1) allows to reuse same struct and init/create/clear() 
> > > > functions.
> > > >             Probably less changes in control path.
> > > >         Disadvantages:
> > > >         2) rte_cryptodev_sym_session. sess_data[] is indexed by 
> > > > driver_id,
> > > which means that
> > > >             we can't use the same rte_cryptodev_sym_session to hold 
> > > > private
> > > sessions pointers
> > > >             for both sync and async mode  for the same device.
> > > >                    So the only option we have - make PMD devops-
> > > >sym_session_configure()
> > > >             always create a session that can work in both cpu and lksd 
> > > > modes.
> > > >             For some implementations that would probably mean that 
> > > > under the
> > > hood  PMD would create
> > > >             2 different session structs (sync/async) and then use one 
> > > > or another
> > > depending on from what API been called.
> > > >             Seems doable, but ...:
> > > >                    - will contradict with statement from 1:
> > > >               " New fields will be optional and would be used by PMD 
> > > > only when
> > > cpu-crypto session is requested."
> > > >                       Now it becomes mandatory for all apps to specify 
> > > > cpu-crypto
> > > related parameters too,
> > > >                even if they don't plan to use that mode - i.e. behavior 
> > > > change,
> > > existing app change.
> > > >                      - might cause extra space overhead.
> > > >
> > > > [Akhil] It will not contradict with #1, you will only have few checks 
> > > > in the
> > > session init PMD
> > > > Which support this mode, find appropriate values and set the appropriate
> > > process() in it.
> > > > User should be able to call, legacy enq-deq as well as the new process()
> > > without any issue.
> > > > User would be at runtime will be able to change the datapath.
> > > > So this is not a disadvantage, it would be additional flexibility for 
> > > > the user.
> > >
> > > Ok, but that's what I am saying - if PMD would *always* have to create a
> > > session that can handle
> > > both modes (sync/async), then user would *always* have to provide 
> > > parameters
> > > for both modes too.
> > > Otherwise if let say user didn't setup sync specific parameters at all, 
> > > what PMD
> > > should do?
> > >   - return with error?
> > >   - init session that can be used with async path only?
> > > My current assumption is #1.
> > > If #2, then how user will be able to distinguish is that session valid 
> > > for both
> > > modes, or only for one?
> >
> > I would say a 3rd option, do nothing if sync params are not set.
> > Probably have a debug print in the PMD(which support sync mode) to specify 
> > that
> > session is not configured properly for sync mode.
> 
> So, just print warning and proceed with init session that can be used with 
> async path only?
> Then it sounds the same as #2 above.
> Which actually means that sync mode parameters for sym_session_init() becomes 
> optional.
> Then we need an API to provide to the user information what modes
> (sync+async/async only) is supported by that session for given dev_id.
> And user would have to query/retain this information at control-path,
> and store it somewhere in user-space together with session pointer and dev_ids
> to use later at data-path (same as we do now for session type).
> That definitely requires changes in control-path to start using it.
> Plus the fact that this value can differ for different dev_ids for the same 
> session -
> doesn't make things easier here.
> 
> > Internally the PMD will not store the process() API in the session priv data
> > And while calling the first packet, devops->process will give an assert 
> > that session
> > Is not configured for sync mode. The session validation would be done in 
> > any case
> > your suggestion or mine. So no extra overhead at runtime.
> 
> I believe that after session_init() user should get either an error or
> valid  session handler that he can use at runtime.
> Pushing session validation to runtime doesn't seem like a good idea.
> 
> >
> > >
> > >
> > > >
> > > >
> > > >         3) not possible to store device (not driver) specific data 
> > > > within the
> > > session, but I think it is not really needed right now.
> > > >             So probably minor compared to 2.b.2.
> > > >
> > > > [Akhil] So lets omit this for current discussion. And I hope we can 
> > > > find some
> > > way to deal with it.
> > >
> > > I don't think there is an easy way to fix that with existing API.
> > >
> > > >
> > > >
> > > > Actually #3 follows from #2, but decided to have them separated.
> > > >
> > > > 3. process() parameters/behavior
> > > >     a) Our suggestion: user stores ptr to session ops (or to (*process) 
> > > > itself) and
> > > just does:
> > > >         session_ops->process(sess, ...);
> > > >         Advantages:
> > > >         1) fastest possible execution path
> > > >         2) no need to carry on dev_id for data-path
> > > >
> > > > [Akhil] I don't see any overhead of carrying dev id, at least it would 
> > > > be inline
> > > with the
> > > > current DPDK methodology.
> > >
> > > If we'll add process() into rte_cryptodev itself (same as we have
> > > enqueue_burst/dequeue_burst),
> > > then it will be an ABI breakage.
> > > Also there are discussions to get rid of that approach completely:
> > > http://mails.dpdk.org/archives/dev/2019-September/144674.html
> > > So I am not sure this is a recommended way these days.
> >
> > We can either have it in rte_cryptodev or in rte_cryptodev_ops whichever
> > is good for you.
> >
> > Whether it is ABI breakage or not, as per your requirements, this is the 
> > correct
> > approach. Do you agree with this or not?
> 
> I think it is possible approach, but not the best one:
> it looks quite flakey to me (see all these uncertainty with sym_session_init 
> above),
> plus introduces extra overhead at data-path.
> 
> >
> > Now handling the API/ABI breakage is a separate story. In 19.11 release we
> > Are not much concerned about the ABI breakages, this was discussed in
> > community. So adding a new dev_ops wouldn't have been an issue.
> > Now since we are so close to RC1 deadline, we should come up with some
> > other solution for next release. May be having a pmd API in 20.02 and
> > converting it into formal one in 20.11
> >
> >
> > >
> > > > What you are suggesting is a new way to get the things done without much
> > > benefit.
> > >
> > > Would help with ABI stability plus better performance, isn't it enough?
> > >
> > > > Also I don't see any performance difference as crypto workload is 
> > > > heavier than
> > > > Code cycles, so that wont matter.
> > >
> > > It depends.
> > > Suppose function call costs you ~30 cycles.
> > > If you have burst of big packets (let say crypto for each will take ~2K 
> > > cycles) that
> > > belong
> > > to the same session, then yes you wouldn't notice these extra 30 cycles 
> > > at all.
> > > If you have burst of small packets (let say crypto for each will take 
> > > ~300 cycles)
> > > each
> > > belongs to different session, then it will cost you ~10% extra.
> >
> > Let us do some profiling on openssl with both the approaches and find out 
> > the
> > difference.
> >
> > >
> > > > So IMO, there is no advantage in your suggestion as well.
> > > >
> > > >
> > > >         Disadvantages:
> > > >         3) user has to carry on session_ops pointer explicitly
> > > >     b) Your suggestion: add  (*cpu_process) inside rte_cryptodev_ops 
> > > > and then:
> > > >         rte_crypto_cpu_sym_process(uint8_t dev_id, 
> > > > rte_cryptodev_sym_session
> > > *sess, /*data parameters*/) {...
> > > >                      rte_cryptodevs[dev_id].dev_ops->cpu_process(ses, 
> > > > ...);
> > > >                       /*and then inside PMD specifc process: */
> > > >                      pmd_private_session = 
> > > > sess->sess_data[this_pmd_driver_id].data;
> > > >                      /* and then most likely either */
> > > >                      pmd_private_session->process(pmd_private_session, 
> > > > ...);
> > > >                      /* or jump based on session/input data */
> > > >         Advantages:
> > > >         1) don't see any...
> > > >         Disadvantages:
> > > >         2) User has to carry on dev_id inside data-path
> > > >         3) Extra level of indirection (plus data dependency) - both for 
> > > > data and
> > > instructions.
> > > >             Possible slowdown compared to a) (not measured).
> > > >
> > > > Having said all this, if the disagreements cannot be resolved, you can 
> > > > go for a
> > > pmd API specific
> > > > to your PMDs,
> > >
> > > I don't think it is good idea.
> > > PMD specific API is sort of deprecated path, also there is no clean way 
> > > to use it
> > > within the libraries.
> >
> > I know that this is a deprecated path, we can use it until we are not 
> > allowed
> > to break ABI/API
> >
> > >
> > > > because as per my understanding the solution doesn't look scalable to 
> > > > other
> > > PMDs.
> > > > Your approach is aligned only to Intel , will not benefit others like 
> > > > openssl
> > > which is used by all
> > > > vendors.
> > >
> > > I feel quite opposite, from my perspective majority of SW backed PMDs will
> > > benefit from it.
> > > And I don't see anything Intel specific in my proposals above.
> > > About openssl PMD: I am not an expert here, but looking at the code, I 
> > > think it
> > > will fit really well.
> > > Look yourself at its internal functions:
> > > process_openssl_auth_op/process_openssl_crypto_op,
> > > I think they doing exactly the same - they use sync API underneath, and 
> > > they are
> > > session based
> > > (AFAIK you don't need any device/queue data, everything that needed for
> > > crypto/auth is stored inside session).


Looked at drivers/crypto/armv8 - same story here, I believe.

> > >
> > By vendor specific, I mean,
> > - no PMD would like to have 2 different variants of session Init APIs for 
> > doing the same stuff.
> > - stacks will become vendor specific while using 2 separate session create 
> > APIs. No stack would
> > Like to support 2 variants of session create- one for HW PMDs and one for 
> > SW PMDs.
> 
> I think what you refer on has nothing to do with 'vendor specific'.
> I would name it 'extra overhead for PMD and stack writers'.
> Yes, for sure there is extra overhead (as always with new API) -
> for both producer (PMD writer) and consumer (stack writer):
> New function(s) to support,  probably more tests to create/run, etc.
> Though this API is optional - if PMD/stack maintainer doesn't see
> value in it, they are free not to support it.
> From other side, re-using  rte_cryptodev_sym_session_init()
> wouldn't help anyway - both data-path and control-path would differ
> from async mode anyway.
> BTW, right now to support different HW flavors
> we do have 4 different control and data-paths for both
> ipsec-secgw and librte_ipsec:
> lkds-none/lksd-proto/inline-crypto/inline-proto.
> And that is considered to be ok.
> Honestly, I don't understand why SW backed implementations
> can't have their own path that would suite them most.
> Konstantin
> 
> 
> 
> 
>

Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API

Reply via email to