> > > > > > > The SA lookup logic and management is purely requirement based
> > > > > > > for the application.
> > > > > > > The application may only cater to <128 SAs which can
> > > > > > > be handled based on the current logic.
> > > > > >
> > > > > > Not always: the current implementation can handle < 128 SAs only
> > > > > > if their SPI%128 values never match (for example, it can't handle
> > > > > > SPI=1 and SPI=129 together).
> > > > > > Yes, what we have right now has nearly zero overhead,
> > > > > > and might be ok for some really simple show-cases.
> > > > > > But for majority of production IPsec implementations,
> > > > > > I believe that definitely wouldn't be enough.
> > > > > >
> > > > > > > --single-sa option cannot handle this.
> > > > > > > Sample applications in DPDK are there to showcase the best a
> > > > > > > hardware can deliver.
> > > > > >
> > > > > > My thought was - that's the reason we have the single-sa option -
> > > > > > to demonstrate best possible HW perf with minimal SW intervention.
> > > > > > For something more serious than that, we use a generic SAD
> > > > > > implementation.
> > > > > >
> > > > > > > IMO, we cannot allow this logic on NXP hardware. We
> > > > > > > give performance numbers based on IPSec app to customers and we
> > > > > > > cannot allow 15% degradation.
> > > > > >
> > > > > > As Vladimir said, we are looking at how to improve the current SAD
> > > > > > numbers and minimize the drop.
> > > > > > But all else being equal, a plain array will always be faster than
> > > > > > a hash table,
> > > > > > so not sure we will be able to match existing performance.
> > > > > > So two questions:
> > > > > > 1. What exact case do you use for perf testing
> > > > > >     (total number of SAs, packets per burst belonging to the
> > > > > >     same/different SAs)?
> > > > > >     Maybe there is a way to speed it up.
> > > > > > 2. Again, if 10-15% is not an affordable drop, which one is: zero
> > > > > >     or ...?
> > > > >
> > > > > We should add features judiciously; we cannot drop the performance of
> > > > > a benchmarking application in exchange for adding functionality. We
> > > > > should only add features which do not impact the performance
> > > > > significantly.
> > > > > Every vendor may have different cases. We cannot tune for everybody.
> > > > > However, I see the drop with 64 outbound and 64 inbound SAs, all with
> > > > > different SPIs and IPs.
> > > > > Packets per burst = 32, all with different SAs.
> > > > >
> > > >
> > > > We can have two modes of lookup similar to l3fwd - EM and LPM.
> > > > LPM is O(1) while EM is more realistic. Similar logic can be added here 
> > > > as well.
> > > > With l3fwd we also showcase performance for the best case (lpm) and the
> > > > worst case (em).
> > > > What do you say?
> > >
> > > We discussed it off-line with Vladimir and came up with a similar idea:
> > > have a proper/generic SAD implementation and add a limited-size plain
> > > array on top of it as a 1-way associative cache.
> > > So for the case when all active SAs fit into the cache and there are no
> > > SPI collisions,
> > > we should have the same performance as now (with plain array).
> > > On the other hand, we'll still have a generic/scalable/RFC-compliant
> > > implementation.
> > > Sort of the best of both worlds.
> > > The plan is to submit v4 with this approach in the next few days.
> >
> > OK, let's check the v4 before moving the discussion to techboard.
> > @Thomas: Do you have more thoughts on this? Should we get it added to the
> > agenda or wait for the v4?
> 
> If v4 is good for both cases, it lowers the priority of the discussion.
> 
> But still, it would be interesting to state the objectives of the examples:
>       - show API usage?
>       - show feature performance?
>       - show best hardware performance?
>       - what else?

Agree, that’s a good topic to discuss.
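
For readers following the quoted thread, here is a minimal C sketch of the idea discussed above: a plain array used as a 1-way (direct-mapped) cache indexed by SPI % 128, in front of a generic lookup. It is only an illustration under assumptions. All names (`struct sad`, `sad_init`, `sad_add`, `sad_lookup`, `SAD_MAX`) are hypothetical and are not taken from the patch or from the DPDK API, and the linear scan merely stands in for a real hash-based SAD.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SA_CACHE_SZ 128u  /* size of the plain array discussed in the thread */
#define SAD_MAX     1024u /* hypothetical capacity of the stand-in backing store */

struct sa {
    uint32_t spi; /* keys, state, etc. omitted in this sketch */
};

struct sad {
    struct sa  store[SAD_MAX];     /* stand-in for a generic (hash-based) SAD */
    uint32_t   nb_sa;
    struct sa *cache[SA_CACHE_SZ]; /* 1-way cache: one slot per SPI % SA_CACHE_SZ */
    uint64_t   hits, misses;
};

void sad_init(struct sad *sad)
{
    assert(sad != NULL);
    memset(sad, 0, sizeof(*sad));
}

/* Install an SA in the backing store only; the cache fills on lookup. */
int sad_add(struct sad *sad, uint32_t spi)
{
    if (sad->nb_sa == SAD_MAX)
        return -1;
    sad->store[sad->nb_sa++].spi = spi;
    return 0;
}

/* Slow path: linear scan, used here only as a placeholder for a hash lookup. */
struct sa *sad_slow_lookup(struct sad *sad, uint32_t spi)
{
    for (uint32_t i = 0; i < sad->nb_sa; i++)
        if (sad->store[i].spi == spi)
            return &sad->store[i];
    return NULL;
}

/* Fast path first: one array access plus an SPI compare. */
struct sa *sad_lookup(struct sad *sad, uint32_t spi)
{
    struct sa **slot = &sad->cache[spi % SA_CACHE_SZ];

    if (*slot != NULL && (*slot)->spi == spi) {
        sad->hits++;
        return *slot;
    }

    /* Miss (empty slot or SPI collision): fall back and refill the slot. */
    sad->misses++;
    struct sa *sa = sad_slow_lookup(sad, spi);
    if (sa != NULL)
        *slot = sa;
    return sa;
}
```

In this sketch, SPI=1 and SPI=129 can both be installed and both resolve correctly; they just evict each other from slot 1, so traffic alternating between them pays the slow path, while a collision-free set of up to 128 SPIs stays on the array path after the first lookup of each.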
