> > > > > > > The SA lookup logic and management is purely requirement based
> > > > > > > for the application.
> > > > > > > The application may only cater to <128 SAs which can
> > > > > > > be handled based on the current logic.
> > > > > >
> > > > > > Not always, the current implementation can handle < 128 SAs
> > > > > > whose SPI%128 never match (let's say it can't handle SPI=1 and
> > > > > > SPI=129).
> > > > > > Yes, what we have right now has nearly zero overhead,
> > > > > > and might be ok for some really simple show-cases.
> > > > > > But for the majority of production IPsec implementations,
> > > > > > I believe that definitely wouldn't be enough.
> > > > > >
> > > > > > > --single-sa option cannot handle this.
> > > > > > > Sample applications in DPDK are there to showcase the best a
> > > > > > > hardware can deliver.
> > > > > >
> > > > > > My thought was - that's the reason we have the single-sa option -
> > > > > > demonstrate best possible HW perf with minimal SW intervention.
> > > > > > For something more serious than that, we use the generic SAD
> > > > > > implementation.
> > > > > >
> > > > > > > IMO, we cannot allow this logic on NXP hardware. We
> > > > > > > give performance numbers based on IPSec app to customers and we
> > > > > > > cannot allow 15% degradation.
> > > > > >
> > > > > > As Vladimir said, we are looking at how to improve current SAD numbers
> > > > > > and minimize the drop.
> > > > > > But all else being equal, a plain array will always be faster than a
> > > > > > hash table,
> > > > > > so not sure we will be able to match existing performance.
> > > > > > So two questions:
> > > > > > 1. What exact case do you use for perf testing
> > > > > > (total number of SAs, packets per burst belong to the
> > > > > > same/different SAs)?
> > > > > > There might be a way to speed it up.
> > > > > > Again if 10-15% is not an affordable drop, which one is: zero
> > > > > > or ...?
> > > > >
> > > > > We should add features judiciously, we cannot drop the performance of
> > > > > a benchmarking application for the sake of adding functionality.
> > > > > We should only add features which do not impact the performance
> > > > > significantly.
> > > > > Every vendor may have different cases. We cannot tune for everybody.
> > > > > However, I see a drop with 64 outbound and 64 inbound SAs, all with
> > > > > different SPIs and IPs.
> > > > > Packets per burst = 32, all with different SAs.
> > > > >
> > > >
> > > > We can have two modes of lookup similar to l3fwd - EM and LPM.
> > > > LPM is O(1) while EM is more realistic. Similar logic can be added here
> > > > as well.
> > > > With l3fwd also we showcase performance for the best case (lpm) and the
> > > > worst case (em).
> > > > What say?
> > >
> > > We discussed it off-line with Vladimir and came up with a similar idea:
> > > Have a proper/generic SAD implementation and add a limited-size plain array
> > > on top of it as a 1-way associative cache.
> > > So for the case when all active SAs fit into the cache and there are no SPI
> > > collisions,
> > > we should have the same performance as now (with plain array).
> > > On the other side, we'll still have a generic/scalable/RFC-compliant
> > > implementation.
> > > Sort of the best sides of two worlds.
> > > Plans are to submit v4 with such an approach in the next few days.
> >
> > OK let's check the v4 before moving the discussion to techboard.
> > @Thomas: Do you have more thoughts on this? Should we get it added to the
> > agenda or wait for the v4?
>
> If v4 is good for both cases, it lowers the priority of the discussion.
> But still, it would be interesting to state the objectives of the examples:
> - show API usage?
> - show feature performance?
> - show best hardware performance?
> - what else?

Agree, that’s a good topic to discuss.
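
For reference while waiting for v4, the "plain array as a 1-way associative cache on top of a generic SAD" idea quoted above could look roughly like the sketch below. This is not the v4 patch and does not use any DPDK API: `struct sa`, `sad_add()`, `sad_lookup_slow()`, `sa_lookup()` and the `slow_lookups` counter are all hypothetical names, and the linear scan is only a stand-in for whatever hash-based SAD is eventually used. The cache size of 128 is taken from the SPI%128 indexing mentioned in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SA_CACHE_SZ 128   /* same SPI % 128 indexing as the current plain array */
#define SAD_MAX     1024  /* hypothetical capacity of the stand-in SAD */

struct sa {
	uint32_t spi;     /* keys, counters, etc. omitted in this sketch */
};

/* Stand-in for the generic SAD (assumption: a real one would be a hash table). */
static struct sa sad[SAD_MAX];
static size_t sad_cnt;

/* 1-way associative (direct-mapped) cache in front of the SAD. */
static struct sa *sa_cache[SA_CACHE_SZ];

/* Counts how often the slow path is taken; for illustration only. */
static unsigned long slow_lookups;

static struct sa *
sad_add(uint32_t spi)
{
	assert(sad_cnt < SAD_MAX);
	sad[sad_cnt].spi = spi;
	return &sad[sad_cnt++];
}

/* Slow path: full SAD lookup (linear scan here, purely as a placeholder). */
static struct sa *
sad_lookup_slow(uint32_t spi)
{
	size_t i;

	slow_lookups++;
	for (i = 0; i < sad_cnt; i++)
		if (sad[i].spi == spi)
			return &sad[i];
	return NULL;
}

/* Fast path first: one array index plus one SPI compare on a cache hit. */
static struct sa *
sa_lookup(uint32_t spi)
{
	struct sa **slot = &sa_cache[spi % SA_CACHE_SZ];
	struct sa *sa;

	if (*slot != NULL && (*slot)->spi == spi)
		return *slot;

	/* Miss or SPI collision: consult the SAD and replace the cached entry. */
	sa = sad_lookup_slow(spi);
	if (sa != NULL)
		*slot = sa;
	return sa;
}
```

With SPI=1 and SPI=129 both present, each is still found correctly; they just evict one another from slot 1, so alternating between them always takes the slow path. When all active SAs map to distinct slots, every lookup after the first is a cache hit, which is the case where the cost should stay close to the current plain array.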