Hi Neale,

Thank you for your comments. I know you would have thought about it already. I can work with you to implement the right solution to improve performance. Please see my responses inline.

Thanks
Govind

From: Neale Ranns <ne...@graphiant.com>
Sent: Wednesday, March 3, 2021 8:45 AM
To: Govindarajan Mohandoss <govindarajan.mohand...@arm.com>; vpp-dev <vpp-dev@lists.fd.io>
Cc: nd <n...@arm.com>
Subject: Re: [vpp-dev] IPSec proposal to improve "ipsec4-output-feature" node performance

Hi Govind,

Flow caches always perform well, but they are more difficult to use than they first appear. Consider asking yourself these questions:

1 - how many entries can the cache contain?
>> This can be made configurable as per the system need. By default, we can allocate a hash table sized to hold 10K entries.

2 - what do you do when the cache is full? How do you age or recycle old flows?
>> If the flow cache is implemented using a hash table without collision handling, then an age-out mechanism is not needed. Whenever a collision occurs, the old entry can be overwritten with the new entry. The worst case is 255 overwrites, if all 256 packets in a batch produce the same hash value.

3 - how do you flush the cache when the policy set changes?
>> Whenever an SPD rule is deleted, the flow cache will be flushed completely in the control plane. An IPSec module level flag will be introduced and set by the control plane to put the data plane in fallback mode, in which it uses linear search. This flag will be reset once the control plane has flushed the flow cache and deleted the SPD rule from the SPD table. Also, the data plane will not add new entries to the flow cache during SPD rule deletion. I have added this logic in my prototype. Please find the changes attached. In general, at what rate will the application delete SPD rules? If the deletion rate is low, then we can take the hit of flushing the flow cache in the control plane.

I had considered in the past changing an SPD definition to use IP subnets (rather than IP ranges) and then re-use the tuple-sort/merge algorithm used by ACLs.
This approach would not need you to answer the awkward questions about a cache and it would break the linear dependent lookup (it has other dependencies, but they are much better). Two reasons I didn't do this: 1) no time; 2) ipsec is a vnet component and ACL is a plugin, and a vnet -> plugin dependency is a no-no. If you're lucky someone might volunteer to make IPsec a plugin and this will go away...

>> Please correct my understanding. In this method, a mask has to be created for every SPD rule and stored in an array. On every packet arrival, the masks will be picked up in linear fashion and a hash will be computed from each mask and the packet header fields. The bihash will then be looked up with that hash value. This reduces the overhead of comparing the ranges during linear search, but the mask lookup is still linear. I agree that there will be a performance improvement because the range comparison is avoided for every SPD entry. Is there a way to implement it without creating an IPSec plugin and without depending on the ACL plugin?

/neale

From: vpp-dev@lists.fd.io <vpp-dev@lists.fd.io> on behalf of Govindarajan Mohandoss via lists.fd.io <Govindarajan.mohandoss=arm....@lists.fd.io>
Date: Wednesday, 3 March 2021 at 06:57
To: vpp-dev <vpp-dev@lists.fd.io>
Cc: nd <n...@arm.com>
Subject: [vpp-dev] IPSec proposal to improve "ipsec4-output-feature" node performance

Hi Neale,

I am working on optimizing the "ipsec4-output-feature" node on ARM based systems. Towards that, I saw an opportunity to supplement the SPD table lookup (linear search) with a Bihash based flow cache. This approach is similar to the ACL plugin's stateful mode implementation. It will consume extra memory for the Bihash and provide O(1) performance for SPD rules added at different indices. I did a very basic prototype and got good results.
Please find the prototype patch attached. Before I start the actual implementation, I would like to get your feedback. It will be great if you can give your comments. The idea at a high level is as follows. The existing linear search based SPD table lookup will be augmented with a flow cache.

Enhanced SPD Table lookup logic:
---------------------------------------------
On every packet arrival, the following lookup will be done in the "ipsec4-output-feature" node:

1. found = Lookup <5 tuple: Bihash based flow cache>
2. if (!found) {
       found = Lookup <5 tuple: Linear search>
       if (found) {
           Add an entry into <5 tuple: Bihash based flow cache>
       }
   }

Linear search will happen only for the 1st packet in a flow; from the 2nd packet onwards, the match will succeed in the bihash table.

I did a basic prototype and got O(1) performance as expected when an IPv4 5 tuple rule is added at different indices <1, 10, 100, 1000> in the SPD table. Following are the per core performance numbers with IPSec NULL encryption configuration in ESP Tunnel mode, on an ARM CPU based system @MRR with 64B packets:

Baseline based on existing linear search
================================
SPD index       Performance
-----------------------------------
1st match       5.2 MPPS
10th match      4.51 MPPS
100th match     2.05 MPPS
1000th match    266 KPPS

With Bihash based flow cache (Basic prototype results):
==============================================
SPD index       Performance
-----------------------------------
1st match       4.88 MPPS
10th match      4.88 MPPS
100th match     4.88 MPPS
1000th match    4.88 MPPS

As you can see, we are getting constant performance numbers even when rules are added at different indices. If you are fine with this approach, I would like to proceed with the actual implementation.

I am making an assumption that the SPD table will not be populated frequently by the application. Please correct me if I am wrong. Whenever the application adds, deletes, or modifies an entry in the SPD table, the flow cache will be purged in the data plane through an interface level flag.
I will work on this case and send another update.

Thanks
Govind
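
The lookup order proposed above (flow cache first, linear search on a miss, then populate the cache), together with the overwrite-on-collision and flush-on-delete answers from the reply, can be sketched in C roughly as follows. This is a minimal single-threaded illustration and not the attached prototype or VPP code: the types `flow_key_t`, `spd_rule_t`, and `spd_t`, the functions `spd_lookup` and `spd_rule_delete`, the hash function, and the 256-slot direct-mapped table are all hypothetical stand-ins (the proposal itself uses a Bihash). A real data plane would also need memory barriers or atomics around the fallback flag.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CACHE_SLOTS 256 /* hypothetical size; must be a power of two */

/* Hypothetical 5-tuple key (not a VPP structure). */
typedef struct {
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint8_t proto;
} flow_key_t;

/* Hypothetical SPD rule expressed as inclusive address/port ranges. */
typedef struct {
    uint32_t src_lo, src_hi, dst_lo, dst_hi;
    uint16_t sport_lo, sport_hi, dport_lo, dport_hi;
    uint8_t proto;
} spd_rule_t;

typedef struct {
    flow_key_t key;
    int rule_index;
    bool valid;
} cache_slot_t;

typedef struct {
    spd_rule_t *rules;
    int n_rules;
    cache_slot_t cache[CACHE_SLOTS];
    bool fallback;            /* set while a rule deletion is in progress */
    unsigned linear_searches; /* counter kept only for illustration */
} spd_t;

/* Simple multiplicative mix over the key fields (assumption, not Bihash). */
static uint32_t flow_hash(const flow_key_t *k)
{
    uint32_t w[4] = { k->src_ip, k->dst_ip,
                      ((uint32_t)k->src_port << 16) | k->dst_port, k->proto };
    uint32_t h = 2166136261u;
    for (int i = 0; i < 4; i++) {
        h ^= w[i];
        h *= 16777619u;
    }
    return h;
}

static bool key_equal(const flow_key_t *a, const flow_key_t *b)
{
    return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->proto == b->proto;
}

/* Existing behaviour: first matching rule wins; cost grows with its index. */
static int spd_linear_search(spd_t *spd, const flow_key_t *k)
{
    spd->linear_searches++;
    for (int i = 0; i < spd->n_rules; i++) {
        const spd_rule_t *r = &spd->rules[i];
        if (k->proto == r->proto &&
            k->src_ip >= r->src_lo && k->src_ip <= r->src_hi &&
            k->dst_ip >= r->dst_lo && k->dst_ip <= r->dst_hi &&
            k->src_port >= r->sport_lo && k->src_port <= r->sport_hi &&
            k->dst_port >= r->dport_lo && k->dst_port <= r->dport_hi)
            return i;
    }
    return -1;
}

/* Cache first; on a miss, search linearly and overwrite the slot. */
static int spd_lookup(spd_t *spd, const flow_key_t *k)
{
    if (spd->fallback) /* deletion in progress: bypass the cache entirely */
        return spd_linear_search(spd, k);

    cache_slot_t *s = &spd->cache[flow_hash(k) & (CACHE_SLOTS - 1)];
    if (s->valid && key_equal(&s->key, k))
        return s->rule_index;

    int idx = spd_linear_search(spd, k);
    if (idx >= 0) { /* a colliding old entry is simply replaced */
        s->key = *k;
        s->rule_index = idx;
        s->valid = true;
    }
    return idx;
}

/* Control-plane side: enter fallback, flush, delete the rule, leave fallback. */
static void spd_rule_delete(spd_t *spd, int index)
{
    if (index < 0 || index >= spd->n_rules)
        return;
    spd->fallback = true;
    memset(spd->cache, 0, sizeof spd->cache);
    memmove(&spd->rules[index], &spd->rules[index + 1],
            (size_t)(spd->n_rules - index - 1) * sizeof spd->rules[0]);
    spd->n_rules--;
    spd->fallback = false;
}
```

Because the cache stores rule indices, deleting a rule shifts the indices of later rules, which is one reason the sketch flushes the whole cache instead of invalidating single entries. Overwriting on collision keeps the table bounded without an age-out timer, at the cost of repeated linear searches when two active flows share a slot.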
spd_with_flow_cache_prototype_v1.diff