Hi Valery,

thanks for your feedback! Some comments inline.

On Tue, Oct 11, 2022 at 05:37:29PM +0300, Valery Smyslov wrote:
> Hi all,
> 
> as I promised at the last IETF meeting, this is my review of the 
> draft-pwouters-ipsecme-multi-sa-performance draft.
> This is not a formal review of the document, but rather some speculations on 
> how the solution may be simplified.
> Sorry that it took so long and please consider this as an invitation for 
> discussion.
> 
> I think that the performance problem is real and the document is a good 
> starting point for solving it.
> That said, I think that the approach in the draft is a little bit too 
> complicated. Since there are implementations
> of the draft out there, it is possible that I missed some details concerning 
> kernel-internal limitations, 
> but in my opinion the solution can be simplified.
> 
> First, I think that the approach of having multiple IPsec SAs with identical 
> selectors, each associated with 
> its own CPU, is the right one for solving the performance problem. 
> 
> My main problem with the draft is the concept of "Fallback SA". This SA is 
> treated specially in the draft,
> which I don't think is necessary. For example, it must always be up so that 
> the outgoing packet can
> always be sent in case a per-CPU SA does not exist. Why can't other existing 
> per-CPU SAs be used
> for this purpose?

I argued about that in my other mail.

> Another thing that I think is unnecessary is the CPU_QUEUES notify.
> Apart from indicating that the SA is a Fallback SA (which I think is not 
> needed), I cannot
> see any usefulness of this notify. In particular, it contains the minimum 
> number of parallel
> SAs the peer wants to have, so for the receiving side this information makes no 
> difference.
> For example, I want 3, you want 5, we end up with 5 (the bigger of the two), but in 
> fact I can always send
> TS_MAX_QUEUE after creating 3. It makes no difference what values are indicated; 
> they can be arbitrary.

Sure, you can do that. The intention behind this was to decide whether both
peers can benefit from doing a pcpu SA setup. Consider a setup where
one peer has 1000 CPUs and the other peer has just 2. If the one with
1000 CPUs tries to install an SA for each CPU, the other end's SAD lookup
becomes very inefficient. On the other hand, each peer needs to be able
to install at least as many SAs as it has CPUs. Otherwise some CPUs
always have to use the fallback SA or need to re-steer flows to
other CPUs. That is inefficient too, so there should be a way to
detect whether the difference is too big to use pcpu SAs efficiently on
both sides. So if the other end asks for a too big CPU_QUEUES
number, you can just say no and use one SA for all the traffic.
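As a sketch, that sizing check could look like this (the function name and the ratio knob are mine, not from the draft):

```python
import os

# Hypothetical policy knob: refuse the per-CPU setup if the peer asks
# for far more parallel SAs than we have CPUs, since each extra SA only
# bloats our SAD lookup without improving our own throughput.
MAX_QUEUE_RATIO = 4

def accept_pcpu_setup(peer_cpu_queues, local_cpus=None):
    """Decide whether a per-CPU SA setup benefits both peers.

    peer_cpu_queues: the minimum number of parallel SAs the peer
    requested (the CPU_QUEUES value).
    """
    if local_cpus is None:
        local_cpus = os.cpu_count() or 1
    # If the mismatch is too large (e.g. 1000 CPUs vs 2), fall back
    # to a single SA for all the traffic instead.
    return peer_cpu_queues <= local_cpus * MAX_QUEUE_RATIO

# A 2-CPU responder rejecting a 1000-CPU initiator's request:
print(accept_pcpu_setup(1000, local_cpus=2))  # False
print(accept_pcpu_setup(3, local_cpus=2))     # True
```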

> I'm also not convinced that CPU_QUEUE_INFO is really needed, it mostly exists 
> for debugging purposes (again if we get rid of Fallback SA). And I don't 
> think we need
> a new error notify TS_MAX_QUEUE, I believe TS_UNACCEPTABLE can be used 
> instead.

Ok.

> So, in my understanding, the architecture for multi-SA protocol could be as 
> follows.
> An IPsec endpoint supporting this feature has several CPUs (or several cores, 
> but let's call them CPUs). 
> There is some mechanism that dispatches outgoing packets to different CPUs in 
> some fashion 
> (randomly or with round-robin algorithm or using some affinity). There also 
> is some mechanism that 
> deterministically dispatches incoming ESP packets to CPUs based on some 
> information 
> from the packets, most probably from the content of SPI.
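One way to make that incoming steering deterministic, as a sketch (the modulo scheme is my assumption, not something mandated anywhere):

```python
import secrets

NUM_CPUS = 8

def cpu_for_spi(spi):
    """Deterministically map an incoming ESP SPI to a CPU."""
    return spi % NUM_CPUS

def pick_spi_for_cpu(target_cpu):
    """Choose a fresh inbound SPI that the steering function above
    will map to the desired CPU (SPI values below 256 are reserved)."""
    while True:
        spi = secrets.randbits(32)
        if spi >= 256 and cpu_for_spi(spi) == target_cpu:
            return spi

spi = pick_spi_for_cpu(3)
print(cpu_for_spi(spi))  # 3
```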
> 
> With these kernel features in mind the following IPsec architecture could be 
> implied.
> The SPD is global for all CPUs, while SAD is split into several copies, so 
> that each CPU has 
> its own SAD. We also need to introduce a special entry in the SAD - "stub 
> SA", that 
> consists only of a selector and has no associated SA.
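A toy model of the per-CPU SADs with stub entries might look like this (all names are mine, for illustration only):

```python
from dataclasses import dataclass

@dataclass
class RealSa:
    selector: str   # e.g. "10.0.0.0/24 <-> 10.0.1.0/24"
    spi: int
    keys: bytes

@dataclass
class StubSa:
    selector: str
    real_sa_cpu: int   # the CPU that has the real SA installed
    no_more_requests: bool = False  # set after TS_UNACCEPTABLE

# One SAD per CPU; the SPD stays global.
sads = [dict() for _ in range(4)]   # 4 CPUs in this toy example

def install_sa(cpu, sa):
    """Install a real SA on one CPU and stub entries everywhere else."""
    sads[cpu][sa.selector] = sa
    for other, sad in enumerate(sads):
        if other != cpu and sa.selector not in sad:
            sad[sa.selector] = StubSa(sa.selector, real_sa_cpu=cpu)

install_sa(1, RealSa("10.0.0.0/24 <-> 10.0.1.0/24", spi=0x1001, keys=b"k"))
print(type(sads[0]["10.0.0.0/24 <-> 10.0.1.0/24"]).__name__)  # StubSa
```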
> 
> When there is an outgoing packet on the initiator, then it is handled by one 
> of CPUs.
> This CPU checks its own SAD and finds no SA that matches the packet selector, so 
> it checks the SPD and finds a rule saying that packets with this selector
> must be protected with ESP. Then this CPU requests IKE to create the ESP 
> SA.
> IKE performs the needed actions and as a result it creates a pair of ESP SAs.
> These are all the usual actions, with no deviation from any ordinary IPsec 
> implementation.
> 
> The difference is that then IKE installs this pair of SAs only to the SAD for 
> that very 
> CPU that requested its creation. Note that the SPI for the incoming ESP SA
> should be selected in such a way that the mechanism steering incoming packets
> to an appropriate CPU correctly steers this SPI to the CPU that this SA 
> is installed for.
> All other SADs (for the rest of the CPUs) are populated with a stub SA entry, 
> having the 
> same selector and a pointer to the CPU that has the real SA installed.

That would require a write to all remote pcpu SADs, so the percpu SADs
can't be lockless.

> The same actions are performed by the responder. At this point we have
> a pair of SAs associated with one CPU both on the initiator and on the 
> responder.
> 
> If one of the next outgoing packets is handled by a different CPU, then 
> this CPU will first check whether an appropriate SA exists in its SAD 
> and will find the stub SA entry with a matching selector. Since
> there is no SA, this CPU requests IKE to create one for it. Meanwhile,
> the packet that triggered this action can be 1) dropped, 2) retained waiting for 
> the SA to be ready,
> or 3) re-steered to the CPU that already has an appropriate SA (as 
> indicated in the stub entry).

On a preemptive system, the scheduler might migrate applications
from one cpu to another from time to time. So 1 and 2 are IMO not 
appropriate, as the application would be stuck until an SA is created.
3 has its own problems, as discussed in the other mail.

> When a new pair of SAs is ready, it is installed on the CPU that triggered 
> its creation
> replacing the stub entry. Note that the responder should detect that it 
> already has SAs 
> with the same traffic selector for at least one of its CPUs and should try to 
> install the new SAs 
> on a different CPU. Note that with stub entries there is no difference
> which side initiates the creation of additional SAs - it can be the original 
> responder too.
> 
> This way the new SAs are created dynamically and treated equally - they all 
> live
> their own life - are re-keyed or even deleted if they are idle for a long 
> time.
> When an SA is being deleted, all other CPU SADs should be checked for whether
> any SA with the same traffic selector exists. If one does, then the deleted SA 
> is replaced
> with a stub entry indicating a CPU that has a real SA. If not (only 
> stubs exist),
> then all stubs for this selector are deleted from all SADs.
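The deletion bookkeeping described above could be sketched like this (a toy model with hypothetical names; each SAD maps a selector to either a real or a stub entry):

```python
# Toy model: each per-CPU SAD maps a traffic selector to either
# ("real", spi) or ("stub", cpu_with_real_sa).

def delete_sa(sads, cpu, selector):
    """Delete a real SA and fix up the stub entries accordingly."""
    del sads[cpu][selector]
    # Is a real SA with the same selector left on any other CPU?
    remaining = [c for c, sad in enumerate(sads)
                 if sad.get(selector, ("stub",))[0] == "real"]
    if remaining:
        # Replace the deleted SA with a stub pointing at a survivor.
        sads[cpu][selector] = ("stub", remaining[0])
    else:
        # Only stubs are left: drop them from every SAD.
        for sad in sads:
            sad.pop(selector, None)

sads = [{"ts1": ("real", 0x100)}, {"ts1": ("stub", 0)}]
delete_sa(sads, 0, "ts1")
print(sads)  # [{}, {}] - last real SA gone, so the stubs go too
```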
> 
> With this approach there is no need to indicate (or negotiate) the number of
> parallel SAs: generally with sufficient traffic both sides will try to 
> populate
> all its CPU SADs (there also may be some affinity in the steering algorithm).
> If the party responding in the CREATE_CHILD_SA exchange is clever enough 
> and spreads the incoming ESP SAs over its CPUs, then in the ideal case 
> the peers end up with the number of parallel SAs equal to the greater
> number of CPUs (e.g. if one peer has 6 CPUs and the other 8, they 
> will create 8 parallel SAs, where the party having 6 CPUs will have 
> duplicate SAs on 2 CPUs). It is also possible that at some point one of the 
> peers
> rejects creating additional SAs. It can indicate this by returning a 
> TS_UNACCEPTABLE notification (or NO_ADDITIONAL_SAS, but see below).
> If the peer returns TS_UNACCEPTABLE for a request, then the corresponding
> CPU SAD on the initiator will not be populated with an SA. Note that
> this SAD contains a stub entry (since it triggered creating the additional SA).
> So, in this case this stub entry is retained, but a flag is set in it so that no 
> further requests
> to create an ESP SA are made for this stub. Instead, any future packet
> matching this stub entry that is handled by this CPU should be re-steered
> to a different CPU that has a real SA (this CPU is indicated in the stub). 
> 
> With this approach no new notifications are needed, thus it provides full
> compatibility with non-supporting implementations (with the difference
> that non-supporting implementations will most probably send NO_ADDITIONAL_SAS
> when they are unable to install more SAs). Of course, since non-supporting
> implementations will not spread multiple SAs evenly over their CPUs,
> the resulting configuration would be sub-optimal, but it will still provide
> some performance improvement.

Whatever we do, I think both ends should at least know that we do this.
Otherwise you configure your system based on guesses about what the
other end will do.

Steffen

_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec
