Hi Paul,

> On Mon, 17 Oct 2022, Valery Smyslov wrote:
>
> > implementation with say 10 CPUs. Does it make any difference for this
> > implementation if it receives CPU_QUEUES with 100 or with 1000? It seems
> > to me that in both cases it will follow its own local policy for
> > limiting the number of per-CPU SAs, most probably capping it to 10.
>
> That would be a mistake. You always want to allow a few more than the
> CPUs you have. The maximum is mostly to protect against DoS attacks.
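As an aside, the capping policy being debated here could look roughly like the sketch below. It is not from the draft; all names and the headroom/cap values are hypothetical. It illustrates the asymmetry Paul describes: the outgoing side only needs SAs for its own CPUs (plus a little headroom), while the incoming side should accept what the peer advertises, bounded by a local anti-DoS cap.

```python
# Hypothetical policy sketch, not from the draft: how many per-CPU Child SAs
# to install given the local CPU count and the peer's advertised CPU_QUEUES.

def incoming_sa_limit(peer_cpu_queues: int, hard_cap: int = 64) -> int:
    """How many incoming per-CPU Child SAs to accept from this peer."""
    # Accept up to what the peer says it may create, but never more than
    # a locally configured cap that bounds resource usage (the anti-DoS
    # role of "the maximum" mentioned above).
    return min(peer_cpu_queues, hard_cap)

def outgoing_sa_count(local_cpus: int, peer_cpu_queues: int,
                      headroom: int = 2, hard_cap: int = 64) -> int:
    """How many outgoing per-CPU Child SAs are worth installing."""
    # No point installing more outgoing SAs than we have CPUs to feed them,
    # plus a little headroom ("a few more than the CPUs you have").
    want = local_cpus + headroom
    return min(want, peer_cpu_queues, hard_cap)

# Example from the thread: 10 local CPUs, peer advertises CPU_QUEUES = 50.
print(outgoing_sa_count(10, 50))   # 12: our CPUs plus headroom
print(incoming_sa_limit(50))       # 50: the peer may legitimately use all
```

With a peer advertising 1000 queues, the same policy would still install only 12 outgoing SAs and accept at most the local cap of incoming ones, which is the "capping" behavior the quoted text debates.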
How does it protect against DoS attacks? Can you elaborate?

> If you only have 10 CPUs, but the other end has 50, there shouldn't
> be much issue to install 50 SA's. Not sure if we said so in the draft,

I'm not so sure. For example, if you use an HSM, you are limited by its
capabilities.

> but you could even omit installing 40 of the outgoing SA's since you
> would never be expected to use them anyway. but you have to install all
> 50 incoming ones because the peer might use them.

And what should be done in situations where you are unable to install all
50 (for any reason)?

And how is the protocol expected to deal with situations where the number
of CPUs changes over the lifetime of an IKE SA? As far as I understand,
some modern systems allow adding or removing CPUs on the fly.

> > Or do you want to say that the logic would be: well my peer has 1000
> > CPUs, it's not good for me to have more than 10, but let's be friendly
> > and install 100, so that both of us suffer...
>
> At that point, it really also matters what the differences are between
> the CPUs. An embedded device with 4 cpus at 800mhz vs a mega supercomputer
> with 250 5ghz cpus. Already, counting CPUs is an approximation. Already,
> the application needs proper threading to use multiple CPUs without a
> single CPU bottleneck at the application layer. There might very well be
> situations where multi-sa doesn't really help you at all.

Yes.

> But the simple use case is clear. I have a database cluster of 10 nodes
> talking to each other with lots of CPUs and most are idle because I have
> one fully loaded CPU because my SA is bound to one CPU only.

The use case is clear, and the idea of having per-CPU SAs is clear too.
The problem (my problem) is the way it is achieved.

> > I don't think this logic is credible in real life, but even in this
> > case there is already a mechanism that allows limiting the number of
> > per-CPU SAs - it is the TS_MAX_QUEUE notify. So why do we need
> > CPU_QUEUES?
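To make the argument concrete, the "TS_MAX_QUEUE already limits the number of SAs" mechanism could be sketched as below. All names are hypothetical; the point is that a responder which re-reads its CPU count on every request also handles CPU hotplug naturally, since the limit tracks the currently online CPUs.

```python
# Hypothetical sketch: a responder that accepts per-CPU Child SA requests
# until a local limit is reached, then answers with TS_MAX_QUEUE. Nothing
# here is the draft's normative behavior; it only illustrates the argument.
import os

class PerCpuSaResponder:
    """Toy responder tracking per-CPU Child SAs under one IKE SA."""

    def __init__(self, cpu_count_fn=os.cpu_count):
        self.installed = 0
        # The CPU count is re-read on every request, so hot-adding or
        # removing CPUs changes the limit without any protocol signaling.
        self.cpu_count_fn = cpu_count_fn

    def local_limit(self) -> int:
        # Local policy: a few more SAs than currently online CPUs.
        return (self.cpu_count_fn() or 1) + 2

    def on_create_child_sa(self) -> str:
        if self.installed >= self.local_limit():
            return "TS_MAX_QUEUE"   # rejection notify; non-fatal in this reading
        self.installed += 1
        return "OK"

# Deterministic demo with a fake 4-CPU box: the limit is 6.
resp = PerCpuSaResponder(cpu_count_fn=lambda: 4)
replies = [resp.on_create_child_sa() for _ in range(9)]
print(replies.count("OK"), replies.count("TS_MAX_QUEUE"))  # 6 3
```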
> TS_MAX_QUEUE is conveying an irrecoverable error condition. It should
> never happen.

That's not what the draft says:

   The responder may at any time reject additional Child SAs by
   returning TS_MAX_QUEUE.

So, my reading is that this notify can be sent at any time if the peer is
not willing to create more per-CPU SAs. And sending this notify doesn't
cause deletion of the IKE SA and all its Child SAs (that is my guess).

If my reading is wrong and this is a fatal error (or what do you mean by
"irrecoverable"?), then the protocol is worse than I thought for devices
that for any reason cannot afford installing an unlimited number of SAs
(e.g. if they use an HSM with limited memory). In this case they cannot
even tell the peer that they have limited resources.

> Whereas CPU_QUEUES tells you how many per-CPU child SAs you can do.
> This is meant to reduce the number of in-flight CREATE_CHILD_SA's that
> will never become successful.

It seems to me that it's enough to have one CREATE_CHILD_SA exchange with
the proper error notify to indicate that the peer is unwilling to create
more SAs. I'm not sure this is a big saving.

> > > I'm also not convinced that CPU_QUEUE_INFO is really needed, it
> > > mostly exists for debugging purposes (again if we get rid of the
> > > Fallback SA). And I don't think we need a new error notify
> > > TS_MAX_QUEUE, I believe TS_UNACCEPTABLE can be used instead.
>
> We did it to distinguish between "too many of the same child sa" versus
> other errors in cases of multiple subnets / child SAs under the same IKE
> peer. Rethinking it, I am no longer able to reproduce why we think it
> was required :)

I believe TS_UNACCEPTABLE is well suited for this purpose. You know for
sure that the TS itself is OK, since you have already installed SA(s) with
the same TS; it's not a fatal error notify, it's standardized in RFC 7296,
and it does not prevent creating SAs with other TS.
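The disambiguation argument above can be illustrated with a small sketch (hypothetical names, not from the draft or RFC 7296): an initiator receiving TS_UNACCEPTABLE can already tell "peer won't add another parallel SA" apart from "TS genuinely rejected" using only its local state.

```python
# Hypothetical sketch of interpreting TS_UNACCEPTABLE on the initiator side.
# If an SA with identical traffic selectors is already installed, the TS
# itself must be acceptable, so the rejection can only mean the peer is
# unwilling to create more parallel (per-CPU) SAs.

def interpret_ts_unacceptable(requested_ts, installed_ts_list):
    """Classify a TS_UNACCEPTABLE reply using local state only."""
    if requested_ts in installed_ts_list:
        # Same TS already works: stop requesting additional parallel SAs,
        # keep using the ones we have.
        return "per-cpu-limit-reached"
    # This TS was never accepted: a genuine traffic selector problem.
    return "ts-rejected"

print(interpret_ts_unacceptable(("10.0.0.0/24", "any"),
                                [("10.0.0.0/24", "any")]))
print(interpret_ts_unacceptable(("192.168.1.0/24", "any"), []))
```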
> > > On a preemptive system, the scheduler might migrate applications
> > > from one cpu to another from time to time. So 1 and 2 are IMO not
> > > appropriate as the application would be stuck until an SA is created.
> > > 3 has its own problems as discussed in the other mail.
> >
> > OK, but also see my considerations there.
>
> The idea of the fallback SA is that you always have at least one child
> SA guaranteed to be up that can encrypt and send a packet. It can be
> installed to not be per-CPU. It's a guarantee that you will never need
> to wait (and cache?) 1 RTT's time worth of packets, which can be a lot
> of packets. You don't want dynamic resteering. Just have the fallback
> SA "be ready" in case there is no per-cpu SA.

The drawback of the Fallback SA is that it needs special processing.
Normally we delete SAs when they have been idle for a long time to conserve
resources, but the draft says this must not be done with the Fallback SA.

> > I think it depends. I'd like to see optimization efforts influence the
> > protocol as little as possible. Ideally this should be a local matter
> > for implementations. This would allow them to interoperate with
> > unsupporting implementations (and even to benefit from multiple SAs in
> > these situations).
>
> Those that don't support this don't see notifies? Or do you mean to
> somehow install multiple SA's for the same thing on "unsupported"
> systems?

Yes. The idea is that if one peer supports per-CPU SAs and the other
doesn't, they would still be able to communicate and have multiple SAs.
For example, if the supporting system has several weak CPUs while the
unsupporting one has a much more powerful CPU, then multiple SAs will help
to improve performance: the supporting system will distribute load across
its weak CPUs, while for the unsupporting one the load will be small enough
even for a single CPU.

> The problem currently is that when an identical child SA is successfully
> negotiated, implementations differ on what they do.
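The "special processing" objection can be made concrete with a sketch (hypothetical names and layout, not from the draft): an idle-SA reaper must carry an explicit exemption for the Fallback SA, since that SA has to stay up even when idle so one SA is always ready to encrypt.

```python
# Hypothetical sketch: a periodic idle-SA sweep that skips the Fallback SA.
from dataclasses import dataclass

@dataclass
class ChildSA:
    spi: int
    last_used: float      # timestamp of last traffic on this SA
    is_fallback: bool = False

def reap_idle(sas, idle_timeout: float, now: float):
    """Return the SAs that survive an idle sweep."""
    keep = []
    for sa in sas:
        if sa.is_fallback:
            keep.append(sa)            # never delete the Fallback SA
        elif now - sa.last_used < idle_timeout:
            keep.append(sa)            # recently used, keep it
        # else: idle per-CPU SA, dropped to conserve resources
    return keep

now = 1000.0
sas = [ChildSA(1, now - 600, is_fallback=True),   # idle Fallback SA: kept
       ChildSA(2, now - 600),                     # idle per-CPU SA: reaped
       ChildSA(3, now - 10)]                      # busy per-CPU SA: kept
print([sa.spi for sa in reap_idle(sas, idle_timeout=300, now=now)])  # [1, 3]
```

Without the `is_fallback` branch this is an ordinary idle sweep; the branch is exactly the extra code path Valery calls "special processing".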
> Some allow this, some delete the older one. The goal of this draft
> is to make the desire for multiple identical child SAs very explicit.

RFC 7296 explicitly allows multiple Child SAs with identical selectors, so
if implementations immediately delete them, they are either broken or have
reasons to do so (e.g. they have no resources).

Regards,
Valery.

> Paul

_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec