Hi all,

as I promised at the last IETF meeting, here is my review of draft-pwouters-ipsecme-multi-sa-performance. This is not a formal review of the document, but rather some thoughts on how the solution could be simplified. Sorry that it took so long; please consider this an invitation for discussion.
I think that the performance problem is real and the document is a good starting point for solving it. That said, I think that the approach in the draft is a bit too complicated. Since there are implementations of the draft out there, it is possible that I have missed some details related to limitations of kernel internals, but in my opinion the solution can be simplified.

First, I think that the approach of having multiple IPsec SAs with identical selectors, each associated with its own CPU, is the right one for solving the performance problem.

My main problem with the draft is the concept of a "Fallback SA". This SA is treated specially in the draft, which I don't think is necessary. For example, it must always be up so that an outgoing packet can always be sent when a per-CPU SA does not exist. Why can't other existing per-CPU SAs be used for this purpose?

Another thing that I think is unnecessary is the CPU_QUEUES notify. Apart from indicating that the SA is a Fallback SA (which I think is not needed), I cannot see any use for this notify. In particular, it contains the minimum number of parallel SAs the peer wants to have, so for the receiving side this information makes no difference. For example, I want 3, you want 5, we end up with 5 (as the bigger), but in fact I can always send TS_MAX_QUEUE after creating 3. It makes no difference what values are indicated; they can be arbitrary.

I'm also not convinced that CPU_QUEUE_INFO is really needed; it mostly exists for debugging purposes (again, if we get rid of the Fallback SA). And I don't think we need a new error notify TS_MAX_QUEUE; I believe TS_UNACCEPTABLE can be used instead.

So, in my understanding, the architecture for a multi-SA protocol could be as follows. An IPsec endpoint supporting this feature has several CPUs (or several cores, but let's call them CPUs).
There is some mechanism that dispatches outgoing packets to different CPUs in some fashion (randomly, round-robin, or using some affinity). There is also some mechanism that deterministically dispatches incoming ESP packets to CPUs based on some information from the packets, most probably the SPI.

With these kernel features in mind, the following IPsec architecture could be used. The SPD is global for all CPUs, while the SAD is split so that each CPU has its own SAD. We also need to introduce a special entry in the SAD, a "stub SA", that consists only of a selector and has no associated SA.

When there is an outgoing packet on the initiator, it is handled by one of the CPUs. This CPU checks its own SAD and finds no SA that matches the packet selector, so it checks the SPD and finds a rule saying that packets with this selector must be protected with ESP. This CPU then asks IKE to create the ESP SA. IKE performs the needed actions and as a result creates a pair of ESP SAs. These are all the usual actions, with no deviation from an ordinary IPsec implementation. The difference is that IKE then installs this pair of SAs only in the SAD of the CPU that requested their creation. Note that the SPI for the incoming ESP SA should be selected in such a way that the mechanism steering incoming packets steers this SPI to the CPU on which the SA is installed. The SADs of all the other CPUs are populated with a stub SA entry that has the same selector and a pointer to the CPU that has the real SA installed. The same actions are performed by the responder.

At this point we have a pair of SAs associated with one CPU on both the initiator and the responder. If one of the next outgoing packets is handled by a different CPU, this CPU will first check whether an appropriate SA exists in its SAD and will find the stub SA entry with a matching selector.
Since there is no SA, this CPU asks IKE to create one for it. Meanwhile, the packet that triggered this action can be 1) dropped, 2) held until the SA is ready, or 3) re-steered to the CPU that already has an appropriate SA (it is indicated in the stub entry). When the new pair of SAs is ready, it is installed on the CPU that triggered its creation, replacing the stub entry. Note that the responder should detect that it already has SAs with the same traffic selector on at least one of its CPUs and should try to install the new SAs on a different CPU. Note also that with stub entries it makes no difference which side initiates the creation of additional SAs - it can be the original responder too.

This way new SAs are created dynamically and treated equally - they all live their own lives: they are re-keyed, or even deleted if they are idle for a long time. When an SA is being deleted, the SADs of all other CPUs should be checked for an SA with the same traffic selector. If there is one, the deleted SA is replaced with a stub entry indicating a CPU that has a real SA. If not (only stubs exist), all stubs are deleted from all SADs.

With this approach there is no need to indicate (or negotiate) the number of parallel SAs: generally, with sufficient traffic both sides will try to populate all their CPU SADs (there may also be some affinity in the steering algorithm). If the party responding in the CREATE_CHILD_SA exchange is clever enough to spread the incoming ESP SAs over its CPUs, then in the ideal case the peers end up with a number of parallel SAs equal to the greater number of CPUs (e.g. if one peer has 6 CPUs and the other has 8, they will create 8 parallel SAs, and the party with 6 CPUs will have duplicate SAs on 2 CPUs).

It is also possible that at some point one of the peers refuses to create additional SAs. It can indicate this by returning a TS_UNACCEPTABLE notification (or NO_ADDITIONAL_SAS, but see below).
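To make the SAD bookkeeping described so far more concrete, here is a rough Python sketch. It is only an illustration under stated assumptions, not code from the draft or from any implementation: the names (PerCpuSads, RealSA, StubSA, steer, pick_spi) are hypothetical, a traffic selector is reduced to a string, and the steering rule "SPI modulo number of CPUs" is assumed purely to have something deterministic to test against. Re-pointing stubs that referred to a deleted SA is also my assumption; the text above only says that a stub indicates a CPU that has a real SA.

```python
import secrets
from dataclasses import dataclass
from typing import Dict, List, Optional, Union

Selector = str  # hypothetical: a traffic selector reduced to a string key


@dataclass
class RealSA:
    spi: int


@dataclass
class StubSA:
    owner_cpu: int  # a CPU that holds a real SA for this selector


Entry = Union[RealSA, StubSA]


def steer(spi: int, num_cpus: int) -> int:
    """ASSUMED steering rule for incoming ESP packets: SPI modulo CPU count."""
    return spi % num_cpus


def pick_spi(cpu: int, num_cpus: int) -> int:
    """Draw random 32-bit SPIs until one steers to `cpu` (values below 256 are reserved)."""
    while True:
        spi = secrets.randbits(32)
        if spi >= 256 and steer(spi, num_cpus) == cpu:
            return spi


class PerCpuSads:
    """One SAD per CPU; entries are keyed by traffic selector."""

    def __init__(self, num_cpus: int) -> None:
        self.sads: List[Dict[Selector, Entry]] = [{} for _ in range(num_cpus)]

    def lookup(self, cpu: int, sel: Selector) -> Optional[Entry]:
        return self.sads[cpu].get(sel)

    def install(self, cpu: int, sel: Selector, spi: int) -> None:
        """Install a real SA on `cpu`; CPUs with no entry for `sel` get a stub."""
        self.sads[cpu][sel] = RealSA(spi)
        for other, sad in enumerate(self.sads):
            if other != cpu and sel not in sad:
                sad[sel] = StubSA(owner_cpu=cpu)

    def delete(self, cpu: int, sel: Selector) -> None:
        """Delete the real SA on `cpu` (caller ensures it has one)."""
        owners = [c for c, sad in enumerate(self.sads)
                  if c != cpu and isinstance(sad.get(sel), RealSA)]
        if owners:
            # Another CPU still has a real SA: leave a stub behind and
            # re-point stubs that referred to the deleted SA (assumption).
            self.sads[cpu][sel] = StubSA(owner_cpu=owners[0])
            for sad in self.sads:
                entry = sad.get(sel)
                if isinstance(entry, StubSA) and entry.owner_cpu == cpu:
                    entry.owner_cpu = owners[0]
        else:
            # Only stubs remain: remove the selector from every SAD.
            for sad in self.sads:
                sad.pop(sel, None)
```

For example, with three CPUs, installing an SA on CPU 0 leaves stubs pointing to CPU 0 on CPUs 1 and 2; a later install on CPU 1 replaces only that CPU's stub, and deleting the last real SA removes the selector from all three SADs.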
If the peer returns TS_UNACCEPTABLE for a request, the corresponding CPU SAD on the initiator will not be populated with an SA. Note that this SAD contains a stub entry (since it was the stub that triggered creation of the additional SA). In this case the stub entry is retained, but a flag is set in it indicating that no further requests for creating an ESP SA should be made for this stub. Instead, any future packet matching this stub entry that is handled by this CPU should be re-steered to a different CPU that has a real SA (this CPU is indicated in the stub).

With this approach no new notifications are needed, so it is fully compatible with implementations that do not support it (with the difference that such implementations will most probably send NO_ADDITIONAL_SAS when they are unable to install more SAs). Of course, since such implementations will not spread multiple SAs evenly over their CPUs, the resulting configuration would be sub-optimal, but it would still provide some performance improvement.

Any opinions?

Regards,
Valery.

_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec