I am a bit sceptical about the draft, as it appears to solve something that doesn't have to be such a huge problem by introducing a new exchange.

First, the ESP sequence number sync. On failover, the node that takes over should simply increment the sequence number by a large enough amount; the anti-replay window can always move forward. Implementations should also sync sequence numbers within the cluster periodically (e.g. every 2-5 seconds) to keep the numbers close enough across nodes. I see no reason to sync this information between peers. It will never work perfectly anyway (there is too much traffic). The more frequently you sync the sequence numbers in the cluster, the smaller the increment needs to be (calculate it from the expected pps).
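To make the "calculate expected pps" part concrete, here is a rough sketch of how the increment could be sized. The function names, the safety factor, and the assumption of plain 32-bit sequence numbers (no ESN) are mine, purely for illustration:

```python
import math


def failover_seq_increment(peak_pps: float,
                           sync_interval_s: float,
                           safety_factor: float = 2.0) -> int:
    """Bound on packets sent since the last cluster sync, padded by a margin.

    peak_pps, sync_interval_s and safety_factor are illustrative parameters.
    """
    if peak_pps < 0 or sync_interval_s <= 0 or safety_factor < 1:
        raise ValueError("invalid sizing parameters")
    return math.ceil(peak_pps * sync_interval_s * safety_factor)


def seq_after_failover(last_synced_seq: int,
                       increment: int,
                       seq_max: int = 2**32 - 1) -> int:
    """Sequence number the takeover node starts sending from.

    Assumes 32-bit sequence numbers; the counter must not wrap, so an SA
    that would overflow has to be rekeyed instead.
    """
    new_seq = last_synced_seq + increment
    if new_seq > seq_max:
        raise OverflowError("sequence number space exhausted; rekey the SA")
    return new_seq


if __name__ == "__main__":
    # 100 kpps, synced every 2 seconds, with a 2x margin.
    inc = failover_seq_increment(100_000, 2.0)
    print(inc, seq_after_failover(1_000, inc))
```

Halving the sync interval halves the increment, which is the trade-off described above.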

Second, the message ID problem in failover is a real problem, but isn't the real issue the size of the message window? If everyone did SET_WINDOW_SIZE with a large enough number (like 32), then on failover we could do something like next_message_id += window_size / 2 and be happy. The implementation must, however, ensure it never sends more than the increment (that's why a window size of 1 doesn't work to begin with). Why was the default window size defined as 1 anyway? Is there a reason why this wouldn't work? (SET_WINDOW_SIZE specifically allows us to move the window.)
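A minimal sketch of that idea, under my own reading that "the increment" limits how many requests the active node may send since it last synced its message ID to the cluster. The class and method names are hypothetical, and whether the peer actually accepts the jump is exactly the open question above:

```python
class MessageIdState:
    """Tracks the next IKE request message ID on the active node (illustrative)."""

    def __init__(self, window_size: int, synced_next_id: int = 0):
        if window_size < 2:
            # window_size // 2 would be 0, so nothing could ever be sent.
            raise ValueError("window size must be at least 2 for this scheme")
        self.window_size = window_size
        self.increment = window_size // 2
        self.synced_next_id = synced_next_id  # value the standby nodes know
        self.next_id = synced_next_id

    def can_send(self) -> bool:
        # Never get further ahead of the synced value than the increment.
        return self.next_id - self.synced_next_id < self.increment

    def send_request(self) -> int:
        if not self.can_send():
            raise RuntimeError("sync to the cluster before sending more requests")
        message_id = self.next_id
        self.next_id += 1
        return message_id

    def sync_to_cluster(self) -> int:
        self.synced_next_id = self.next_id
        return self.synced_next_id


def next_id_after_failover(synced_next_id: int, window_size: int) -> int:
    """Message ID the takeover node uses for its first request."""
    return synced_next_id + window_size // 2
```

With a window of 32 the active node can send IDs 0..15 before it must sync, and a node taking over from synced value 0 starts at 16, so it never reuses an ID the failed node may already have sent.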

Any ongoing exchange at the time of the failover can be an issue (rare) but most can be eliminated with careful implementation. For example, the following:

 - Delete IKE SA or IPsec SA
  -> Sync delete to cluster before sending packet to network.  Nodes don't
     actually have to delete the SA, just mark it to be deleted.  This
     applies to both sending delete and receiving delete.

 - Rekey
  -> I don't see this as a problem.  New CHILD_SA or crash recovery solves
     the problem either immediately or relatively quickly.

 It's impossible to make this work perfectly (machines can crash at any
 point), but the important thing is that your implementation can recover
 (support crash recovery, do DPD when oddities occur).
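The delete case in the list above could be sketched roughly as follows. The node model, the state names, and the transmit callback are hypothetical stand-ins, not anything from the draft; the only point is the ordering, i.e. the mark reaches every cluster node before the packet reaches the network:

```python
from enum import Enum
from typing import Callable, Dict, List


class SaState(Enum):
    ACTIVE = "active"
    PENDING_DELETE = "pending_delete"


class ClusterNode:
    """Toy cluster member holding per-SA state keyed by SPI (illustrative)."""

    def __init__(self, name: str):
        self.name = name
        self.sa_state: Dict[int, SaState] = {}


def send_delete(spi: int,
                local: ClusterNode,
                standbys: List[ClusterNode],
                transmit: Callable[[int], None]) -> None:
    # 1. Mark the SA for deletion everywhere; nobody has to delete it yet.
    for node in [local, *standbys]:
        node.sa_state[spi] = SaState.PENDING_DELETE
    # 2. Only now put the delete on the wire.
    transmit(spi)
```

If the active node crashes between steps 1 and 2, the node taking over still knows the SA was on its way out and can finish or retry the delete.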

I don't see much need for a new exchange, though a draft that explains the best ideas for implementing clustering and HA would be nice.

        Pekka
_______________________________________________
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec
