Hi Pekka,

 

Thanks for the comments on the draft.

Please find my comments inline.

 

Regards,

Kalyani

 

-----Original Message-----
From: ipsec-boun...@ietf.org [mailto:ipsec-boun...@ietf.org] On Behalf Of Pekka Riikonen
Sent: Friday, September 03, 2010 2:48 PM
To: y...@checkpoint.com
Cc: ipsec@ietf.org; kivi...@iki.fi
Subject: Re: [IPsec] Comments draft-kagarigi-ipsecme-ikev2-windowsync-04

 

I am a bit sceptical about the draft as it appears to be solving something
that doesn't have to be such a huge problem by introducing a new exchange.

 

First, the ESP sequence number sync.  In case of failover the online node
should simply increment the sequence number by a large enough number;
the anti-replay window can always move forward.  Implementations should
also perform periodic sequence number sync in the cluster (e.g. every 2-5
seconds) to keep the numbers close enough across nodes.  I see no reason to
sync this information between peers.  It will never work perfectly anyway
(there is too much traffic).  The more frequently you sync the sequence
numbers in the cluster, the smaller the increment needs to be (calculate
expected pps).
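
The sizing of that increment can be sketched roughly as follows. This is
only an illustration: the function names, the safety factor and the 32-bit
(non-ESN) sequence space are assumptions made for the example, not anything
taken from the draft.

```python
import math


def failover_seq_increment(expected_pps, sync_interval_s, safety_factor=2.0):
    """Estimate how far to jump the ESP sequence number on failover.

    The jump must cover every packet the failed node could have sent since
    the last cluster sync. safety_factor is a hypothetical margin.
    """
    if expected_pps < 0 or sync_interval_s <= 0 or safety_factor < 1:
        raise ValueError("invalid sizing parameters")
    return math.ceil(expected_pps * sync_interval_s * safety_factor)


def seq_after_failover(last_synced_seq, expected_pps, sync_interval_s,
                       safety_factor=2.0, seq_bits=32):
    """Return the first sequence number the new active node should use."""
    new_seq = last_synced_seq + failover_seq_increment(
        expected_pps, sync_interval_s, safety_factor)
    if new_seq > 2 ** seq_bits - 1:
        # The sequence number must not wrap; the SA has to be rekeyed instead.
        raise OverflowError("sequence space exhausted, rekey the SA")
    return new_seq
```

With the same assumed traffic rate, a 2-second sync interval needs a smaller
jump than a 5-second one, which is the trade-off described above.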

 

Second, the message ID problem in failover is a real problem, but isn't the
problem really the size of the message window?  If everyone did
SET_WINDOW_SIZE with a large enough number (like 32), in failover we could
do something like next_message_id += window_size / 2, and be happy.
Though, the implementation must ensure it never sends more than the increment
(that's why a window size of 1 doesn't work to begin with).  Why was the
window size defined by default to 1 anyway?  Is there a reason why this
wouldn't work? (SET_WINDOW_SIZE specifically allows us to move the
window)
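
A minimal sketch of that idea, for illustration only. The helper names are
hypothetical, and the rule that the active node may send at most
window_size / 2 requests between cluster syncs is one reading of "never
sends more than the increment".

```python
def failover_jump(window_size):
    """Message ID increment applied on failover: half the window."""
    jump = window_size // 2
    if jump < 1:
        # A window size of 1 leaves no room to skip ahead.
        raise ValueError("window size too small for a failover jump")
    return jump


def message_id_after_failover(last_synced_next_id, window_size):
    """Next request message ID to use on the new active node."""
    return last_synced_next_id + failover_jump(window_size)


def may_send_request(unsynced_requests, window_size):
    """True while the active node has sent fewer unsynced requests than the jump."""
    return unsynced_requests < failover_jump(window_size)
```

For example, with a window size of 32 and a last synced next_message_id of
100, the new active node would start at 116.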

 

[KALYANI] We can always use a larger window size and send the new
request from the failover device with an incremented message ID.

But the problem with this approach is that, with windowing, the window
never moves unless all the messages within the window range have been
received. The lost request will therefore never be sent, and eventually
the SA will have to be deleted. This can be avoided if this draft is
implemented.
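
The stall described here can be illustrated with a simplified model of a
receive window. The class and method names are hypothetical, and the model
is only a sketch of the argument, not the exact processing rules of the
IKEv2 specification.

```python
class ResponderWindow:
    """Simplified receive window: the base only advances over contiguous IDs."""

    def __init__(self, window_size, base=0):
        self.window_size = window_size
        self.base = base          # lowest message ID not yet received
        self.received = set()     # IDs received above the base

    def accept(self, message_id):
        """Accept a request if it falls inside the window, else reject it."""
        if not (self.base <= message_id < self.base + self.window_size):
            return False
        self.received.add(message_id)
        # Slide the window past every ID that has now been received in order.
        while self.base in self.received:
            self.received.remove(self.base)
            self.base += 1
        return True


# Request 11 was lost at failover; the new active node skips ahead to 12.
window = ResponderWindow(window_size=4, base=10)
window.accept(10)                                # base moves to 11
results = [window.accept(i) for i in (12, 13, 14, 15)]
```

In this model the requests 12 to 14 are accepted but the base stays at 11,
so request 15 falls outside the window and is rejected until the missing
request 11 arrives.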

 

 

Any ongoing exchange at the time of the failover can be an issue (rare)
but most can be eliminated with careful implementation.  For example, the
following:

 

  - Delete IKE SA or IPSEC SA

   -> Sync delete to cluster before sending packet to network.  Nodes don't
      actually have to delete the SA, just mark it to be deleted.  This
      applies to both sending delete and receiving delete.

 

[KALYANI] If the message to delete the IKE SA is lost, the active and
failover devices would end up out of sync.

 

  - Rekey

   -> I don't see this as a problem.  New CHILD_SA or crash recovery solves
      the problem either immediately or relatively quickly.

 

  It's impossible to make this work perfectly (machines can crash at any
  point), but the important thing is that your implementation can recover
  (support crash recovery, do DPD when oddities occur).

 

[KALYANI] This draft proposes synchronizing message IDs using the IKE SA
that is present on both the failover and peer devices.

If the active member crashes during an IKE SA delete/rekey, the SAs on
the peer and the failover device do not match (that is, the old SA is
present on the failover device and the new SA is present on the peer).
IKE message ID synchronization is not meant to solve such issues.

 

I don't much see need for a new exchange, though a draft that explains 

best ideas for implementing clustering and HA would be nice.

 

      Pekka

_______________________________________________

IPsec mailing list

IPsec@ietf.org

https://www.ietf.org/mailman/listinfo/ipsec
