Re: [IPsec] AD Review of draft-ietf-ipsecme-rfc8229bis-05

2022-04-27 Thread Valery Smyslov
Hi Roman,

Thank you for the review. Please see comments inline.

> Hi!
> 
> I performed an AD review of draft-ietf-ipsecme-rfc8229bis-05.  Thanks for
> revising RFC8229 with this new guidance.  Comments are below:
> 
> ** The abstract notes that many of the document updates came from
> deployment experience.  I'm hoping to incorporate that feedback on a
> particular issue.  There are a number of places in this document where
> qualitative recommendations are made about various network stack timers.
> Can quantitative recommendations be made in any of the following:

Traditionally, IPsec specifications contain very few quantitative
recommendations concerning timings. This is due to the belief that
concrete timeouts don't affect interoperability. Instead, some very
generic recommendations are usually given.
See for example Section 2.4 of RFC 7296:

   The number of retries and length of timeouts are not covered in this
   specification because they do not affect interoperability.  It is
   suggested that messages be retransmitted at least a dozen times over
   a period of at least several minutes before giving up on an SA, but
   different environments may require different rules.

> -- Section 7.1  "If the TCP connection is no more associated with any active 
> IKE SA, the TCP Responder MAY
> close the connection to clean up resources if TCP Originator didn't close it 
> within some reasonable period of
> time."

  I don't think we should prescribe a concrete time to wait (since it is
  the Responder's decision when to free up its resources),
  but we can add a recommendation. How about:

If the TCP connection is no more
associated with any active IKE SA, the TCP Responder MAY close the
connection to clean up resources if TCP Originator didn't close it
within some reasonable period of time (e.g. a few seconds).

The reason for keeping the orphan TCP connection for a short time is to
allow the Initiator to re-use it where possible. For example, if the
Responder returns an error notification and deletes the IKE SA, but the
Initiator is able to recover (e.g. after a COOKIE request or
INVALID_KE_PAYLOAD), then if the Responder immediately closes the TCP
connection, the Initiator will have to re-establish it, thus wasting
2 RTTs.
So, this is just an optimization; nothing fatal happens if the Responder
closes the orphan TCP connection immediately.
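
To illustrate the grace period described above, here is a minimal
sketch of how a Responder might track orphan TCP connections. The class
name, method names and the 3-second default are assumptions made for
illustration only; neither the draft nor this thread defines them.

```python
class OrphanTracker:
    """Tracks TCP connections that no longer carry any active IKE SA.

    Hypothetical helper: grace_seconds=3.0 is only one reading of
    "a few seconds" and is not a value taken from the draft.
    """

    def __init__(self, grace_seconds=3.0):
        self.grace_seconds = grace_seconds
        self._orphaned_at = {}  # connection id -> time it became orphaned

    def mark_orphaned(self, conn_id, now):
        # Keep the earliest timestamp if the connection is already orphaned.
        self._orphaned_at.setdefault(conn_id, now)

    def mark_in_use(self, conn_id):
        # The Initiator re-used the connection (e.g. after a COOKIE request).
        self._orphaned_at.pop(conn_id, None)

    def expired(self, now):
        # Connections the Responder MAY now close to free resources.
        return sorted(
            conn_id
            for conn_id, since in self._orphaned_at.items()
            if now - since >= self.grace_seconds
        )
```

A Responder that prefers to free resources at once could use
grace_seconds=0, which matches the "close immediately" behaviour that
is also acceptable.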

> -- Section 7.4. "In particular, it is advised that the Initiator should not 
> act immediately after receiving error
> notification and should instead wait some time for valid response, ..."

  This text is just a repetition of what RFC 7296 contains (Section 2.21.1).
  This specification recommends not following RFC 7296 in this situation
  and instead acting immediately when an error notification is received.

> -- Section 8.1. "If no response is received within a certain period of time 
> after several retransmissions ..."

  It's hard to give any concrete recommendations here. If the initiator
  switches to TCP too quickly, it may end up with TCP transport while UDP
  is available on this path. This is suboptimal.
  On the other hand, if it waits too long before switching to TCP in a
  situation where UDP doesn't work, the connection outage becomes longer.
  How about adding the following sentence:

The value of the timeout and the number of retransmissions may vary
depending on the initiator's configuration, but it is expected that
initiators would try to get a response over UDP for at least half a
minute, sending at least a dozen retransmissions, before switching to TCP.

  What do WG members think about these values?
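
To make the proposed thresholds concrete, here is a minimal sketch of
the decision they imply. The class and field names are hypothetical;
only the two default values (half a minute, a dozen retransmissions)
come from the sentence proposed above.

```python
from dataclasses import dataclass


@dataclass
class UdpFallbackPolicy:
    """Hypothetical policy object; both thresholds are configurable."""

    min_wait_seconds: float = 30.0   # "at least half a minute"
    min_retransmissions: int = 12    # "at least a dozen retransmissions"

    def should_switch_to_tcp(self, elapsed_seconds, retransmissions_sent):
        # Both conditions must hold before UDP is given up on.
        return (
            elapsed_seconds >= self.min_wait_seconds
            and retransmissions_sent >= self.min_retransmissions
        )
```

For example, UdpFallbackPolicy().should_switch_to_tcp(31, 5) is False,
because too few retransmissions have been sent even though half a
minute has passed.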

> -- Section 8.4. "For the client, the cluster failover event may remain 
> undetected for long time  if it has no IKE
> or ESP traffic to send. "

  Hm, I'm a bit confused about what quantitative guidance you want to see
  here. It is just a statement that as long as no traffic is sent from the
  client to the cluster, the client will not learn that the failover has
  taken place (in the case of TCP).

> -- Section 8.4.  "if support for High Availability in IKEv2 is negotiated
> and TCP transport is used, a client that is a TCP Originator  SHOULD
> periodically  send IKEv2 messages (e.g. by initiating liveness check
> exchange) whenever there is no IKEv2 or ESP traffic."

  Again, it's hard to give concrete recommendations. It all depends on the
  client's policy.
  If it wants to minimize the delay in detecting a cluster failover, it
  would send liveness check messages more frequently. On the other hand,
  if it wants to save resources, it would send them less frequently.
  I don't think any "one size fits all" recommendation can be given.

> The only place I found quantitative guidance was in Section 7.3.1.
> 
> ** Section 6.1.  Editorial.  s/with new Initiators's SPI/with the Initiator's 
> new SPI/
> 
> ** Section 7.1  Editorial.
> 
> OLD
> If the TCP connection

Re: [IPsec] AD Review of draft-ietf-ipsecme-rfc8229bis-05

2022-04-27 Thread Tero Kivinen
Valery Smyslov writes:
> Traditionally, IPsec specifications contain very few quantitatives
> concerning various timings. This is due to the belief that concrete
> timeouts don't affect interoperability. Instead, some very generic
> recommendations are usually given. See for example Section 2.4 of
> RFC 7296:
> 
>The number of retries and length of timeouts are not covered in this
>specification because they do not affect interoperability.  It is
>suggested that messages be retransmitted at least a dozen times over
>a period of at least several minutes before giving up on an SA, but
>different environments may require different rules.

The reason why we do not give quantitative timings is that they differ
so much depending on the environment.

In a site-to-site VPN, an initial retransmission timer of 500 ms,
doubling from there up to 4 seconds, and then timing out after 30
seconds is fine. Road warrior VPN users in a hotel network, where the
first packet exchange might actually trigger a login to the hotel
network or something similar requiring a password, need much longer
timers.
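
As an illustration of the site-to-site timers just described, here is a
minimal sketch that computes when retransmissions would be sent. It
assumes that the 30-second limit is measured from the first
transmission and that the interval stays at 4 seconds once it reaches
that cap; the function and parameter names are hypothetical.

```python
def retransmission_schedule(initial=0.5, cap=4.0, give_up_after=30.0):
    """Return retransmission times in seconds since the first send.

    The interval starts at `initial`, doubles after each retransmission
    up to `cap`, and the exchange is abandoned at `give_up_after`.
    """
    times = []
    elapsed = 0.0
    interval = initial
    while elapsed + interval < give_up_after:
        elapsed += interval
        times.append(elapsed)
        interval = min(interval * 2, cap)
    return times


if __name__ == "__main__":
    # With the defaults: 0.5, 1.5, 3.5, 7.5, then every 4 s until 27.5.
    print(retransmission_schedule())
```

Under these assumptions the defaults yield 9 retransmissions before the
exchange times out; the longer-timer cases discussed in this message
can be explored by passing larger values.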

Using EAP, where users might actually need to type in passwords, PIN
codes or similar, requires even longer timers (usually 10 times longer
than for site-to-site VPN cases).

When using MOBIKE, the timers needed are also much longer, as it
usually takes some time to detect that packets do not go through using
the old interface, and in some cases switching to another interface
(e.g. 4G or similar) might require bringing that interface up, and
only after that can it even try to connect through the new interface.

In implementations using MOBIKE, the timers are also about 10 times
longer than when using a site-to-site VPN (i.e., the exchange will
expire only after 5 minutes or so).

>   What WG members think about these values?

I would still leave out the values, and just use vague
recommendations. That makes implementors actually think about which
kind of environment the implementation is aimed at and adjust the
timers accordingly (and usually also make those timers configurable,
i.e., the implementor usually just sets the defaults).

> > ** Section 7.3
> >*  the exchange Responder SHOULD NOT request a Cookie, with the
> >   exception of Puzzles or in rare cases like preventing TCP Sequence
> >   Number attacks.
> > 
> > I'm having trouble following this guidance. Is this saying "you
> > SHOULD NOT send IKEv2 Cookies without Puzzles?". If so, is this
> > the intent:
> 
>   Yes.

Actually, the RFC 8019 puzzles require that IKEv2 Cookies are used, as
the puzzle solution is a PRF calculation over the IKEv2 cookie.
> 
> > ** Section 8.4.
> > This document makes the following recommendation: if support for High
> >Availability in IKEv2 is negotiated and TCP transport is used, a
> >client that is a TCP Originator SHOULD periodically send IKEv2
> >messages (e.g. by initiating liveness check exchange) whenever there
> >is no IKEv2 or ESP traffic.  This differs from the recommendations
> >given in Section 2.4 of [RFC7296] in the following: the liveness
> >check should be periodically performed even if the client has nothing
> >to send over ESP.  The frequency of sending such messages should be
> >high enough to allow quick detection and restoring of broken TCP
> >connection.
> > 
> > -- Due to the change in behavior being suggested to RFC7296, did
> > the WG discuss this document formally 
> > updating it (RFC7296)?
> 
>   I don't think this is needed. The new recommendations are only
>   applicable in the context of TCP encapsulation, so all old
>   implementations remain compliant with RFC 7296.

I do not think we had this discussion in the working group, but I
agree with Valery: I do not think this document updates RFC 7296
in any way that would affect implementations not using TCP
encapsulation. Thus, if someone implements RFC 7296 and does not
implement TCP, they do not need to read this document, as it does not
offer anything useful for them.
-- 
kivi...@iki.fi
