Re: [IPsec] Genart last call review of draft-ietf-ipsecme-rfc8229bis-06

2022-06-01 Thread Valery Smyslov
Hi Reese,

thank you for your review! Please find my comments below.

> -Original Message-
> From: Reese Enghardt via Datatracker [mailto:nore...@ietf.org]
> Sent: Tuesday, May 31, 2022 9:21 PM
> To: gen-...@ietf.org
> Cc: draft-ietf-ipsecme-rfc8229bis@ietf.org; ipsec@ietf.org; 
> last-c...@ietf.org
> Subject: Genart last call review of draft-ietf-ipsecme-rfc8229bis-06
> 
> Reviewer: Reese Enghardt
> Review result: Ready with Nits
> 
> I am the assigned Gen-ART reviewer for this draft. The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair.  Please treat these comments just
> like any other last call comments.
> 
> For more information, please see the FAQ at
> 
> .
> 
> Document: draft-ietf-ipsecme-rfc8229bis-06
> Reviewer: Reese Enghardt
> Review Date: 2022-05-31
> IETF LC End Date: 2022-05-31
> IESG Telechat date: Not scheduled for a telechat
> 
> Summary: This document is well-written, clear, and concise. It is basically
> ready for publication, though I found a few nits, which could be fixed to make
> it even easier to read.
> 
> Major issues: None.
> 
> Minor issues:
> 
> Section 4:
> 
> "The maximum message length is used as the effective MTU for connections that
> are being encrypted using ESP, so the maximum message length will influence
> characteristics of inner connections, such as the TCP Maximum Segment Size
> (MSS)."
> 
> Here, it may be worth defining "inner connection" to make this paragraph 
> easier
> to understand - the "inner connection" is currently explained only later in
> Section 10.1.

How about:

The maximum message length is used as the effective MTU 
for connections that are being encrypted using ESP, 
so the maximum message length will influence characteristics of these
connections, such as the TCP Maximum Segment Size (MSS).

> Section 9:
> 
> "TCP encapsulation of IKE should therefore use standard TCP behaviors to avoid
> being dropped by middleboxes." I understand the use of "standard TCP" in a
> colloquial sense, e.g., not using any TCP extensions which might be blocked by
> middleboxes, even if the extensions are standardized in an RFC. Still, I 
> wonder
> if a different phrasing here would be clearer.

Definitely, the "standard TCP behaviors" here does not mean that TCP extensions 
should not be used. How about the following text:

TCP encapsulation of IKE should therefore use common TCP behaviors to
avoid being dropped by middleboxes.

> Nits/editorial comments:
> 
> "There is a trade-off in choosing the period of time after which TCP 
> connection
> is closed" -> "the TCP connection is closed"

Fixed.

> "With TCP encapsulation this is generally not possible, because TCP/IP stack
> manages DF bit in the outer IP header" -> "the TCP/IP stack"

Fixed.

> "The frequency of sending such messages should be high enough to allow quick
> detection and restoring of broken TCP connection." -> "connections"

Fixed.

Thank you!

Regards,
Valery.

___
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec


Re: [IPsec] Secdir last call review of draft-ietf-ipsecme-rfc8229bis-06

2022-06-01 Thread Tero Kivinen
Valery Smyslov writes:
> Thinking a bit more about the problem:
> - IPsec stream consists mostly of random data (as result of
> encryption) with uniform distribution 
> - if an attacker manages to overwrite a single Length field, then
> the beginning of the next packet (including the next Length field) 
> will be this random data
> 
> So:
> - with probability 1-1/2^32 all subsequent packets will be treated
> as ESP packets, and not as IKE packets (the probability for 
> non-ESP marker to be non-zero)
> - the "SPIs "of these "ESP" packets will be random, so unless the
> receiver has very large number of active ESP SAs (say > 2^31), 
> these "SPIs" will be unknown to the receiver
> 
> So, it's enough to check whether the SPI of the received ESP packet is valid.
> This is a simple check that will work in most cases. If an SPI is invalid,
> the receiver should immediately reset TCP connection. I think that 
> receiving a single invalid SPI is enough reason to do this, since
> this should never happen in normal conditions.

The draft already covers most of these cases in section 7.1:

 If either the TCP Originator or TCP
   Responder detects corruption on a connection that was started with a
   valid stream prefix, it SHOULD close the TCP connection.  The
   connection can be determined to be corrupted if there are too many
   subsequent messages that cannot be parsed as valid IKE messages or
   ESP messages with known SPIs, or if the authentication check for an
   ESP message with a known SPI fails.  Implementations SHOULD NOT tear
   down a connection if only a single ESP message has an unknown SPI,
   since the SPI databases may be momentarily out of sync.

We might want to expand the "detects corruption" a bit more, i.e., add
explict sentence about length fields getting messed up and need for
resync causing this corruption.

Also as was pointed out by you in the other parts of the thread, there
is a way to resync, by searching known spi + sequence number in window
from the stream (this gives us about 56 bits to check against). After
we find such candidate framing, we can check the length before, and
validate ICV. If ICV matches, we know this is valid packet, and we can
resync, of not, we can either search again, or simply close the TCP
stream. 

On the other hand I do not think resync is needed in practice, but I
do not want to forbid by mandating that peer must tear down TCP
immediately when it receives unknown SPI, or has mismach between IKE
SA header length field and TCP stream length fields.

So I think we should add text explaining that implementation can try
to resync by by searching valid length + SPI + Sequence number + ICV
combination from the stream, or it can simply restart the TCP stream
(if it is TCP Originator, or it can close the TCP stream if it is TCP
Responder).

Then we can add text in security consideration section explaining this
bit more (you already had candidate text for that).
-- 
kivi...@iki.fi

___
IPsec mailing list
IPsec@ietf.org
https://www.ietf.org/mailman/listinfo/ipsec


Re: [IPsec] Secdir last call review of draft-ietf-ipsecme-rfc8229bis-06

2022-06-01 Thread Valery Smyslov
Hi Tero,

> > Thinking a bit more about the problem:
> > - IPsec stream consists mostly of random data (as result of
> > encryption) with uniform distribution
> > - if an attacker manages to overwrite a single Length field, then
> > the beginning of the next packet (including the next Length field)
> > will be this random data
> >
> > So:
> > - with probability 1-1/2^32 all subsequent packets will be treated
> > as ESP packets, and not as IKE packets (the probability for
> > non-ESP marker to be non-zero)
> > - the "SPIs "of these "ESP" packets will be random, so unless the
> > receiver has very large number of active ESP SAs (say > 2^31),
> > these "SPIs" will be unknown to the receiver
> >
> > So, it's enough to check whether the SPI of the received ESP packet is 
> > valid.
> > This is a simple check that will work in most cases. If an SPI is invalid,
> > the receiver should immediately reset TCP connection. I think that
> > receiving a single invalid SPI is enough reason to do this, since
> > this should never happen in normal conditions.
> 
> The draft already covers most of these cases in section 7.1:
> 
>If either the TCP Originator or TCP
>Responder detects corruption on a connection that was started with a
>valid stream prefix, it SHOULD close the TCP connection.  The
>connection can be determined to be corrupted if there are too many
>subsequent messages that cannot be parsed as valid IKE messages or
>ESP messages with known SPIs, or if the authentication check for an
>ESP message with a known SPI fails.  Implementations SHOULD NOT tear
>down a connection if only a single ESP message has an unknown SPI,
>since the SPI databases may be momentarily out of sync.
> 
> We might want to expand the "detects corruption" a bit more, i.e., add
> explict sentence about length fields getting messed up and need for
> resync causing this corruption.

I don't think this is needed. The corruption of Length field can only be 
determined
indirectly by receiving invalid SPIs or failures to validate ICV, which are 
already mentioned.

> Also as was pointed out by you in the other parts of the thread, there
> is a way to resync, by searching known spi + sequence number in window
> from the stream (this gives us about 56 bits to check against). After
> we find such candidate framing, we can check the length before, and
> validate ICV. If ICV matches, we know this is valid packet, and we can
> resync, of not, we can either search again, or simply close the TCP
> stream.
> 
> On the other hand I do not think resync is needed in practice, but I
> do not want to forbid by mandating that peer must tear down TCP
> immediately when it receives unknown SPI, or has mismach between IKE
> SA header length field and TCP stream length fields.
> 
> So I think we should add text explaining that implementation can try
> to resync by by searching valid length + SPI + Sequence number + ICV
> combination from the stream, or it can simply restart the TCP stream
> (if it is TCP Originator, or it can close the TCP stream if it is TCP
> Responder).
> 
> Then we can add text in security consideration section explaining this
> bit more (you already had candidate text for that).

I still think that it's better to put the text about resync into the Security 
considerations.
I think this method is specific for Length corruption which most probably
will happen as a result of injection attack. So, the new para in the Security
considerations would look like:

   If an attacker successfully mounts an injection attack on a TCP
   connection encapsulating IPsec traffic, and a Length field in the TCP
   stream is changed as a result of this attack, then the receiver in
   most cases will not be able to correctly identify the boundaries of
   the following packets in the stream, because it will try to parse an
   arbitrary data (the result of encryption) as ESP (or IKE) header.
   This will fail and for this reason all the following packets will be
   dropped.  Eventually the situation should self-recover, but it may
   take several minutes and may result in IKE SA deletion and re-
   creation.  To speed up the recovery implementations are advised to
   follow recommendations in Section 6.1 and to reset TCP connection in
   case incoming packets contain an ESP SPI which doesn't correspond to
   any incoming ESP SA they have.  Once the TCP connection is reset it
   will be re-created by TCP Originator as described in Section 6.1.
   Alternatively, implementations MAY try to re-sync after they received
   unknown SPI by searching the TCP stream for a 64 bit binary vector
   consisting of a combination of some SPI they know (first 32 bits) and
   a valid (E)SN for this SPI (another 32 bits) and try to validate the
   ICV of a packet candidate taking the preceding 16 bits as a Length
   field.  They can also search for 4 bytes of zero (non-ESP marker)
   followed by 128 bits of IKE SPIs of IKE SA assoc

Re: [IPsec] [Last-Call] Secdir last call review of draft-ietf-ipsecme-rfc8229bis-06

2022-06-01 Thread to...@strayalpha.com
Hi, all,

Overall, it seems sufficient to:

- highlight how much more vulnerable this tunnel makes IPsec to DOS attacks
i.e., that single packets can cause the entire connection to need to be 
reset

- indicate that there IS a way to avoid that hit by re-syncing
the sync requires a scan, but that in some cases such a scan can be 
more 
efficient in than starting a new connection, although it does mean that 
such 
an attack may be more likely moving forward (note: restarting a 
connection
might provide useful information to an attacker, where continuing may 
hide
the impact of such an attack too, so resync has cons and pros).

You can also note that there are ways to mitigate the cost of resync 
when
this implementation is tightly coupled with TCP, e.g., by ensuring 
every Nth
IPsec packet starts at the beginning of a new TCP packet.

Joe

—
Dr. Joe Touch, temporal epistemologist
www.strayalpha.com

> On Jun 1, 2022, at 6:15 AM, Valery Smyslov  wrote:
> 
> Hi Tero,
> 
>>> Thinking a bit more about the problem:
>>> - IPsec stream consists mostly of random data (as result of
>>> encryption) with uniform distribution
>>> - if an attacker manages to overwrite a single Length field, then
>>> the beginning of the next packet (including the next Length field)
>>> will be this random data
>>> 
>>> So:
>>> - with probability 1-1/2^32 all subsequent packets will be treated
>>> as ESP packets, and not as IKE packets (the probability for
>>> non-ESP marker to be non-zero)
>>> - the "SPIs "of these "ESP" packets will be random, so unless the
>>> receiver has very large number of active ESP SAs (say > 2^31),
>>> these "SPIs" will be unknown to the receiver
>>> 
>>> So, it's enough to check whether the SPI of the received ESP packet is 
>>> valid.
>>> This is a simple check that will work in most cases. If an SPI is invalid,
>>> the receiver should immediately reset TCP connection. I think that
>>> receiving a single invalid SPI is enough reason to do this, since
>>> this should never happen in normal conditions.
>> 
>> The draft already covers most of these cases in section 7.1:
>> 
>>   If either the TCP Originator or TCP
>>   Responder detects corruption on a connection that was started with a
>>   valid stream prefix, it SHOULD close the TCP connection.  The
>>   connection can be determined to be corrupted if there are too many
>>   subsequent messages that cannot be parsed as valid IKE messages or
>>   ESP messages with known SPIs, or if the authentication check for an
>>   ESP message with a known SPI fails.  Implementations SHOULD NOT tear
>>   down a connection if only a single ESP message has an unknown SPI,
>>   since the SPI databases may be momentarily out of sync.
>> 
>> We might want to expand the "detects corruption" a bit more, i.e., add
>> explict sentence about length fields getting messed up and need for
>> resync causing this corruption.
> 
> I don't think this is needed. The corruption of Length field can only be 
> determined
> indirectly by receiving invalid SPIs or failures to validate ICV, which are 
> already mentioned.
> 
>> Also as was pointed out by you in the other parts of the thread, there
>> is a way to resync, by searching known spi + sequence number in window
>> from the stream (this gives us about 56 bits to check against). After
>> we find such candidate framing, we can check the length before, and
>> validate ICV. If ICV matches, we know this is valid packet, and we can
>> resync, of not, we can either search again, or simply close the TCP
>> stream.
>> 
>> On the other hand I do not think resync is needed in practice, but I
>> do not want to forbid by mandating that peer must tear down TCP
>> immediately when it receives unknown SPI, or has mismach between IKE
>> SA header length field and TCP stream length fields.
>> 
>> So I think we should add text explaining that implementation can try
>> to resync by by searching valid length + SPI + Sequence number + ICV
>> combination from the stream, or it can simply restart the TCP stream
>> (if it is TCP Originator, or it can close the TCP stream if it is TCP
>> Responder).
>> 
>> Then we can add text in security consideration section explaining this
>> bit more (you already had candidate text for that).
> 
> I still think that it's better to put the text about resync into the Security 
> considerations.
> I think this method is specific for Length corruption which most probably
> will happen as a result of injection attack. So, the new para in the Security
> considerations would look like:
> 
>   If an attacker successfully mounts an injection attack on a TCP
>   connection encapsulating IPsec traffic, and a Length field in the TCP
>   stream is changed as a result of this attack, then the receiver in
>   most cases will not be able to correctly identify the boundaries of
>   the following packets in the stream, because it will try to parse an
>   

Re: [IPsec] [secdir] [Last-Call] Secdir last call review of draft-ietf-ipsecme-rfc8229bis-06

2022-06-01 Thread Valery Smyslov
Hi Joe,

 

From: secdir [mailto:secdir-boun...@ietf.org] On Behalf Of to...@strayalpha.com
Sent: Wednesday, June 01, 2022 5:43 PM
To: Valery Smyslov
Cc: draft-ietf-ipsecme-rfc8229bis@ietf.org; sec...@ietf.org; 
ipsec@ietf.org; last-c...@ietf.org
Subject: Re: [secdir] [Last-Call] [IPsec] Secdir last call review of 
draft-ietf-ipsecme-rfc8229bis-06

 

Hi, all,

 

Overall, it seems sufficient to:

 

- highlight how much more vulnerable this tunnel makes IPsec to DOS attacks

  i.e., that single packets can cause the entire connection to need to 
be reset

 

  Already in the draft.

 

- indicate that there IS a way to avoid that hit by re-syncing

  the sync requires a scan, but that in some cases such a scan can be 
more 

  efficient in than starting a new connection, although it does mean 
that such 

  an attack may be more likely moving forward (note: restarting a 
connection

  might provide useful information to an attacker, where continuing may 
hide

  the impact of such an attack too, so resync has cons and pros).

 

  I’ve added some text about the possibility of a resync. 

  About information for an attacker – if we talk about off-the-path 
attacker,

  then in general it is unaware of the results of its attack, it may 

  only guess them indirectly. And since resync will take some time and 

  thus introduce some delay, that may be visible to attacker, I’m not 
sure 

  which method is preferable in this context – everything depends on

  specific conditions for the attacker.

 

  You can also note that there are ways to mitigate the cost of resync 
when

  this implementation is tightly coupled with TCP, e.g., by ensuring 
every Nth

  IPsec packet starts at the beginning of a new TCP packet.

 

  Regards,

  Valery.

 

Joe

 

—

Dr. Joe Touch, temporal epistemologist

www.strayalpha.com





On Jun 1, 2022, at 6:15 AM, Valery Smyslov  wrote:

 

Hi Tero,




Thinking a bit more about the problem:
- IPsec stream consists mostly of random data (as result of
encryption) with uniform distribution
- if an attacker manages to overwrite a single Length field, then
the beginning of the next packet (including the next Length field)
will be this random data

So:
- with probability 1-1/2^32 all subsequent packets will be treated
as ESP packets, and not as IKE packets (the probability for
non-ESP marker to be non-zero)
- the "SPIs "of these "ESP" packets will be random, so unless the
receiver has very large number of active ESP SAs (say > 2^31),
these "SPIs" will be unknown to the receiver

So, it's enough to check whether the SPI of the received ESP packet is valid.
This is a simple check that will work in most cases. If an SPI is invalid,
the receiver should immediately reset TCP connection. I think that
receiving a single invalid SPI is enough reason to do this, since
this should never happen in normal conditions.


The draft already covers most of these cases in section 7.1:

If either the TCP Originator or TCP
  Responder detects corruption on a connection that was started with a
  valid stream prefix, it SHOULD close the TCP connection.  The
  connection can be determined to be corrupted if there are too many
  subsequent messages that cannot be parsed as valid IKE messages or
  ESP messages with known SPIs, or if the authentication check for an
  ESP message with a known SPI fails.  Implementations SHOULD NOT tear
  down a connection if only a single ESP message has an unknown SPI,
  since the SPI databases may be momentarily out of sync.

We might want to expand the "detects corruption" a bit more, i.e., add
explict sentence about length fields getting messed up and need for
resync causing this corruption.


I don't think this is needed. The corruption of Length field can only be 
determined
indirectly by receiving invalid SPIs or failures to validate ICV, which are 
already mentioned.




Also as was pointed out by you in the other parts of the thread, there
is a way to resync, by searching known spi + sequence number in window
from the stream (this gives us about 56 bits to check against). After
we find such candidate framing, we can check the length before, and
validate ICV. If ICV matches, we know this is valid packet, and we can
resync, of not, we can either search again, or simply close the TCP
stream.

On the other hand I do not think resync is needed in practice, but I
do not want to forbid by mandating that peer must tear down TCP
immediately when it receives unknown SPI, or has mismach between IKE
SA header length field and TCP stream length fields.

So I think we should add text explaining that implementation can try
to resync by by searching valid length + SPI + Sequence number + ICV
combination from the stream, or it can simply restart the TCP stream
(if it is TCP Originator, or it can clos