Re: AD Evaluation Review of draft-ietf-bfd-secure-sequence-numbers-20

Jeffrey Haas Tue, 20 May 2025 09:58:24 -0700

Ketan,

A few targeted responses follow.

> On May 15, 2025, at 9:48 AM, Ketan Talaulikar <[email protected]> wrote:
> 
> 185 4.  Architecture of the Auth Type Method
> 
> 187   When BFD uses authentication, methods using MD5 or SHA1 are CPU
> 188   intensive, and can negatively impact systems with limited
> 189   computational power.
> 
> <minor> Do we want to add that the BFD function has been offloaded to
> hardware or lower powered CPUs or something like that in some router
> implementations? Or, better still move all these discussions to the
> optimizing authentication document since those considerations apply there
> and are not specific to these two new types.

More appropriate: Stop saying they don't have cost and that there isn't
impact. :-)

> 
> 
> 475   There is no negotiation as to when authentication switches from the
> 476   original type, to using Meticulous Keyed ISAAC.  The sender simply
> 477   begins authentication with a relevant Auth-Type, and with the
> 478   Optimized Authentication Mode field set to 1.  When the sender
> 479   switches to using using Meticulous Keyed ISAAC Authentication, it
> 480   sets the Optimized Authentication Mode field to 2, and starts
> 481   performing the ISAAC calculations as described here.
> 
> <major> If the switch to ISAAC is not strictly specified, and implementations
> are free to switch back/forth whenever they desire, does that not open up a
> security hole where an attacker could exploit these transitions themselves or
> cause a transition? Why not mandate that switch to/from happen immediately
> after a state transition? Better still why not all this is specified in the
> optimizing-auth draft and this document only points to that spec? Am I
> missing something?

ISAAC is only suitable for use in BFD with the optimized BFD authentication 
procedures.  The optimizing-authentication document governs when ISAAC can be 
used.

And yes, you're seeing that these documents are deeply entangled.  You're 
asking for conflicting things, but that at least mirrors some of the difficulty 
this document had during the work.

You're asking for the optimized draft to remove ISAAC specific procedures.

You're saying here that there are things you think that document should say 
more about. :-)

The motivation for "why not switch immediately" is covered in the text for 
section 10.2 about how to deal with packet loss.  

Your confusion could be addressed with a forward reference to that procedure 
(that's the appropriate place for the normative procedure).

> 
> 575   Note that this construct requires that the "Your Discriminator" field
> 576   not change during a session.  However, it does allow the "My
> 577   Discriminator" field to change as permitted by RFC5880 Section 6.3
> 578   [RFC5880]
> 
> <major> If My Discriminator is changed, then does that not result in the
> change of "Your Discriminator" for the other end? RFC5880 says the
> implications of discriminator change are outside scope. Perhaps the same can
> be said here or better still make it a prerequisite that there is no
> discriminator change while the session is in UP state (somewhat similar to
> what is indicated for the Seed a couple of paragraphs below)?

Since the point is that it's echoed back, for your discriminator to change it'd 
require the system to choose to send something new.

However, ISAAC is bootstrapped with the Your Discriminator at the time of 
transition.  It's a one-time operation and the value becomes irrelevant once 
ISAAC is initialized.

The fix here is to remove the text above and to note that the PRNG is 
initialized once.  We clearly missed this one-time property during the draft's 
life.

> 650   If, however, the packet's Sequence Number differs from the expected
> 651   value, then the difference "N" indicates how many packets were lost.
> 652   The receiver then has to search through the first "N" Auth Keys
> 653   derived from its calculated ISAAC state in order to find one which
> 654   matches.  If no key matches the Auth Key in the packets, the packet
> 655   is deemed to be inauthentic, and is discarded.
> 
> <major> If the sequence is incrementing monotonically, then why is there a
> need to look at multiple records instead of only the offset between the
> previous and the received number? Perhaps there is something here that I
> have not understood ...

This is about the indexing through the table, which also covers your confusion 
below.

When ISAAC is bootstrapped, it produces a page of numbers.  The exercise we're 
doing here is saying that BFD sequence number N corresponds to index i in the 
table.

However, due to possible packet loss, what BFD sequence number is intended to 
map to the first entry?  This means during this step in the event of loss, it 
may be necessary to search through the first few entries of the table for the 
matching value.  Once a match is located, you have matched the expected 
sequence number to an index in the table.  This is the "entraining" step.

The search space is no more than the total number of BFD packets which can be 
lost before BFD's state machine decides that it is going down.  This is on the 
order of 3 for a Detect Mult of 3, which is a common configuration.
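
As an illustration only, the entraining search described above might be 
sketched as follows.  The function name, the "search_limit" parameter, and the 
stand-in page contents are all hypothetical and are not taken from the draft; 
the 32-bit wrap of the sequence number is assumed from RFC 5880.  A real page 
would come from the seeded ISAAC generator.

```python
from typing import Optional, Sequence, Tuple

MOD32 = 1 << 32  # assumption: BFD sequence numbers are 32-bit and wrap


def entrain(page: Sequence[int], rcv_auth_key: int, rcv_auth_seq: int,
            search_limit: int) -> Optional[Tuple[int, int]]:
    """Look for the received Auth Key in the first search_limit page entries.

    On a match at index i, return (i, base), where base is the received
    sequence number minus i (mod 2**32).  Return None when nothing matches,
    in which case the packet would be discarded.
    """
    limit = min(search_limit, len(page))
    for index in range(limit):
        if page[index] == rcv_auth_key:
            base = (rcv_auth_seq - index) % MOD32
            return index, base
    return None


# Hypothetical stand-in for a page of ISAAC output.
page = [0x1111, 0x2222, 0x3333, 0x4444, 0x5555]

# The first two packets were lost; the third arrives with sequence 1002.
result = entrain(page, rcv_auth_key=0x3333, rcv_auth_seq=1002, search_limit=3)
print(result)  # (2, 1000)
```

Once a match is found, the sequence number is tied to a table index and later 
packets need no search, only the offset from the stored base.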

> 
> 657   If a calculated key at index "I" does match the Auth Key in the
> 
> <major> what is this index "I"?
> 
> 659   this value.  The bfd.MetKeyIsaacRcvAuthBase field is then initialized
> 660   to contain the value of bfd.RcvAuthSeq, minus the value of
> 661   bfd.MetKeyIsaacRcvAuthIndex.  This process allows the pseudo-random
> 662   stream to be re-synchronized in the event of lost packets.
> 
> <question> Isn't it problematic to do this? What if it is an attack? Or,
> likely I may be missing something ...

Hopefully the discussion on the entraining step above helps.

> 
> 669   This document does not make provisions for dealing with the case of
> 670   losing more than 256 packets.  Implementors should limit the value of
> 671   "Detect Multi" to a small number in order to keep the number of lost
> 672   packets within an acceptable limit.
> 
> <major> Isn't this a MUST? Also, this is something for the operator and is
> better if placed in an operational considerations section of its own for
> proper focus. Same goes for other similar considerations.

The BFD Detect Mult is one octet, so up to 256 packets can effectively be lost 
before BFD goes to Down.  Typically, operators do not configure numbers that 
are very high.

It's inappropriate for this document to try to regulate how operators tune the 
Detect Mult for convergence timing.  However, noting that this value has an 
impact on being able to use ISAAC is appropriate.

https://datatracker.ietf.org/doc/html/rfc7419 contains some generalized 
guidance for common timings, but even then it mostly uses the usual 3 for 
Detect Mult.

> 
> 759 12.  Transition away from using ISAAC
> 
> <major> Don't we want to put a reference to 
> [I-D.ietf-bfd-optimizing-authentication]
> to indicate how the switching to/from ISAAC is done?
> 
> 767   Since Meticulous Keyed ISAAC Authentication does not provide for full
> 768   packet integrity checks, it may be desirable for a party to
> 769   periodically use a strong Auth Type.  The switch to a different Auth
> 770   Type can be done at any time during a session.  The different Auth
> 771   Type can signal that the session is still in the Up state.
> 
> <major> This seems somewhat conflicting with the previous section where the
> start of ISAAC auth mode was not immediately after a state change or the use
> of strong Auth Type. What is stated here is much more desirable than doing a
> switch where there is no auth mode enabled.

ISAAC can't be used at the start of a BFD session without entraining.

Any BFD auth mode that includes meticulous sequence numbering can be leveraged 
to entrain ISAAC.  This includes the NULL Auth Type that is defined in BFD 
stability.

Flipping from ISAAC to any other auth mode currently specified can be done as 
long as it's done between consensual parties.  RFC 5880 largely says "you 
change once", largely because the considerations on the impact to the session 
and the lack of graceful key rollover are not dealt with in the base 
specification.

What is intended here is to permit the appropriate room in the ISAAC portion of 
the document to enable the strong authentication mode checks in the optimizing 
document.

Here, again, we have conflicting goals of "split out ISAAC from optimized auth" 
and "justify this procedure because optimized auth needs it".

> 
> 778   The nature of Meticulous Keyed ISAAC Authentication means that there
> 779   is no issue with this switch, so long as it is for a small number of
> 780   packets.  From the point of view of the Meticulous Keyed ISAAC state
> 781   machine, this switch can be handled similarly to a lost packet.  The
> 782   state machine simply notices that instead of Sequence Number value
> 783   being one more than the last value used for ISAAC, it is larger by
> 784   two.  The ISAAC state machine then calculates the index into the
> 785   current "page", and uses the found number to validate (or send) the
> 786   Auth Key.
> 
> <major> I find this strange and one is expected to retain the ISAAC pages even
> when the auth mode changes. Based on previous section, the receiver
> initializes when it receives the first packet with the ISAAC auth mode. Am I
> missing something? The clean start also enables one to change the seed,
> discriminator, etc.

You saw in prior sections that you only initialize the state machine when you 
transition to Up.

Once you're Up, the mechanism to index into an ISAAC table depends on the BFD 
sequence numbers that are kept identical between the differing modes.

What this means is that if you switch out of ISAAC, in order to keep valid 
tables, you have to keep a table that contains the relevant index.  This means 
you need to keep the possible "current page" around even if you're not using 
ISAAC at the moment so you can return to it.

To do otherwise would require rewriting the entire procedure to allow reseeding 
the PRNG from scratch for a session that is already up.  There is insufficient 
information in the protocol to permit that.
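
A rough sketch of why the page has to be kept: the index into the current page 
is derived purely from the BFD sequence number and the stored base, so packets 
sent with another Auth Type simply consume sequence numbers without consuming 
a lookup.  The names below are hypothetical, and the page size of 256 entries 
is an assumption based on ISAAC producing 256 32-bit values per round.

```python
from typing import Optional

MOD32 = 1 << 32   # assumption: 32-bit wrapping BFD sequence numbers
PAGE_SIZE = 256   # assumption: one ISAAC round yields 256 values


def page_index(seq: int, base: int) -> Optional[int]:
    """Map a BFD sequence number to an index in the current page.

    Returns None when the sequence number falls outside the current page;
    moving to a following page is out of scope for this sketch.
    """
    offset = (seq - base) % MOD32
    return offset if offset < PAGE_SIZE else None


base = 1000

# ISAAC in use at sequence 1004, then two packets (1005, 1006) sent with a
# different Auth Type, then ISAAC resumes at 1007.
print(page_index(1004, base))  # 4
print(page_index(1007, base))  # 7
```

From the table's point of view, the two packets sent under the other Auth Type 
look the same as two lost packets: the index just skips ahead.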

> 
> 
> <major> I am not sure why this complication is needed with creating this IANA
> dependency with the optimizing authentication draft. Please just ask for
> allocating the code points 7 and 8 for the two new ISAAC auth types in this 
> document. What would be the issue with doing so?

See prior comments about YANG motivations in the other document.

> 
> 818   The security of this proposal depends strongly on ISAAC.  This
> 819   generator has been analyzed for almost three decades, and has not
> 820   been broken.  Research shows that there are few other CSRNGs which
> 821   are as simple and as fast as ISAAC.  For example, many other
> 822   generators are based on AES, which is infeasibe for resource
> 823   constrained systems.
> 
> <minor> perhaps provide some BFD offload context here?

No magic panacea is available here.

> 
> 
> 838   The Auth Type method defined here allows the BFD end-points to detect
> 839   a malicious packet, as the calculated hash value will not match the
> 840   value found in the packet.  The behavior of the session, when such a
> 841   packet is detected, is based on the implementation.  A flood of such
> 842   malicious packets may cause a BFD session to be operationally down.
> 
> <major> Could you please elaborate on why this would be the case? And why the
> behavior is based on the implementation when this spec says that those
> packets are to be discarded?

Tersely, BFD says "you get to miss N packets and then it goes down".

Volumetric attacks may cause packet loss.

Lose the wrong packets, the session drops.

This is also one of the fundamental issues with stronger cryptographic ciphers 
for BFD and other datagram based services using such authentication.  The 
cryptographic validation operation is, at a lower scale than a similar 
volumetric attack, an attack on the receiver.  You can spew nonsense at a 
receiver, and it has to do expensive cryptography.

The validation procedures for BFD packets on receipt provide a mitigation to 
such attacks.

BFD sessions not in the Up state need to rely on transport security (e.g. GTSM) 
and rate limiting to mitigate the cryptographic attacks.
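
The "miss N packets and then it goes down" behaviour mentioned above can be 
approximated with a small sketch.  This is a simplification: real BFD 
detection is time-based (Detect Mult times the negotiated interval), whereas 
the hypothetical function below just counts consecutive intervals in which no 
valid packet arrived.  A packet that is lost, or discarded as inauthentic, 
counts as not arrived.

```python
from typing import Iterable, Optional


def first_down_interval(arrivals: Iterable[bool],
                        detect_mult: int) -> Optional[int]:
    """Return the index of the interval at which the session would go Down.

    arrivals[i] is True when a valid packet arrived in interval i.  The
    session is treated as Down once detect_mult consecutive intervals pass
    without a valid packet.  Returns None if that never happens.
    """
    missed = 0
    for i, ok in enumerate(arrivals):
        missed = 0 if ok else missed + 1
        if missed >= detect_mult:
            return i
    return None


# Two losses are survivable with Detect Mult 3; three in a row are not.
print(first_down_interval([True, False, False, True], 3))  # None
print(first_down_interval(
    [True, False, False, True, False, False, False], 3))   # 6
```

A volumetric attack does not need to forge anything to trigger this; it only 
needs to cause enough valid packets in a row to be dropped.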

> 
> 871   However, the usual actual attack which we are protecting BFD from is
> 872   availability.  That is, the attacker is trying to shut down then
> 873   connection when the attacked parties are trying to keep it up.  As a
> 874   result, the attacks here seem to be irrelevant in practice.
> 
> <major> I don't buy this argument, giving the false impression that a link is
> up when it is down will result in dropping of packets and is equally harmful.
> Is it necessary to call this "irrelevant"?

BFD security, in many circumstances, is "silly".  This has been one of the 
reasons why our dance with the security ADs each time a BFD feature is going 
through the IESG becomes frustrating.

If you want to knock a BFD session over, you can try to attack the protocol to 
do so. Your attack space certainly includes trying to pass packet validation 
procedures and change the state in such a way that the FSM moves to Down.  
However, far easier tends to be to attack BFD at the transport layer through 
volumetric attacks, interfere with forwarding, or interfere with inter-layer 
transport considerations such as tunnels, ARP, etc.  Once you're capable of 
those attacks, you wouldn't bother to attack BFD at the authentication layer.

On the flip side, the other attack BFD mitigates is one where BFD wants the 
session to go Down, either by carrying out a protocol state change or, far more 
typically, because there is an interruption in the continuity of a session.  If 
you've compromised the authentication, you can keep the session up, but will in 
many cases be required to participate in a man-in-the-middle attack to be able 
to do so.  Having compromised BFD to spoof "it's still up", you have to deal 
with the fact that BFD is deployed most typically as a supplement to protocol 
keepalive machinery that will, at a slower pace, validate the continuity.  Those 
layers (the protected clients) will also need to be addressed in the attack.

Where this has traditionally put us as a working group is that there are 
motivations to make it difficult for unauthenticated packets to knock the 
service over and for spoofed packets to keep the session up, all while trying 
not to exhaust limited resources, including line cards, to do the 
authentication. 

-- Jeff
