Re: Secdir early review of draft-ietf-bfd-stability-13

Jeffrey Haas Mon, 10 Jun 2024 09:22:33 -0700

Christian,

Thanks for your review.  Some of my comments will overlap those from Alan.

On Fri, Jun 07, 2024 at 09:54:57PM -0700, Christian Huitema via Datatracker wrote:
> The authentication sequence number is a 32 bit field. Such numbers can roll
> over, either after a long duration session or due to a packet injection
> attack.

As Alan points out, normal rollover is something we're usually unbothered by
in the existing authentication algorithms.

The point you have here is far more about the underlying issue for the null
authentication procedures:

> There is some text about that in the description of the NULL authentication.
> It says:
> 
>    If bfd.AuthSeqKnown is 1, and the received Sequence Number field is
>    not equal to bfd.RcvAuthSeq + 1 (in a circular number space), then
>    the loss count is incremented by one and bfd.RcvAuthSeq is set to the
>    received Sequence Number.
> 
> That does not look quite right. Suppose that due to out of order delivery, the
> packets are received in order 1-3-2-4. Upon reception of packet 3, the
> algorithm counts one loss and set the next expected value to 4. After packet
> 2, another loss and expected value to 3. After packet 4, another loss and
> expected value to 5. So, three losses when none actually occurred.

Agreed.  We do mention this here:

: Implementations MAY provide mechanisms wherein all expected packets received
: across an expected interval but delivered out of order are not considered
: lost packets.
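
As a rough illustration of the miscount discussed above, the following Python
sketch applies the quoted rule to the 1-3-2-4 arrival order.  It is not taken
from the draft: the function names are hypothetical, the handling of the first
packet (bfd.AuthSeqKnown going from 0 to 1) is an assumption, and
interval_loss_count is only one possible reading of the MAY text, with no
wraparound handling.

```python
MOD = 2 ** 32  # sequence numbers live in a 32-bit circular space


def naive_loss_count(arrivals):
    """Apply the quoted NULL-auth rule to a list of received sequence numbers."""
    known = False   # stands in for bfd.AuthSeqKnown
    rcv_seq = 0     # stands in for bfd.RcvAuthSeq
    losses = 0
    for seq in arrivals:
        if known and seq != (rcv_seq + 1) % MOD:
            losses += 1
        # Assumption: the receiver always tracks the last received number.
        rcv_seq = seq
        known = True
    return losses


def interval_loss_count(arrivals, first, last):
    """Count numbers in [first, last] that never arrived, ignoring order."""
    expected = set(range(first, last + 1))
    return len(expected - set(arrivals))


print(naive_loss_count([1, 3, 2, 4]))           # 3 losses, none real
print(interval_loss_count([1, 3, 2, 4], 1, 4))  # 0 losses
```

The second function trades immediacy for accuracy: it can only report once the
interval is over, which is why the draft leaves it as an implementation option.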

We did discuss how to avoid some of these out of order issues in the context
of active attacks against BFD sessions with NULL authentication.
The conclusion from that thread is that we simply CANNOT leverage the sequence
numbers for purposes of "do we pass the authentication checks".  As you note
here:

> In RFC 5880, the specification of Meticulous Keyed MD5 addresses both number
> rollover and out of order delivery. The same text is repeated for meticulous
> MD5 and meticulous SHA1:
> 
>       ... if the
>       sequence number lies outside of the range of bfd.RcvAuthSeq+1 to
>       bfd.RcvAuthSeq+(3*Detect Mult) inclusive (when treated as an
>       unsigned 32-bit circular number space) the received packet MUST be
>       discarded.
> 
> That means that if a the packet 1-2-3-4 are delivered out of order as 1-4-2-3,
> then packets 2 and 3 are going to be ignored and 1 loss will be detected.
> That's probably not right.
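
A minimal Python sketch of the RFC 5880 range test quoted above, applied to the
1-4-2-3 arrival order.  The helper names are hypothetical, Detect Mult is
assumed to be 3, and the first packet is assumed to be accepted unconditionally;
digest verification is left out entirely.

```python
MOD = 2 ** 32  # unsigned 32-bit circular number space


def in_window(seq, rcv_seq, detect_mult):
    """True if seq lies in rcv_seq+1 .. rcv_seq+(3*detect_mult), circularly."""
    offset = (seq - rcv_seq) % MOD
    return 1 <= offset <= 3 * detect_mult


def receive(arrivals, detect_mult=3):
    """Return (accepted, discarded) sequence numbers for an arrival order."""
    rcv_seq = arrivals[0]  # assumption: first packet seeds bfd.RcvAuthSeq
    accepted, discarded = [arrivals[0]], []
    for seq in arrivals[1:]:
        if in_window(seq, rcv_seq, detect_mult):
            accepted.append(seq)
            rcv_seq = seq
        else:
            discarded.append(seq)
    return accepted, discarded


print(receive([1, 4, 2, 3]))  # ([1, 4], [2, 3])
```

Once packet 4 is accepted, the window starts at 5, so the late packets 2 and 3
fall outside it and are discarded.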

For our authentication purposes, without the presence of some sort of
computed digest across the packet, NULL authentication means that an active
attacker can knock the session over simply by injecting packets that push
the sequence numbers ahead.  Without the sequence number windowing check,
it's trivial.  With the sequence number window, it's still possible: the
attacker simply hits the right window and advances it out of the range of
the real client faster than the client can catch up.
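
To make the window-pushing attack concrete, here is a small Python sketch.  It
is an illustration under stated assumptions, not anything from the draft: the
names are hypothetical, Detect Mult is assumed to be 3, and since NULL
authentication carries no digest the receiver has no way to tell the injected
packet from a real one.

```python
MOD = 2 ** 32  # unsigned 32-bit circular number space


def in_window(seq, rcv_seq, detect_mult):
    """True if seq lies in rcv_seq+1 .. rcv_seq+(3*detect_mult), circularly."""
    offset = (seq - rcv_seq) % MOD
    return 1 <= offset <= 3 * detect_mult


def simulate(events, rcv_seq, detect_mult=3):
    """events is a list of (sender, seq); returns (sender, seq, accepted)."""
    log = []
    for sender, seq in events:
        ok = in_window(seq, rcv_seq, detect_mult)
        if ok:
            rcv_seq = seq  # any in-window packet moves the window
        log.append((sender, seq, ok))
    return log


events = [("peer", 11), ("attacker", 20), ("peer", 12), ("peer", 13)]
for entry in simulate(events, rcv_seq=10):
    print(entry)
```

A single injected packet at the top of the window (20) is enough to make the
real peer's next packets (12, 13) fall behind it and be discarded.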

Trying to further mitigate that using a pacing check becomes problematic in
most implementations and isn't natural for current BFD implementations.

The conclusion was thus we can't use NULL for authentication.  We're only
interested in it for the stability check.

However, as you note, in the event of regular packet reordering, we start
bumping the counters too hard.

What can you do about this?

Nothing.

We *could* mitigate active attempts to bump the counters, and perhaps it's
reasonable to restore the sequence windowing check to try to remove
some noise from the system.  However, outside of that work... if you're
having this issue occur with ANY authentication mechanism regularly, it
means you have a problem.

The best this mechanism can do is help provide you notice of issues.
Outright dropped packets tell you there is loss.  Regular out of order
delivery tells you there is reordering.  Past this, you can notify the client
of issues and that's the entire value proposition here.

> For packet loss detection, the state of the art is described in RFC 8985, the
> RACK-TLP Loss Detection Algorithm for TCP, and in RFC 9002 which describes how
> to use RACK-TLP for packet loss discovery in QUIC. I don't understand why the
> BFD Stability draft does not reference the RACK-TLP algorithm and RFC8985.

Minimally, it's the usual IETF delusion that everyone is aware of every
possible technology that has been written to an Internet-Draft. :-)  That is
the point of the broader reviews.

>  The
> "ad hoc" BFD loss detection algorithm description ends up repeating a lot of
> points commonly discussed in the specification of RACK: detecting packet
> losses by monitoring holes in sequence numbers, the interaction between this
> detection and out of order delivery, or role and computation of timers. The
> BFD stability draft would probably be more robust if it merely adapted the
> algorithms defined in RFC8995.

Valuable feedback.  Sending this back as "see this RFC, see if you can
benefit from it" is a reasonable request.

> Copying algorithms from RFC8985 or RFC9002 would probably also diminish the
> reliance on timers, allow use of adaptative timers instead of relying on
> timers set by management procedure, and allow reporting of RTT statistics in
> the Yang module.

Sorry, you don't get to tweak the timers.  The core use of BFD remains.
We're simply trying to piggyback loss detection on top of what we already
have as an add-on.

> The security considerations acknowledge that using NULL authentication will
> allow attackers to mimic out of order delivery and thus cause spurious loss
> detection. Indeed, this NULL Authentication procedure makes me cringe. Why are
> we even defining that today, in 2024?

Clearly you're unaware of the fact that trying to do expensive cryptography
in low CPU power devices that have other things to do than compute digests
is an attack.  But that puts you in regular company with every security
reviewer for BFD since its inception.

The vast majority of BFD deployment remains without use of any authentication
whatsoever.  The use of md5 and sha1 happens in environments where active
attacks are of concern, but with significant negative impact to scale of the
BFD sessions, their detection times, and scale of work the line cards end up
doing.

This draft highlights the fact that when you have missing packets detected
by the sequence numbers, you can report it.  If that's on one of the
existing meticulous modes, your concern is addressed.  If you want similar
behavior for unauthenticated sessions, NULL gives you at least an option.
It comes with the same security guarantees you get with no
authentication.  Many people are fine with that.

>  Naive features like that can easily turn
> into a CVE if the attackers used them as steps in an attack chain, causing
> spurious error detection to convince routers that a path is not usable
> anymore, and thus maybe causing connectivity to a target to fail for the
> short time necessary for an attack. It would be much safer to just require
> the MD5 or SHA1 variants, and not define NULL authentication at all.

See above.

> But MD5 and SHA1 authentication can also be attacked, because they do not
> protect against replay attacks. If the sessions last long enough for the
> sequence number to roll over, a patient attacker could record old packets and
> replay them after the rollover. If the repeated packets pass the "range" test,
> the bfd.RcvAuthSeq value will be modified, causing "good" packets below the
> new range to be ignored. Repeat that a couple of time, and the BFD process
> will be effectively disabled. I think that the solution is to not allow
> rollover, or maybe require use of new descriptors if too many packets have
> been exchanged. In any case, the security section should discuss the issue.

Not really.  But this is the usual giant leap that security directorate
reviews make without having been through the prior RFC 5880 reviews:

:    bfd.LocalDiscr
: 
:       The local discriminator for this BFD session, used to uniquely
:       identify it.  It MUST be unique across all BFD sessions on this
:       system, and nonzero.  It SHOULD be set to a random (but still
:       unique) value to improve security.  The value is otherwise outside
:       the scope of this specification.

Thus, for authenticated systems using md5/sha1, the digest includes
appropriate nonces.  Given the rollover time, it's not a reasonable attack.
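
For a sense of scale on the rollover time, a back-of-the-envelope Python
calculation.  The transmit intervals below are illustrative assumptions, not
values from the draft, and the sketch assumes the sequence number increments
on every packet, as in the meticulous modes.

```python
SEQ_SPACE = 2 ** 32        # 32-bit sequence number space
SECONDS_PER_DAY = 86400


def rollover_days(tx_interval_s):
    """Days until the sequence number wraps at one increment per packet."""
    return SEQ_SPACE * tx_interval_s / SECONDS_PER_DAY


# Assumed transmit intervals, in seconds.
for interval in (0.0033, 0.010, 0.050, 1.0):
    print(f"{interval * 1000:g} ms -> {rollover_days(interval):.0f} days")
```

At a 10 ms interval the wrap takes roughly 497 days, so an attacker would need
to hold recorded packets for well over a year before a replay could land in
range.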

If you want to provide a valid but silly attack, echoing a packet that was
just sent two packets ago can produce such detection errors in authenticated
systems.

A general observation on security in BFD is that once you're at the point of
actively trying to attack BFD, you're already in a place in the network where,
if your goal is to bring BFD down, you're better off just spoofing ARP or
doing other things that are far easier attacks.

> Of course, the security of MD5 and SHA1 is also suspect. There is no
> indication yet that SHA1 with the HMAC construct used in RFC5880 is broken,
> but the general rule is no avoid SHA1 if we can -- and certainly avoid MD5.
> I don't know the deployment requirements, but it would be nice if BFD allowed
> stronger options.

Cryptography at speed on small packets is an attack on the line cards.  They
have better things to do.  This means we're not looking for perfect
security, we're looking for appropriate speed bumps.

That said, please also see:
https://datatracker.ietf.org/doc/html/draft-ietf-bfd-optimizing-authentication-13

This work was chartered to allow for stronger authentication mechanisms such
as SHA-2 to be used, but somewhat sparingly.

>  Surely, some deployments will want to use different authentication
> methods in the future. Should there be some guidance on the development of
> these future extensions, such as requiring that they provide a sequence number
> similar to the "meticulous" variants?

https://datatracker.ietf.org/doc/html/rfc7492, section 6:
:    The security risks brought by SHA-1 and MD5 have been well
:    understood.  However, when using a stronger digest algorithm, e.g.,
:    SHA-2, the imposed computing overhead will seriously affect the
:    performance of BFD implementation.

It then goes on to suggest additional ciphers.  Fundamentally, there's room
for stronger ciphers, but the tradeoff will always apply.  The optimizing
procedures in the draft cited above let us change the equation by making
BFD require strong authentication less of the time.

> At this point, let me indulge in a personal aparte. Why do we even have a
> specific BFD protocol? 

It seems we're back to the need to have the "why bfd 101" for IETF again.
You speculate, but don't quite hit the mark.

Fundamentally, common routing protocol and other control plane and data
plane requirements need a simple bidirectional connectivity check.  IETF
protocols would embed flavors of this in each of the protocols, but often
with seconds-plus granularity timers to address the scaling capabilities of
the control plane software.

Rather than try to make every IETF control plane protocol provide faster
timers, BFD provides generic plumbing that can be used by various clients to
provide that check.

Similarly, dataplane failure mechanisms such as fast reroute can leverage
BFD as a triggering mechanism.

It's a dull protocol, but it's been quite successful.

> rollover, more precisely handle out of order packets, maybe add timestamps. As

We've explicitly put timestamp work outside the scope of BFD.  Over the
years we've considered adding new things into BFD, but each of those new
things often degraded the core use case.

Or, as my usual challenge to new proposals goes: What is so
interesting you want to do it every 10ms?

Instead, we've encouraged other applications that are interested in
leveraging the session setup semantics to adopt the state machinery if they
find it interesting.  But please don't try to stuff it into BFD.

> BFD is used to monitor data plane quality, using the same transport protocol
> as applications would make sense.

As our experts in OAM, like Greg Mirsky, would remind us: We're only using
BFD for connectivity checks.  Other semantics for outright quality are
mostly out of scope because they require additional machinery.

This draft doesn't try to address quality.  It simply propagates up
something the protocol is already capable of noting: expected stuff is lost.

>  These functions are all well established in
> transport protocols. I understand that TCP+TLS would not be practical, but we
> are not in 2010 anymore and we have more options. For example, it would be
> trivial to define "BFD over QUIC", while providing modern security, congestion
> control, timer management and loss detection. A BSD over QUIC application
> could update the Yang module just like BFD does, while providing much better
> security!

While I'm working on other work targeting QUIC, I think targeting BFD for it
would be ludicrous.  That said, let's focus on the more reasonable portions
of your review.

-- Jeff
