[rtgwg] Re: [EXTERNAL] Re: My questions about draft-ietf-rtgwg-vrrp-p2mp-bfd-12

Greg Mirsky Tue, 22 Jul 2025 01:28:30 -0700

Hi Sasha,
thank you for the lively discussion. Please find my notes below, tagged
GIM3>>.


Kind regards,
Greg

On Tue, Jul 22, 2025 at 8:52 AM Alexander Vainshtein <
[email protected]> wrote:

> Greg,
>
> Lots of thanks for your email.
>
> Please see some comments to your latest responses *inline below*.
>
>
>
> I would also like to add that today EVPN with All-Active multi-homing (
> draft-rfc7432-bis
> <https://datatracker.ietf.org/doc/html/draft-ietf-bess-rfc7432bis-13>)
> provides a valid alternative to VRRP – with greatly improved performance
> and excellent scalability. If you are interested, we can discuss this
> off-the-list.
>
>
>
> Regards,
>
> Sasha
>
>
>
> *From:* Greg Mirsky <[email protected]>
> *Sent:* Monday, July 21, 2025 7:55 PM
> *To:* Alexander Vainshtein <[email protected]>
> *Cc:* RTGWG <[email protected]>
> *Subject:* Re: [EXTERNAL] Re: My questions about
> draft-ietf-rtgwg-vrrp-p2mp-bfd-12
>
>
>
> Hi Sasha,
>
> thank you for your expedient clarifications. Please find my follow up
> notes below tagged GIM2>>.
>
>
>
> Regards,
>
> Greg
>
>
>
> On Mon, Jul 21, 2025 at 6:06 PM Alexander Vainshtein <
> [email protected]> wrote:
>
> Greg,
>
> Lots of thanks for a prompt response.
>
>
>
> Please see some comments *inline below*.
>
> Hope these comments would be useful.
>
>
>
> Regards,
>
> Sasha
>
>
>
> *From:* Greg Mirsky <[email protected]>
> *Sent:* Monday, July 21, 2025 6:15 PM
> *To:* Alexander Vainshtein <[email protected]>
> *Cc:* RTGWG <[email protected]>
> *Subject:* [EXTERNAL] Re: My questions about
> draft-ietf-rtgwg-vrrp-p2mp-bfd-12
>
>
>
> Hi Sasha,
>
> thank you for your comments; much appreciated. Please find my notes below
> tagged GIM>>.
>
>
>
> Regards,
>
> Greg
>
>
>
> On Mon, Jul 21, 2025 at 3:48 PM Alexander Vainshtein <
> [email protected]> wrote:
>
> Hi all,
>
> I have asked the following question about the Applicability of Bidirectional
> Forwarding Detection (BFD) for Multi-point Networks in Virtual Router
> Redundancy Protocol (VRRP) draft
> <https://datatracker.ietf.org/doc/html/draft-ietf-rtgwg-vrrp-p2mp-bfd-12>
> during the RTGWG session in Madrid today:
>
>
>
> *Q1*:        What happens if one of the VRRP routers on the LAN in
> question does not support BFD for Multi-point Networks?
>
>               My guess (FWIW) is that such routers would receive P2MP BFD
> packets at a high rate and trap them to the CPU – and then generate
>
>               Destination Unreachable – Port Unreachable ICMP messages…
>
> GIM>> AFAIK, routers are not expected to generate the ICMP Port
> Unreachable error message. If that is the case, a VRRP router that does not
> support RFC 8562 would not be able to detect the failure of the Active
> Router and, likely, would not become the next Active Router.
>
> *[[Sasha]] First of all, I think routers normally generate ICMP error
> messages Destination Unreachable – Port Unreachable when they receive IP
> packets addressed to them but with a TCP/UDP port for which there is no
> listener – e.g., this is how IP traceroute works when it uses UDP packets
> instead of ICMP Echo ones (the process stops when the originator receives an
> ICMP Destination Unreachable – Port Unreachable message instead of TTL
> Expired).  But this is not so important – the main point is that all these
> packets would be trapped to the CPU and not to the HW accelerator used in
> modern routers to support fast BFD. Generation of ICMP error messages would
> simply increase this load.*
>
> *With 1-hop IP BFD the session would never reach its UP state and,
> therefore, unicast BFD Control Packets would only be sent at a rate of 1
> packet/second.*
>
> GIM2>> Perhaps we have different experience with implementing throttling
> of packets punted to the control plane.
>
> *[[Sasha]] Of course the packets punted to the CP will be rate-limited to
> prevent the CP from being overwhelmed. But this is the “last line of
> defense”, and intentionally creating scenarios in which you have to rely on
> this defense alone looks problematic to me.*
>
GIM3>> I believe that we all subscribe to Postel's Law: "Be liberal in what
you accept, and conservative in what you send". Thus, we cannot rely on
other systems always behaving themselves, but must have sound mechanisms to
protect the control plane, including throttling the flow of punted packets.
Personally, I don't consider the deployment of p2mp BFD on a LAN segment a
DoS attack, as it is a management action fully controlled by the operator
and not an accidental event.
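
For illustration only, the throttling of punted packets mentioned above can
be sketched as a token-bucket policer in front of the control plane. This is
a minimal Python sketch under stated assumptions, not any vendor's
implementation: the names TokenBucket and count_punted and the rate_pps and
burst parameters are hypothetical, and timestamps are assumed to start at 0.

```python
class TokenBucket:
    """Illustrative policer for packets punted to the control plane."""

    def __init__(self, rate_pps: float, burst: int) -> None:
        self.rate_pps = rate_pps        # sustained packets per second admitted
        self.burst = burst              # maximum bucket depth, in packets
        self.tokens = float(burst)      # the bucket starts full
        self.last = 0.0                 # time of the previous update (assumed to start at 0)

    def allow(self, now: float) -> bool:
        # Refill in proportion to the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(float(self.burst), self.tokens + elapsed * self.rate_pps)
        self.last = now
        # Admit the packet only if a whole token is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_punted(policer: TokenBucket, arrival_times: list[float]) -> int:
    """Count how many of the arriving packets the policer lets through."""
    return sum(1 for t in arrival_times if policer.allow(t))
```

With a policer of this kind, packets arriving faster than rate_pps are
clipped to the configured budget, so the load on the control plane is
bounded regardless of the sender's transmit interval.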

>
>
> *Q2*:        What happens to the hosts on the LAN in question if the L2
> switches implementing this LAN flood multi-point BFD packets?
>
> My guess is that these hosts would experience high load receiving these
> packets at a presumably high rate.
>
> GIM>> Would these L2 switches be flooding VRRP messages? If that is the
> case, would the hosts on the LAN segment be flooded by VRRP messages being
> transmitted at a 1 centisecond interval without BFD being used for fast VRRP
> convergence? As proposed in draft-ietf-rtgwg-vrrp-p2mp-bfd
> <https://datatracker.ietf.org/doc/draft-ietf-rtgwg-vrrp-p2mp-bfd>, the
> source MAC address of BFD Control messages follows the rules set in Section
> 7.3 of RFC 9568 <https://datatracker.ietf.org/doc/rfc9568>, and the messages
> are transmitted to the "VRRP IPvX multicast group", as specified in Section
> 7.2 of RFC 9568. Am I missing something here?
>
> *[[Sasha]] I think your argument is not valid – because it is based on an
> assumption that VRRPv3 is really used with the minimal encodable
> advertisement interval. And, IMHO, this assumption is incorrect – nobody
> really sends VRRP Advertisement messages at this rate.  Were it not so, you
> would not need fast BFD – regardless of its flavor – for fast detection of
> Active Down.*
>
> GIM2>> I might be mistaken, but the intention of VRRPv3 was to enable
> self-contained fast convergence.
>
> *[[Sasha]] This could be the intention, but the very fact that we are
> discussing BFD-based methods for improving VRRPv3 convergence proves – at
> least to me – that this objective has not been achieved**😊**.*
>
GIM3>> Others might see it differently, but I agree with your conclusion.

> Whether that ever has been deployed, in my opinion, is a question we
> cannot answer due to insufficient evidence.
>
> *[[Sasha]] Maybe the WG should run a poll among operators? This could be
> useful for many reasons IMHO.*
>
GIM3>> I agree with you, an anonymous poll will be helpful in many aspects.

> Any VRRPv3 implementation is capable of generating VRRP messages at a 1
> centisecond interval (otherwise it would not be considered a conformant
> implementation).
>
> *[[Sasha]] VRRP implementations indeed MUST be capable of sending 10 VRRP
> Advertisement messages per second. But I do not think there is a
> requirement to do that for hundreds – or even thousands - of VRRP groups…*
>
GIM3>> I'd note that sending VRRP messages at a 1 centisecond interval
results in 100 messages/sec. Furthermore, AFAICS, the VRRP specification
doesn't limit the number of VRIDs a given VRRP router may participate in.
AFAICS, Section 8.3.2 recommends having a single VRRP router with
priority 255 per VRID. Unless I am missing it, there's no limit on the
number of VRRP routers with priority lower than 255 in a VRID. Consider
using Unsolicited BFD for fast VRRP convergence: would that create p2p
BFD sessions between the Active Router and each VRRP router with non-zero
priority? And then multiply that by the number of VRRP groups.
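
As a rough sketch of the arithmetic above, the Python below converts an
advertisement interval in centiseconds to a message rate and counts the p2p
BFD sessions terminating on one Active Router. The function names are
hypothetical, and the session count assumes the worst case in which the same
router is Active in every VRID and every Backup Router opens one session
toward it.

```python
def adverts_per_second(interval_centiseconds: int) -> float:
    """Advertisement rate for one VRID at the given interval."""
    # The VRRPv3 advertisement interval is a 12-bit value in centiseconds.
    if not 1 <= interval_centiseconds <= 4095:
        raise ValueError("interval must be 1..4095 centiseconds")
    return 100.0 / interval_centiseconds


def p2p_sessions_on_active(routers_per_vrid: int, vrids: int) -> int:
    """p2p BFD sessions on the Active Router (worst-case assumption)."""
    if routers_per_vrid < 1 or vrids < 0:
        raise ValueError("need at least one router per VRID and vrids >= 0")
    # One session per Backup Router in each VRID: (N - 1) * number of VRIDs.
    return (routers_per_vrid - 1) * vrids


if __name__ == "__main__":
    print(adverts_per_second(1))            # minimal encodable interval
    print(p2p_sessions_on_active(2, 1))     # the common two-router deployment
    print(p2p_sessions_on_active(4, 255))   # four routers in each of 255 VRIDs
```

The two-router case needs a single session, which is the deployment Sasha
describes; the count grows linearly with both the number of routers per VRID
and the number of VRIDs.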

> I have also clarified that, to the best of my understanding, the purpose
> of using BFD for detection of the Active router down is to replace fast
> advertisement of VRRP Advertisement messages with something else.
>
> GIM>> Indeed, that was our motivation. *[[Sasha]] I consider this as
> agreement with my previous point – nobody wants to use very fast VRRP
> Advertisements.*
>
> With this understanding, using some form of 1-hop IP BFD (RFC 5881) looks
> obviously preferable to using P2MP BFD flavors.
>
> GIM>> We believe that using Asynchronous BFD (RFC 5881) introduces
> significant overhead (the number of BFD Control messages increases times 2
> power of N, where N is the number of Backup Routers in the given VR ID, and
> unnecessary control state on the Active Router per Backup Router that
> monitors the state of the Active Router) compared with using the Demand BFD
> mode per RFC 8562.
>
>
>
> *[[Sasha]] AFAIK, most live deployments use just two routers in a given
> VRRP group, and you need just one 1-hop IP BFD session for that.*
>
> *If you use unsolicited BFD sessions (RFC 9468
> <https://datatracker.ietf.org/doc/html/rfc9468>)  you would need just (N-1)
> sessions  where N is the number of routers in the given VRRP group – each
> Backup router would set up an unsolicited session vs. the Active one. If
> the Active router does not support it, the session would not reach its UP
> state, and the Backup router would not use it for fast detection of the
> Active Down.*
>
> GIM2>> RFC 9568 doesn't set any limit on the number of VRRP routers, nor
> does it give any deployment recommendations in the Operational Issues
> section, e.g., Recommendations Regarding Setting Priority Values. AFAICS,
> the recommendation in Section 8.3.2 may be interpreted as allowing up to 255
> VRRP routers in a VRID eligible to become an Active Router.
>
> *[[Sasha]] My reading of the above section is that you cannot have
> more than 255 VRRP groups on the same LAN – but it does not say anything at
> all regarding the usable number of routers in the same group.*
>
GIM3>> As I understand it, there could be up to 255 VRIDs per extended
LAN segment. Additionally, each VRID may include an unlimited number of
VRRP routers with non-zero priority, i.e., eligible to become an Active
Router in the given VRID (Virtual Router ID).
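
To illustrate where the limit of 255 VRIDs comes from, here is a minimal
Python sketch that derives the virtual router MAC address from the VRID,
following the 00-00-5E-00-01-{VRID} (IPv4) and 00-00-5E-00-02-{VRID} (IPv6)
layout of RFC 9568. The function name vrrp_virtual_mac is hypothetical.

```python
def vrrp_virtual_mac(vrid: int, ipv6: bool = False) -> str:
    """Virtual router MAC address for a VRID (illustrative helper)."""
    # The VRID occupies a single octet and 0 is not a usable value,
    # which leaves 255 VRIDs per address family.
    if not 1 <= vrid <= 255:
        raise ValueError("VRID must be in the range 1..255")
    family_octet = 0x02 if ipv6 else 0x01
    return "00-00-5E-00-%02X-%02X" % (family_octet, vrid)


if __name__ == "__main__":
    print(vrrp_virtual_mac(1))
    print(vrrp_virtual_mac(255, ipv6=True))
```

The address depends only on the VRID and the address family, not on how
many routers share that VRID, which is consistent with the reading that the
number of routers per VRID is not bounded by the address format.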

> GIM2>> As for the suggestion to use Unsolicited BFD, I am looking forward to
> reading a written proposal that clarifies the use of RFC 9468 in the VRRP
> environment, particularly when the Active Router and/or Backup Router(s) go
> down and come back up.
>
> *[[Sasha]] From my POV this has been proposed by Asee, I wonder if he
> plans to write something. If not, maybe we could prepare such a written
> proposal?*
>
GIM3>> In my opinion, Unsolicited BFD avoids the single point of failure
created by electing a single candidate for the next Active Router (as I
understand the premise of draft-ietf-rtgwg-vrrp-bfd-p2p). But Unsolicited
BFD will result in a p2p single-hop BFD session between the Active Router
and each non-zero priority VRRP router in the given VRID, although it
avoids creating the single point of failure and does provide robust
resilience in the VRID. As I subscribe to the philosophy "Make your worst
case your normal case", I don't consider using p2p BFD a scalable approach
for supporting VRRP fast convergence.

>
>
>
>
> Regards,
>
> Sasha
>
>
>
>
>

