Re: Service Redundancy using BFD

Ashesh Mishra Wed, 29 Nov 2017 04:29:33 -0800

Sami,

Thanks for the clarification. In a typical scenario, this will look like:


A <-------------\
            \           \
             C            D
            /           /
B <-------------/

where there are two sets of services. The BFD session between A and B in this 
case will be overloaded with the states for the two sets of sessions. It’s not 
clear from the proposal whether this scenario is addressed (and, if so, how).

Ashesh
From: Sami Boutros <[email protected]>
Date: Tuesday, November 28, 2017 at 5:13 PM
To: Ashesh Mishra <[email protected]>, Ankur Dubey <[email protected]>, 
"[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>
Subject: Re: Service Redundancy using BFD

Hi Ashesh,

The topology is more like the following:

A <--\
|     \
BFD    C
|     /
B <--/

A and B are nodes providing L2 and L3 services for C, with active/standby (A/S) 
redundancy.

A can be active and B standby; if A goes down, then B starts providing the 
services.

Thanks,

Sami
From: Ashesh Mishra <[email protected]>
Date: Tuesday, November 28, 2017 at 1:45 PM
To: Sami Boutros <[email protected]>, Ankur Dubey <[email protected]>, 
"[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>
Subject: Re: Service Redundancy using BFD

Okay. That makes sense now.

So in a scenario where you have a primary overlay service between A and B, and 
a backup overlay service between C and D, the BFD sessions in question will be 
between A and C, and B and D (so that the backup can send diag code to primary)?

A <------- primary service --------> B
|                                    |
BFD                                 BFD
|                                    |
C <------- backup service ---------> D

--
Ashesh


From: Sami Boutros <[email protected]>
Date: Tuesday, November 28, 2017 at 4:21 PM
To: Ashesh Mishra <[email protected]>, Ankur Dubey <[email protected]>, 
"[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>
Subject: Re: Service Redundancy using BFD

Hi Ashesh,

A service is an overlay service running on a routing node. This could be an L2 
or L3 VPN service running on a set of links connected to two or more nodes, 
where one node is active for a service at a given point in time and one node 
is standby.

BFD runs on the underlay links between the two nodes, active and standby. Once 
BFD goes down, the standby assumes that the active went down and activates the 
services that it shares with the active. When the old active comes back up, 
the standby signals on the BFD session, via this diag code, that it didn’t 
fail and has activated the non-preemptive services, so the old active node 
doesn’t activate those non-preemptive services.
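The takeover and signalling described above can be sketched roughly as follows. 
This is only an illustration under assumptions: the Node class, its method 
names, and the DIAG_OUTLIVED_PEER placeholder are hypothetical and are not 
taken from the draft, which would define the actual diag code value.

```python
from dataclasses import dataclass, field

# Placeholder: the thread does not give the value of the proposed diag code.
DIAG_OUTLIVED_PEER = "outlived-peer"


@dataclass
class Node:
    name: str
    shared_services: set  # non-preemptive services shared with the peer
    active: set = field(default_factory=set)
    survived_failure: bool = False

    def on_bfd_down(self):
        """Standby side: the session timed out, so assume the peer failed
        and take over the services shared with it."""
        self.active |= self.shared_services
        self.survived_failure = True

    def diag_to_send(self):
        """Diag to carry in BFD packets once the peer is reachable again."""
        return DIAG_OUTLIVED_PEER if self.survived_failure else None

    def on_restart(self, peer_diag):
        """Old active side: activate the shared services only if the peer
        did not signal that it lived through the failure."""
        if peer_diag != DIAG_OUTLIVED_PEER:
            self.active |= self.shared_services
```

In this sketch, a restarted node that receives the placeholder diag from its 
peer leaves the shared services alone, which is the non-preemptive behaviour 
described above.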

Thanks,

Sami
From: Ashesh Mishra <[email protected]>
Date: Tuesday, November 28, 2017 at 1:14 PM
To: Sami Boutros <[email protected]>, Ankur Dubey <[email protected]>, 
"[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>
Subject: Re: Service Redundancy using BFD

Thanks for the response, Sami. I think our disconnect lies in the definition of 
a service. From a BFD perspective, I expect the service to be established 
across two nodes, at the very least, so that BFD can monitor its liveness. Can 
you elaborate on


- What, in the context of this draft, is a service?

- How does BFD signal for a service whose liveness it is not monitoring?

Thanks,
Ashesh

From: Sami Boutros <[email protected]>
Date: Tuesday, November 28, 2017 at 1:23 PM
To: Ashesh Mishra <[email protected]>, Ankur Dubey <[email protected]>, 
"[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>
Subject: Re: Service Redundancy using BFD

Hi Ashesh,

Thanks for your comments.

For your first comment: the draft applies both to single-hop BFD (what you 
call interface BFD) and to multi-hop BFD. And yes, per-service could also mean 
per-interface for single-hop BFD; we can clarify that in the draft.

For your second comment, I am not sure I understand. The service will be 
active on only one node. If the service is associated with the whole node, 
then the BFD session monitors node liveness; when the service is associated 
with an interface, the BFD session monitors the interface connectivity as 
well. So a primary service can’t be active at both node endpoints hosting the 
BFD session.

Thanks,

Sami
From: Ashesh Mishra <[email protected]>
Date: Tuesday, November 28, 2017 at 4:04 AM
To: Ankur Dubey <[email protected]>, "[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>, Sami Boutros <[email protected]>
Subject: Re: Service Redundancy using BFD

Hi Ankur,

This is a good proposal to pursue within the BFD-wg.

A couple of comments:

- BFD can only signal this diag code for the interface that it is monitoring 
(the IP next hop, MPLS LSP, etc.). You mention per-service (which I assume 
means per-service-per-interface) failover in the draft, but it may be 
worthwhile defining behavior on per-service-type-per-interface as well.

- There still needs to be a method for the primary and backup pairs (two BFD 
end-points on the primary service and two on the backup service) to tell each 
other (primary-to-primary and backup-to-backup) whether the service is active 
or standby. This is useful in the scenario where the primary cannot 
communicate with the backup nodes (it is a failure condition after all).

Again, at 10k ft, I like the idea of signaling active/standby using BFD.

Cheers,
Ashesh

From: Rtg-bfd <[email protected]> on behalf of Ankur Dubey <[email protected]>
Date: Monday, November 27, 2017 at 9:47 PM
To: "[email protected]" <[email protected]>
Cc: Reshad Rahman <[email protected]>, Sami Boutros <[email protected]>
Subject: Service Redundancy using BFD

Hi all,

Please review and provide comments for the following draft:

https://datatracker.ietf.org/doc/draft-adubey-bfd-service-redundancy/






Summary of draft:

This draft proposes a new BFD diag code with which a node running a BFD 
session with another node can inform that node, after the BFD session times 
out, that it didn’t go down and lived through the failure.

Such a notification is useful for a set of nodes providing Active/Standby 
redundancy. When these nodes are running multiple L2/L3/L4-L7 services in 
non-revertive mode of redundancy, the standby node taking over as active for 
non-revertive services after BFD times out needs to indicate in the BFD packet 
that it outlived the failed old active node. The new diag code will be used 
for this purpose. When this diag code is set in the BFD packets, it indicates 
to the failed old active node that it MUST NOT activate the non-revertive 
services when it comes up.

To provide per-service failover, a node activating certain non-revertive 
services needs to indicate that it is Active ONLY for those non-revertive 
services. This can be done by using a bitmap in which each bit position 
uniquely identifies a service. This bitmap is configured on all nodes by a 
network controller. When there is at least one non-revertive service for which 
a node is not active AND it is active for at least one non-revertive service, 
this node will set the bits identifying its active services in the bitmap and 
send it in the payload of the BFD packet.
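As a rough illustration of the bitmap idea, here is a minimal sketch. The 
service names, the bit positions in SERVICE_BITS, and the helper functions are 
all hypothetical; the summary does not specify the bitmap width or how it is 
encoded in the BFD payload, so a plain integer is used here.

```python
# Hypothetical service-to-bit assignment, as a controller might configure it
# on every node. The names and positions are illustrative only.
SERVICE_BITS = {"l2vpn-blue": 0, "l3vpn-red": 1, "lb-green": 2}


def encode_active(active_services):
    """Return an integer bitmap with one bit set per active service."""
    bitmap = 0
    for name in active_services:
        bitmap |= 1 << SERVICE_BITS[name]
    return bitmap


def decode_active(bitmap):
    """Return the set of service names whose bits are set in the bitmap."""
    return {name for name, bit in SERVICE_BITS.items() if bitmap & (1 << bit)}


def should_send_bitmap(active_services):
    """Follow the condition in the summary: send the bitmap only when the
    node is active for at least one service but not for all of them."""
    return 0 < len(set(active_services)) < len(SERVICE_BITS)
```

With this assignment, a node active for "l2vpn-blue" and "lb-green" would set 
bits 0 and 2, and the peer would decode the same two names from that bitmap.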


Thanks,
--Ankur
