Hi Sasha,

There are two scenarios here and they depend on whether the satellite is in 
geo-stationary orbit (GEO) or non-geo-stationary orbit (NGSO).


Scenario-1: Non-geostationary satellites: This is the scenario that you 
described. Satellites in Medium Earth Orbit (MEO) or Low Earth Orbit (LEO) move 
relative to the Earth, so their distance from the ground terminals varies as 
they pass over a given location. This results in a varying RTT (sometimes by as 
much as 30 ms). The issue in this scenario is not necessarily that the BFD 
detect interval must change frequently, but that it is difficult to select the 
intervals accurately, because the RTT depends on the locations of the terminal 
and the gateway (and this gets quite complex). If the session can automatically 
decide the interval, the complexity of turning up a new service is reduced. 
Another complicating factor is a moving terminal (on a ship or aircraft, for 
example), which increases the variance of the RTT. We typically set the 
intervals conservatively high, but that hurts failure-detection performance. We 
see the same varying RTT in GEO when the terminal is mobile, but the percentage 
change is much smaller relative to the overall GEO RTT (because GEO satellites 
are much farther from the Earth, at ~36,000 km, versus MEO at ~8,000 km and LEO 
at ~200-1,000 km).
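
For a rough sense of those numbers, here is a minimal geometry sketch (a 
bent-pipe relay, with the terminal and gateway assumed at the same elevation 
angle; the 550 km LEO altitude and the 25-degree low-elevation case are my own 
illustrative choices, and the result is propagation delay only, ignoring 
processing and queueing):

    import math

    R_EARTH_KM = 6371.0
    C_KM_PER_S = 299792.458

    def slant_range_km(alt_km, elev_deg):
        """Ground-station-to-satellite distance for a satellite at
        altitude alt_km seen at elevation angle elev_deg."""
        e = math.radians(elev_deg)
        return (math.sqrt((R_EARTH_KM + alt_km) ** 2
                          - (R_EARTH_KM * math.cos(e)) ** 2)
                - R_EARTH_KM * math.sin(e))

    def bent_pipe_rtt_ms(alt_km, elev_deg):
        # terminal -> satellite -> gateway and back: four slant-range legs
        return 4 * slant_range_km(alt_km, elev_deg) / C_KM_PER_S * 1000

    for name, alt_km in (("LEO", 550), ("MEO", 8000), ("GEO", 35786)):
        best = bent_pipe_rtt_ms(alt_km, 90)    # satellite overhead
        worst = bent_pipe_rtt_ms(alt_km, 25)   # satellite low on the horizon
        print(f"{name}: {best:6.1f} .. {worst:6.1f} ms "
              f"(swing {worst - best:4.1f} ms)")

    # MEO swings by roughly 30 ms between overhead and low-elevation
    # passes, which is the kind of RTT variation described above.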


Scenario-2: Low-latency link backed up by high-latency link: In this case a GEO 
satellite backs up an NGSO-based connection or fiber (or other terrestrial 
wired/wireless WAN options). The end-to-end service then has a very different 
RTT when the primary is active versus when the backup is active. The typical 
solution is to base the timers on the backup RTT, which is very inefficient.
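
As a back-of-the-envelope illustration of that inefficiency (the sizing rule of 
"tx interval = path RTT plus a margin" is my own simplifying assumption, not 
anything BFD mandates; the RTT figures are round MEO/GEO numbers):

    # Rough sketch: BFD detection time when the timers are sized for the
    # slow GEO backup versus for the fast primary.  The sizing rule
    # (tx interval ~= path RTT + margin) is an illustrative assumption,
    # not a BFD requirement.

    DETECT_MULT = 3      # typical bfd.DetectMult
    MARGIN_MS = 50       # assumed scheduling/jitter allowance

    def detection_time_ms(path_rtt_ms):
        return (path_rtt_ms + MARGIN_MS) * DETECT_MULT

    print("timers sized for 600 ms GEO backup :",
          detection_time_ms(600), "ms")   # ~1950 ms
    print("timers sized for 130 ms MEO primary:",
          detection_time_ms(130), "ms")   # ~540 ms

    # Sizing for the backup means failures on the fast primary take
    # roughly 2 seconds to detect instead of roughly half a second.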


Regards,

Ashesh

________________________________
From: Alexander Vainshtein <alexander.vainsht...@ecitele.com>
Sent: Sunday, April 1, 2018 9:28:43 AM
To: Ashesh Mishra
Cc: Jeffrey Haas; rtg-bfd@ietf.org
Subject: RE: Tuning BFD session times


Ashesh,

I would like to better understand the satellite-link use case that you have 
described.

In particular, can you please explain why a long RTT affects the BFD detection 
times?

As I see it, what could really affect these times is the variable delay 
introduced in some cases by satellite links, since the distance between the 
satellite and the terrestrial antennas may change significantly over time.



What did I miss?



Regards,

Sasha



Office: +972-39266302

Cell:      +972-549266302

Email:   alexander.vainsht...@ecitele.com



From: Rtg-bfd [mailto:rtg-bfd-boun...@ietf.org] On Behalf Of Ashesh Mishra
Sent: Sunday, April 1, 2018 5:54 PM
To: Jeffrey Haas <jh...@pfrc.org>; rtg-bfd@ietf.org
Subject: Re: Tuning BFD session times



Jeff, thanks for kicking off this discussion on the list!



One additional comment I wanted to make is around automation. There were 
questions during the meeting about the need for auto-tuning and whether the 
process of determining the interval can/should be manual.



Automation of control in all aspects of dynamic behavior is a priority for 
network operators. When configured manually, parameters such as BFD intervals 
are typically set to very conservative values, because humans are very slow to 
respond to changing network conditions. Manual configuration also takes a lot 
of time and accounts for a significant number of lost opportunities and lost 
value for operators.



[JH] "applications should generally choose a detection interval that is 
reasoanble for their application rather than try to dynamically discover the 
lowest possible stable detection interval. "

[AM] This depends on the use case. From the point of view of a service provider 
that delivers long-haul connectivity (a typical scenario in which the link 
characteristics have large variance), the intent is to provide the best 
performance. Because such providers deliver connectivity to critical 
applications, and are often the only way of delivering connectivity in such 
places, the ability to tune the system for superior uptime drives significant 
value. Consider a scenario where a 130 ms RTT link (MEO satellite; LEO would be 
in the 20-60 ms range) has a 600 ms RTT link (GEO satellite) as its backup, and 
the two are used to deliver transit connectivity. The rate at which the 
end-to-end service can run BFD is significantly faster when the MEO link is 
active than when the GEO link is active. The application, in this scenario, may 
survive the RTT, but business continuity is critical in many cases. Since the 
long-haul provider cannot control the application, it must provide the best 
possible failover performance.
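
As a sketch of what letting the session adapt could look like (a hypothetical 
example: the helper names, the stub session object, and the sizing rule are my 
own assumptions, not anything defined in BFD or in draft-am-bfd-performance; 
RFC 5880 does allow the transmit interval to be renegotiated mid-session via a 
Poll sequence):

    # Hypothetical sketch: re-derive the BFD transmit interval from a
    # measured path RTT whenever the active path changes.  The sizing
    # rule and the session API are stand-ins for illustration only.

    DETECT_MULT = 3
    JITTER_ALLOWANCE_MS = 50
    FLOOR_MS = 10                 # platform minimum, never go below this

    def desired_tx_interval_ms(measured_rtt_ms):
        return max(FLOOR_MS, measured_rtt_ms + JITTER_ALLOWANCE_MS)

    def on_active_path_change(measured_rtt_ms, bfd_session):
        # bfd_session stands in for whatever API the platform offers to
        # change bfd.DesiredMinTxInterval (negotiated via an RFC 5880
        # Poll sequence, so no session flap is needed).
        bfd_session.set_tx_interval_ms(desired_tx_interval_ms(measured_rtt_ms))

    class DemoSession:
        """Stand-in for a real platform API (hypothetical)."""
        def set_tx_interval_ms(self, ms):
            print(f"renegotiating bfd.DesiredMinTxInterval -> {ms} ms")

    on_active_path_change(130, DemoSession())   # MEO active: -> 180 ms
    on_active_path_change(600, DemoSession())   # GEO active: -> 650 ms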



[JH] "1. BFD is asymmetric..  This means a receiving BFD implementation must 
provide feedback to a sending implementation in order for it to understand 
perceived reliability."

[AM] It may not need to be the BFD implementation that provides the feedback if 
other performance-measurement mechanisms are running. The challenge is to 
standardize the mechanism that BFD can use (if the measurement is not 
self-contained in BFD). You're right in pointing out the challenge of 
accounting for the CPU delays, and that was the reason for the original BFD 
performance measurement proposal. If the measurement is within the BFD realm, 
it will account for the CPU delays. However, most good BFD engines have 
relatively deterministic, well-optimized performance, so the variance with 
scale and time is not significant (though I concede that not all BFD 
implementations are good).
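
For what it's worth, one generic way to keep such a measurement entirely within 
the BFD realm on the receive side is an inter-arrival jitter estimator in the 
style of RFC 3550; the sketch below is purely illustrative and is not a 
mechanism defined by BFD or proposed in the draft:

    # Illustrative receiver-side estimator: smooth the deviation of BFD
    # packet inter-arrival times from the peer's advertised tx interval.
    # A generic EWMA (gain 1/16, as in the RFC 3550 jitter estimator);
    # the peer's CPU-induced send jitter shows up here automatically.

    class ArrivalJitter:
        def __init__(self, advertised_interval_ms):
            self.expected = advertised_interval_ms
            self.last_arrival = None
            self.jitter_ms = 0.0

        def on_packet(self, arrival_ms):
            if self.last_arrival is not None:
                deviation = abs((arrival_ms - self.last_arrival) - self.expected)
                self.jitter_ms += (deviation - self.jitter_ms) / 16.0
            self.last_arrival = arrival_ms
            return self.jitter_ms

    # A smoothed jitter that is large relative to the advertised interval
    # suggests the detection interval is too aggressive for this path.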



[JH] "2. Measurement infrastructure may negatively impact session scale.  Greg, 
I believe, made this point when discussing host processing issues vs. BFD 
ingress/egress."

[AM] This is an issue if using a measurement mechanism within BFD (other 
performance-measurement methods are always running in the network for SLA 
reporting and/or network optimization). Within a metro area with fiber or 
terrestrial wireless (microwave, LTE, etc.) connectivity, I would likely not 
need constant auto-tuning; the variance in the primary and backup links in such 
a network will not be significant enough to affect the BFD parameters. On 
long-haul links, this may be a valuable feature, in which case the additional 
overhead may be justified. So whether continuous auto-tuning is required, or a 
one-time measurement is enough, depends on the use case.



[JH] "3. Detection interval calculations really need to take into account 
things that are greater than simple packet transmission times.  As an example, 
if your measurement is always taken during low system CPU or network activity, 
how high is your confidence about the interval?  What about scaling vs. number 
of total BFD sessions?"

[AM] Great questions. Typically, when running BFD, CFM, or similar 
high-frequency OAM, CPU peaks should not affect OAM performance (a variety of 
methods, depending on the system on which the OAM is running, can ensure that). 
CPU peaks become a bigger issue if BFD is used to detect continuity for a 
particular flow (or QoS class).



--

Ashesh

________________________________

From: Rtg-bfd <rtg-bfd-boun...@ietf.org<mailto:rtg-bfd-boun...@ietf.org>> on 
behalf of Jeffrey Haas <jh...@pfrc.org<mailto:jh...@pfrc.org>>
Sent: Wednesday, March 28, 2018 11:49 AM
To: rtg-bfd@ietf.org<mailto:rtg-bfd@ietf.org>
Subject: Tuning BFD session times



Working Group,

We had very active discussion (yay!) at the microphone as part of Mahesh's
presentation on BFD Performance Measurement.
(draft-am-bfd-performance)

I wanted to start this thread to discuss the greater underlying issues this
discussion raised.  In particular, active tuning of BFD session parameters.
Please note that opinions I state here are as an individual contributor.

BFD clients typically want the fastest, most stable detection interval that
is appropriate to their application.  That stability component is very
important, since overly aggressive timers can result in unnecessary BFD
session instability, which will impact the subscribing application.  Such
stability is a function of many things, the scale of the system running BFD
being a major one.

In my opinion, applications should generally choose a detection interval
that is reasonable for their application rather than try to dynamically
discover the lowest possible stable detection interval.  This is because a
number of unstable factors, such as CPU load, contention with other network
traffic, and other things that are outside the general control of many
systems, may impact such scale.

That said, here's a few thoughts on active feedback mechanisms:
1. BFD is asymmetric.  This means a receiving BFD implementation must provide
   feedback to a sending implementation in order for it to understand
   perceived reliability.
2. Measurement infrastructure may negatively impact session scale.  Greg, I
   believe, made this point when discussing host processing issues vs. BFD
   ingress/egress.
3. Detection interval calculations really need to take into account things
   that are greater than simple packet transmission times.  As an example,
   if your measurement is always taken during low system CPU or network
   activity, how high is your confidence about the interval?  What about
   scaling vs. number of total BFD sessions?

I have no strong conclusions here, just some cautionary thoughts.

What are yours?

-- Jeff

