On 2018-09-18, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
> On Tue, Sep 18, 2018 at 05:11:24AM +0100, Tom Smyth wrote:
>> Hello all,
>> I was wondering what is the lowest BGP holdtime value that you
>> recommend running in production?
>
> I recommend using the default, especially against eBGP peers.

MikroTik in particular are known to be bad at keeping up with BGP timers.

>> I would like to set them to a lower value to detect issues with
>> peers that don't support BFD more quickly,
>> but I don't want to set it to a value that would overly tax system
>> resources.
>> 
>> If you are running approx 60 Peers on one and 30 Peers on another router,
>> 
>> I'm also running Arista 7050 switches with BGP sessions to the OpenBGPD
>> routers.
>> 
>> I would really appreciate anyone else's real-world experience on this
>> matter before I go lowering the default values in our production
>> environment.
> 
> bgpd should be able to handle the minimal hold time with 30 or 60
> peers just fine, but I'm not so sure about any other system. Also, flapping
> sessions because of a too-aggressive holdtime is counterproductive: the
> session flap dampening will kick in and keep the session down longer than
> needed.
>
> In the end, like with most tuning, you need to check for yourself what
> you are comfortable with.

This is mostly down to what your peers can handle (at a particular time),
and other people's real world experience will mostly not reflect that.

You might think to check "bgpctl sh nei" over time and monitor how "Last
read" compares with "keepalive interval" to get a baseline, but if you do,
beware: that will mostly just show how things look under normal conditions.
If hold times expire because somebody's router is too busy on occasion,
flapping the session is just going to make it *even more* busy, adding
to the problem (which can be especially nasty at an IXP).
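
If you do want that baseline anyway, here is a minimal sketch of how the
comparison might be scripted. It assumes each neighbor block in the output
starts with "BGP neighbor is <addr>," and has a line of the form
"Last read HH:MM:SS, holdtime Ns, keepalive interval Ns"; check that against
what your bgpctl actually prints. The names TimerSample and parse_timers are
made up for this example.

```python
import re
from typing import Iterator, NamedTuple, Optional

# Assumed line formats; verify against your own "bgpctl show neighbor" output.
NEIGHBOR_RE = re.compile(r"BGP neighbor is ([^,\s]+)")
TIMERS_RE = re.compile(
    r"Last read\s+(\S+),\s+holdtime\s+(\d+)s,\s+keepalive interval\s+(\d+)s"
)
HMS_RE = re.compile(r"(\d+):(\d\d):(\d\d)")


class TimerSample(NamedTuple):
    neighbor: str
    last_read_s: Optional[int]  # None if not in HH:MM:SS form (e.g. "Never")
    holdtime_s: int
    keepalive_s: int

    @property
    def late(self) -> bool:
        """True if nothing was read for longer than one keepalive interval."""
        return self.last_read_s is not None and self.last_read_s > self.keepalive_s


def hms_to_seconds(value: str) -> Optional[int]:
    """Convert HH:MM:SS to seconds; return None for any other form."""
    m = HMS_RE.fullmatch(value)
    if m is None:
        return None
    hours, minutes, seconds = (int(g) for g in m.groups())
    return hours * 3600 + minutes * 60 + seconds


def parse_timers(text: str) -> Iterator[TimerSample]:
    """Yield one TimerSample per neighbor block that has a timers line."""
    neighbor = "unknown"
    for line in text.splitlines():
        m = NEIGHBOR_RE.search(line)
        if m:
            neighbor = m.group(1)
            continue
        m = TIMERS_RE.search(line)
        if m:
            yield TimerSample(
                neighbor,
                hms_to_seconds(m.group(1)),
                int(m.group(2)),
                int(m.group(3)),
            )
```

Feed it the text captured from "bgpctl show neighbor" at regular intervals
and log the samples where "late" is true. As said above, this only tells you
how peers behave on an ordinary day, not when they are under load.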

Are you seeing actual problems with peers that cause you to want to do
this?

- If so and it's IXP-wide, maybe talk to the IXP? If it happens during
maintenance and they aren't already following BCP214 (session culling),
perhaps they could do that.

- If so and it's individual peers, maybe consider dropping them if
they're unreliable and not that important, or talking to them if they
are important?

