We use LibreNMS and SmokePing to monitor latency and dropped packets on all our
links, and we set up alerts that fire if either goes over a certain threshold. We
are working on a script that automatically reroutes traffic based on those
alerts, to route around the bad link and give us time to fix it.
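
To illustrate the kind of threshold check that such alerts rely on, here is a
minimal Python sketch. It is not the script mentioned above and it is not how
LibreNMS or SmokePing evaluate alerts internally. The function name
evaluate_link, the input format (one round-trip time in milliseconds per
probe, None for a lost probe), and both default thresholds are assumptions
made up for the example.

```python
from statistics import median
from typing import Optional, Sequence


def evaluate_link(rtts_ms: Sequence[Optional[float]],
                  max_loss_pct: float = 1.0,
                  max_median_rtt_ms: float = 50.0) -> dict:
    """Summarize one round of probes and decide whether the link looks degraded.

    rtts_ms holds one entry per probe: the round-trip time in milliseconds,
    or None if the probe was lost. Thresholds are hypothetical defaults.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe result")

    answered = [r for r in rtts_ms if r is not None]
    loss_pct = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    # With every probe lost there is no latency to report.
    median_rtt = median(answered) if answered else None

    too_lossy = loss_pct > max_loss_pct
    too_slow = median_rtt is None or median_rtt > max_median_rtt_ms
    return {
        "loss_pct": loss_pct,
        "median_rtt_ms": median_rtt,
        "degraded": too_lossy or too_slow,
    }
```

A rerouting script could call something like this once per polling interval
and only act after several consecutive degraded results, so that a single bad
sample does not move traffic.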

Thanks
Travis

From: NANOG <nanog-bounces+tgarrison=netviscom....@nanog.org> On Behalf Of 
Baldur Norddahl
Sent: Thursday, April 29, 2021 3:39 PM
To: nanog@nanog.org
Subject: link monitoring

Hello

We had a 100G link that started to misbehave, and customers noticed bad packet
loss. The optical values were just fine, but we had packet loss and latency. The
interface showed FEC errors on one end and carrier transitions on the other
end. Otherwise the link stayed up, and our monitoring system completely failed
to warn about the failure. We had to find the bad link by traceroute (mtr),
observing where the packet loss started.

The link was between a Juniper MX204 and a Juniper ACX5448. The link is 2 meters
long, using 2 km single-mode SFP modules.

What is the best practice for monitoring links to avoid this scenario? What
options do we have for link monitoring? I am investigating BFD, but I am
unsure whether it would have helped in this situation.
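
One option that matches the symptoms described here (FEC errors and carrier
transitions while the link stays up) is to poll the interface error counters
and alert when they increase between polls. The Python sketch below shows only
the comparison step. The function name rising_counters, the counter names
fec_uncorrected and carrier_transitions, and the limits are hypothetical; how
the counters are collected (SNMP, telemetry, CLI scraping) is left out, and
the limits would need tuning, since some optics may show a nonzero
corrected-error count in normal operation.

```python
from typing import Dict


def rising_counters(prev: Dict[str, int],
                    curr: Dict[str, int],
                    max_delta: Dict[str, int]) -> Dict[str, int]:
    """Return the counters whose increase since the last poll exceeds a limit.

    prev and curr map counter names to cumulative values from two
    consecutive polls. max_delta maps each watched counter to the largest
    increase that is still considered acceptable per polling interval.
    """
    alerts = {}
    for name, limit in max_delta.items():
        if name not in prev or name not in curr:
            continue  # counter missing from one poll; nothing to compare
        delta = curr[name] - prev[name]
        if delta < 0:
            continue  # counter was reset or wrapped; skip this interval
        if delta > limit:
            alerts[name] = delta
    return alerts


# Example with made-up values: FEC errors rose by 240, link flaps did not.
previous = {"fec_uncorrected": 10, "carrier_transitions": 3}
current = {"fec_uncorrected": 250, "carrier_transitions": 3}
limits = {"fec_uncorrected": 0, "carrier_transitions": 0}
print(rising_counters(previous, current, limits))
```

Because the check looks at counter increases rather than link state, it can
flag a link that never goes down.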

Thanks,

Baldur

