Hi all,

Just to make sure nobody's sitting around wondering what happened with this thread, here's a final mail with a short description of what's cooking right now and what was boiling back then.

Below you'll find:
- case
- situation
- conclusion
- physical connection
- hardware
- a few tips


#######
Case: #
#######
When I added another BGP peer to my router, the overall network/routing performance on the server was brought to a near standstill until I downed the BGP session again.
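
For reference, "downing the session" was nothing fancier than this (the neighbor address is of course just a placeholder, and the first command is only for watching the session state):

  bgpctl show summary              # session state per neighbor
  bgpctl neighbor 203.0.113.1 down # take the problem session down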


############
Situation: #
############
At first I had warp speed on the wire and all tests on the connection (*) seemed okay. Trivialities like speed, duplex and MTU settings etc. were agreed upon before the connection was established. The time elapsed from initiating the BGP session to severe performance degradation was <2 minutes, and if I did not down the BGP session within the next minute (literally), routing and network performance would drop like a piano out of the sky. In short, I was using all mbuf clusters ('Kbytes allocated to network' >97% in 'netstat -m'). Raising kern.maxclusters stepwise gave me a short-lived break until I reached a given point (see tips below). Above that I gained nothing and stopped raising it any further.
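
This is roughly how I watched the exhaustion and took one step up (the value below is just an example: one step of ~6500 above the 6144 default):

  netstat -m                     # look for 'Kbytes allocated to network (..% in use)'
  sysctl kern.maxclusters        # current ceiling
  sysctl kern.maxclusters=12644  # one step up, then re-test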

The new carrier had a lot of alignment errors (CRC/FCS) and packet size problems (jabbers/rxOversizedPkts) in their log / on "their" side. We both had heavy packet losses after these few minutes. 'tcpdump' did not reveal any significant signs of a sick connection on my side. A lot of testing has been done since. The connection, however, is still not running, but adjustments on the peer's side and replacements on the connection itself have raised the "panic threshold" from <2 min. to around 18 min. before disaster strikes.
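
If you want to do the same kind of checking on your own side, the interface counters plus a raw look at the wire are a good start (em0 is just an example interface name):

  netstat -i          # Ierrs/Oerrs per interface
  tcpdump -n -i em0   # raw look at the traffic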


#############
Conclusion: #
#############
I'll receive a fiber directly to my front door from the new peer shortly, i.e. we'll bypass the copper-fiber-copper connection. I don't like not being able to pinpoint the problem before moving on, but I have no way of seeing what's going on on the "other side". I have an idea that the Cisco box and the converters do not like each other, but again it's only a guess.

What I do know is that an error-prone connection combined with a well-connected BGP peer can jeopardize an entire BGP router's performance. BGP cannot "see" how well the connection is running - it can only see link, and link = traffic = congestion.

I cannot claim to have found the 'holy grail' of BGP troubleshooting, but I can rightfully claim that I've eliminated my OpenBGPD as the source of error (both as i386 and amd64), and I can also rightfully claim to have found a few settings that actually make a difference. If the carrier finds the problem and informs me, I will of course inform all of you as well.



######################
Physical connection: #
######################
We are terminating with this carrier on an FE port, but due to the distance between them and us at the datacenter location, a fiber link (via media converters) was placed in between like:

[our router]----[100baseTX]----[IMC**]----//..fiber..//----[IMC**]----[100baseTX]----[switch integrated in a Cisco 7200 iron]----[Cisco iron itself/router]

* Attenuation on the fiber part was 1.2 dB and 1.3 dB respectively, which is not brilliant, but okay. More importantly, it's within the specifications of the IMCs.

** (IMC = MOXA Industrial Media Converter 101, a.k.a. IMC-101, for both single- and multi-mode / SC connectors. We even replaced these with the MOXA EDS-208-M-SC (larger model) as well).

All Cat6 STP cables have been replaced more than once, and the fiber once.
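
With media converters in the path, autonegotiation is a usual suspect, so it may be worth pinning the media settings on your own end (em0 is just an example interface name):

  ifconfig em0                                       # check the negotiated media
  ifconfig em0 media 100baseTX mediaopt full-duplex  # force it if need be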


###########
Hardware: #
###########
My OpenBGPD setup is plain-vanilla with 4 BGP peers, one eBGP peer and two public networks on the inside (700+ servers). The BGP box I have (OpenBSD 3.9 -stable / amd64 / bsd.mp) is a "serverworks"-based box with 2GB of RAM per CPU, Intel PRO/1000MT dual and quad server NICs, U320 SCSI etc., etc. - i.e. this is not about exhaustion due to inferior or inadequate hardware.
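
By plain-vanilla I mean nothing fancier than this kind of bgpd.conf (AS numbers, addresses and descriptions below are made up for the example):

  AS 65001
  router-id 192.0.2.1
  network 192.0.2.0/24

  neighbor 198.51.100.1 {
          remote-as 65002
          descr "upstream-1"
  }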
My network performance related sysctl settings:
net.inet.ip.ifq.maxlen=250
kern.maxclusters=32768 (this has been tested stepwise (~6500 at a time) from the default setting [6144] and up)
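
For completeness, the same two settings as they would go in /etc/sysctl.conf to survive a reboot:

  net.inet.ip.ifq.maxlen=250
  kern.maxclusters=32768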

Note_0: normally I run this on an i386 Xeon-based box with 4GB of RAM, but the box is down for upgrade/maintenance, hence the temporary amd64 arch.

Note_1: the new boxes I'm building have a 64-bit Xeon CPU, 2GB of RAM, SysKonnect NICs and i386 as arch.


#############
A few tips: #
#############
The tips below are all "confirmed successes" - a mixture of my own experience, what I've been told by Henning/Claudio, and what I've seen on this list (some of the sysctl settings).
The important thing is that they actually work.

0 - run busy BGP routers on i386 rather than amd64.

1 - run busy BGP routers on [serverworks-based] single-CPU systems.

2 - run busy BGP routers with 2GB of memory at the most.
On a healthy box, going from 4GB of RAM to 2GB gives a drop of almost 20% in 'Kbytes allocated to network'.
>2GB of RAM is in fact counterproductive ..!

3 - work carefully with kern.maxclusters - don't just raise it.
The effect actually differs between archs; see the sketch below.
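
A stepwise approach could look like this - raise, let it settle, measure, and only continue if it actually buys you headroom (step size taken from my own testing above):

  sysctl kern.maxclusters        # note the current value
  sysctl kern.maxclusters=12644  # one step (~6500) up
  netstat -m                     # wait a bit, then re-check '% in use'
  # take the next step only if the usage actually improves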




Hope you can use some of it.
All the best to you all,

/per
[EMAIL PROTECTED]
