Hi all,
Just to make sure nobody's sitting and wondering what happened with this
thread, here's a final mail with a short description of what's
cooking right now and what was boiling back then.
Below you'll find:
- case
- situation
- conclusion
- physical connection
- hardware
- a few tips
#######
Case: #
#######
When I added another BGP peer to my router, the overall network/routing
performance on the server was brought to a near halt until I downed the
BGP session again.
############
Situation: #
############
At first I had warp speed on the wire and all tests on the connection
(*) seemed okay.
Trivialities like speed, duplex, MTU settings etc. were agreed upon
before the connection was established.
The time elapsed from initiating the BGP session to severe performance
degradation was <2 minutes, and if I did not down the BGP session within
the next minute (literally), routing and network performance would
drop like a piano out of the sky. In short, I was exhausting mbufs
("Kbytes allocated to network" >97%).
Raising kern.maxclusters stepwise gave me a short-lived break until I
reached a given point (see tips below). Above that I gained nothing and
stopped raising it any further.
The new carrier had a lot of alignment errors (CRC/FCS) and packet size
problems (Jabbers/rxOversizedPkts) in their log / on "their" side. We
both had heavy packet losses after these few minutes.
'tcpdump' did not reveal any significant signs of a sick connection on
my side.
A lot of testing has been done since. The connection, however, is still
not running, but adjustments on the peer's side and replacements of the
connection itself have raised the "panic threshold" from <2 min. to
around 18 min. before disaster strikes.
#############
Conclusion: #
#############
I'll receive a fiber directly to my front door from the new peer shortly
i.e. we'll bypass the copper-fiber-copper connection. I don't like not
being able to pinpoint the problem before moving on, but I have no way
of seeing what's going on on the "other side". I have an idea that the
Cisco box and the converters do not like each other, but again it's only
a guess.
What I do know is that an error-prone connection combined with a well
connected BGP peer can jeopardize an entire BGP router's performance.
BGP can not "see" how well the connection is running - it can only see
link, and link = traffic = congestion.
I can not claim to have found the "holy grail" of BGP troubleshooting,
but I can rightfully claim that I've eliminated my OpenBGPD as the
source of error (both as i386 and amd64), and I can also rightfully
claim to have found a few settings that actually make a difference.
If the carrier finds the problem and informs me, I will of course inform
all of you as well.
######################
Physical connection: #
######################
We are terminating with this carrier on an FE port, but due to the
distance between them and us at the datacenter location, a fiber
link was placed in between, like:
[our router]----[100baseTX]----[IMC**]----//..fiber..//----[IMC**]----
[100baseTX]----[switch integrated in a Cisco 7200 iron]----[Cisco iron
itself/router]
* Attenuation on the fiber part was 1.2 dB and 1.3 dB respectively,
which is not brilliant, but okay. More importantly, it's within the
specifications of the IMCs.
** (IMC = MOXA Industrial Media Converter 101, a.k.a. IMC-101, for both
single- and multi-mode / SC connectors. We even replaced these with the
MOXA EDS-208-M-SC (larger model) as well.)
All Cat6 STP cables have been replaced more than once, and the fiber once.
###########
Hardware: #
###########
My OpenBGPD setup is plain-vanilla with 4 BGP peers, one eBGP peer and
two public networks on the inside (700+ servers).
The BGP box I have (OpenBSD 3.9 -stable / amd64 / bsd.mp) is a
"serverworks" based box with 2GB of RAM per CPU, Intel PRO/1000MT dual
and quad server NICs, U320 SCSI etc., etc. -> i.e. this is not about
exhaustion due to inferior or inadequate hardware.
My network performance related sysctl settings:
net.inet.ip.ifq.maxlen=250
kern.maxclusters=32768 (this has been tested stepwise (~6500 at a
time) from the default setting [6144] and up)
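For reference, a sketch of how these could be made persistent in
/etc/sysctl.conf so they survive a reboot. The values simply mirror the
ones above; the right kern.maxclusters figure depends on your own
stepwise testing, as noted:

```shell
# /etc/sysctl.conf - applied at boot
# IP input queue length (raised from the default)
net.inet.ip.ifq.maxlen=250
# mbuf cluster limit; raise stepwise from the default (6144) while
# watching "Kbytes allocated to network" in 'netstat -m'
kern.maxclusters=32768
```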
Note_0: normally I run this on an i386 Xeon based box with 4GB of RAM,
but that box is down for upgrade/maintenance, hence the temporary amd64 arch.
Note_1: the new boxes I'm building have a 64-bit Xeon CPU, 2GB of RAM,
SysKonnect NICs and i386 as arch.
#############
A few tips: #
#############
The tips I've put below are all "confirmed successes" and a mixture of
experience, what I've been told by Henning/Claudio and what I've seen on
this list (some of the sysctl settings).
The important thing is that they actually work.
0 - run busy BGP routers on i386 rather than amd64.
1 - run busy BGP routers on [serverworks based] single cpu systems.
2 - run busy BGP routers on 2GB of memory at the most.
On a healthy box, going from 4GB of RAM to 2GB gives a drop of almost
20% in "Kbytes allocated to network".
>2GB of RAM is in fact counterproductive ..!
3 - work carefully with kern.maxclusters - don't just raise it.
The effect actually differs between arch.
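To put tip 3 into practice, here is a rough sketch of what I mean by
working carefully (the "Kbytes allocated to network" line comes from
'netstat -m' on OpenBSD; the exact wording may vary between releases,
and the step value here is just the one that worked for me):

```shell
# In one terminal, watch mbuf usage while the BGP session is up; if the
# percentage keeps climbing toward 100%, down the session before the
# box stalls:
while sleep 5; do netstat -m | grep 'Kbytes allocated to network'; done

# Then, only if usage stays pegged, raise kern.maxclusters in modest
# steps (~6500 at a time, as above) rather than jumping to a huge value:
sysctl -w kern.maxclusters=12288
```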
Hope you can use some of it.
The best to you all,
/per
[EMAIL PROTECTED]