* Matt Hamilton <ma...@netsight.co.uk> [2012-05-29 12:02]:
> Stuart Henderson <stu <at> spacehopper.org> writes:
> > cron job to restart it, with a random delay to avoid two machines
> > coming back up at the same time when all the routers at a site
> > fail together...
> So you just check it every minute to see if it is alive?
>
> It seems to me to be a pretty fundamental design flaw in the software given
> its role. I would expect it to return sending a packet or something, not
> just exit.

it doesn't exit under normal circumstances. bgpd is used in a lot of
places, some extremely large ones too. you'd be surprised. and no,
they don't deal with "bgpd exiting constantly" or however you called
it, not at all.

> > > The first message below seems to indicate unable to allocate
> > > memory. I'm running these boxes pretty much stock having not tuned any
> > > parameters at all. Both are just running routing daemons (bgpd, ospf)
> > > and the 4.3 box is running OpenVPN. There are no applications running
> > > and both boxes have plenty of RAM (4GB) and not using any swap or
> > > anything.
> > >
> > > Is there something I should look at tuning in terms
> > > of memory allocation in order to stop this happening?
> >
> > Make sure login.conf memory limits for the daemon class (or the
> > _bgpd class on a newer OS version using /etc/rc.d) are high enough.
> > If your limits are insufficient for the size of routing table then
> > obviously you will have a problem. But also there is a bug
> > somewhere, possibly to do with nexthop changes, which can result
> > in very rapidly increasing memory use.

this bug is hard to trigger and we have not been able to identify a
pattern here, except that it involves iBGP.

-- 
Henning Brauer, h...@bsws.de, henn...@openbsd.org
BS Web Services, http://bsws.de, Full-Service ISP
Secure Hosting, Mail and DNS Services. Dedicated Servers, Root to Fully Managed
Henning Brauer Consulting, http://henningbrauer.com/
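
For reference, the cron-based restart with a random delay that Stuart describes in the quoted text could look roughly like the sketch below. It is only an illustration, not anyone's actual script: the function names, the 45-second cap, the use of pgrep -x to detect the daemon, and the /etc/rc.d/bgpd start command (which only applies to releases that ship rc.d scripts) are all assumptions.

```python
import random
import subprocess
import time


def bgpd_is_running():
    """Return True if pgrep finds a process named exactly 'bgpd'."""
    result = subprocess.run(["pgrep", "-x", "bgpd"], stdout=subprocess.DEVNULL)
    return result.returncode == 0


def start_bgpd():
    """Start the daemon. The rc.d path is an assumption (newer releases only)."""
    subprocess.run(["/etc/rc.d/bgpd", "start"], check=False)


def watchdog(is_running=bgpd_is_running, start=start_bgpd,
             sleep=time.sleep, max_delay=45.0, rng=random):
    """Restart bgpd after a random delay if it is not running.

    Returns the delay slept, or None if bgpd was already running.
    The checks and the restart action are parameters so they can be
    swapped out (for example, for the older rc.local style of startup).
    """
    if is_running():
        return None
    # Random delay so that routers which failed together do not all
    # bring their sessions back up at the same moment.
    delay = rng.uniform(0, max_delay)
    sleep(delay)
    # Check again in case something else restarted it during the delay.
    if not is_running():
        start()
    return delay
```

A script that ends with a call to watchdog() could be run from root's crontab once a minute. The default cap of 45 seconds is chosen so that one run finishes before the next one starts.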