On Wed, Nov 25, 2015 at 05:08:27PM +0100, Thorleif Wiik [BCIX] wrote:
> Hi,
>
> OpenBGPd on OpenBSD 5.8 (with all patches applied) is crashing during
> startup.
>
> On a second box with 5.7 and the same hardware/configuration there are no
> problems.
> OpenBGPd is configured as route-server with 118 v4/v6 peers and about
> 35300 IPv4 and 14800 IPv6 routes.
>
>
> Any tips for configuration changes to prevent this on 5.8?

Something in the session engine corrupted some memory; the question now
is what. Is it possible to get a backtrace of the session engine? See
sysctl(8) at the bottom on how to use kern.nosuidcoredump=3 to get a core
file.

I wonder whether the SE prints anything before it explodes. Is it possible
to get more of the log?

The "poll fd" errors are a red herring: that message is logged when
imsg_read() returns 0 (connection closed), a case where errno has not been
set, so it should not be printed. See the diff at the end of this mail.

> Nov 25 13:41:41 route-server bgpd[22856]: startup
> Nov 25 13:41:41 route-server bgpd[22856]: rereading config
> Nov 25 13:41:41 route-server bgpd[30006]: route decision engine ready
> Nov 25 13:43:34 route-server bgpd[30006]: RDE reconfigured
>
> .. many many prefixes
>
> Nov 25 13:45:45 route-server bgpd[30006]: handle_pollfd: poll fd: No buffer space available
> Nov 25 13:45:45 route-server bgpd[30006]: RDE: Lost connection to SE
> Nov 25 13:45:46 route-server bgpd[30006]: handle_pollfd: poll fd: No buffer space available
> Nov 25 13:45:46 route-server bgpd[30006]: RDE: Lost connection to SE control
> Nov 25 13:45:46 route-server bgpd[22856]: handle_pollfd: poll fd: Invalid argument
> Nov 25 13:45:46 route-server bgpd[22856]: main: Lost connection to SE
> Nov 25 13:45:46 route-server bgpd[22856]: Lost child: session engine terminated; signal 11
> Nov 25 13:45:46 route-server bgpd[30006]: route decision engine exiting
>
>
>
> Thanks, Thorleif
>
>
> --
> Thorleif Wiik, CTO
> thorleif.w...@bcix.de
>
> Tel: +49 160 90378641
>
> BCIX Management GmbH / BCIX e.V.
> Stromstrasse 5
> 10555 Berlin - Germany
>
> http://www.bcix.de/
> https://twitter.com/bcix <http://twitter.com/bcix>
> https://www.facebook.com/BCIX.Internet.Exchange
>

-- 
:wq Claudio

Index: bgpd.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.c,v
retrieving revision 1.182
diff -u -p -r1.182 bgpd.c
--- bgpd.c	20 Nov 2015 23:26:08 -0000	1.182
+++ bgpd.c	25 Nov 2015 20:47:34 -0000
@@ -903,21 +903,21 @@ handle_pollfd(struct pollfd *pfd, struct
 
 	if (pfd->revents & POLLOUT)
 		if (msgbuf_write(&i->w) <= 0 && errno != EAGAIN) {
-			log_warn("handle_pollfd: msgbuf_write error");
+			log_warn("imsg write error");
 			close(i->fd);
 			i->fd = -1;
 			return (-1);
 		}
 
 	if (pfd->revents & POLLIN) {
-		if ((n = imsg_read(i)) == -1) {
-			log_warn("handle_pollfd: imsg_read error");
+		if ((n = imsg_read(i)) == -1 && errno != EAGAIN) {
+			log_warn("imsg read error");
 			close(i->fd);
 			i->fd = -1;
 			return (-1);
 		}
-		if (n == 0) { /* connection closed */
-			log_warn("handle_pollfd: poll fd");
+		if (n == 0) {
+			log_warnx("peer closed imsg connection");
 			close(i->fd);
 			i->fd = -1;
 			return (-1);
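
For illustration only, and not part of the diff: a minimal C sketch of why
the old "poll fd" message was misleading. It assumes that a log_warn()-style
function appends strerror(errno) to its message while a log_warnx()-style
function does not, which matches the log lines quoted above. All sketch_*
names are hypothetical stand-ins, not bgpd functions, and they write into a
buffer instead of syslog so the result can be inspected.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for a log_warn()-style call: appends ": <strerror(errno)>". */
static void
sketch_warn(char *buf, size_t len, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	snprintf(buf, len, "%s: %s", msg, strerror(errno));
}

/* Stand-in for a log_warnx()-style call: message only, errno is ignored. */
static void
sketch_warnx(char *buf, size_t len, const char *msg)
{
	assert(buf != NULL && msg != NULL);
	snprintf(buf, len, "%s", msg);
}

/*
 * Hypothetical reader that reports "peer closed the connection" by
 * returning 0.  Like a read that hits EOF, it leaves errno untouched.
 */
static int
sketch_read_eof(void)
{
	return (0);
}

/*
 * Fills both buffers for the n == 0 case and returns 1 if the warn-style
 * message picked up the stale errno value, 0 otherwise.
 */
static int
sketch_demo(char *with_errno, char *without_errno, size_t len)
{
	/* Assumption: errno is left over from an earlier, unrelated failure. */
	errno = ENOBUFS;

	if (sketch_read_eof() == 0) {
		sketch_warn(with_errno, len, "handle_pollfd: poll fd");
		sketch_warnx(without_errno, len, "peer closed imsg connection");
	}
	return (strstr(with_errno, strerror(ENOBUFS)) != NULL);
}
```

Calling sketch_demo() with two buffers leaves the first one carrying the
text of the unrelated ENOBUFS error, much like the "No buffer space
available" lines in the log, while the second holds only the message
itself. That is the reason for switching the n == 0 case to log_warnx().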