On Wed, Nov 25, 2015 at 05:08:27PM +0100, Thorleif Wiik [BCIX] wrote:
> Hi,
> 
> OpenBGPd on OpenBSD  5.8 (with all patches applied) is crashing during
> startup.
> 
> On a second box with 5.7 and the same hardware/configuration there are no
> problems.
> OpenBGPd is configured as a route server with 118 v4/v6 peers and about
> 35300 IPv4 and 14800 IPv6 routes.
> 
> 
> Any tips for configuration changes to prevent this on 5.8?

Something in the session engine corrupted memory; the question now is
what.  Is it possible to get a backtrace of the session engine?
See the bottom of sysctl(8) for how to use kern.nosuidcoredump=3 to get a
core file.

I wonder if the SE prints something before it explodes. Is it possible
to get more of the log?

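The "poll fd" errors are a red herring: they are logged when imsg_read()
returns 0 because the peer closed the connection. errno is not set in that
case, so it should not be printed; the message shown is whatever an earlier
call left behind. See the diff at the end of this mail.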
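
To illustrate the red-herring point, here is a minimal sketch (not bgpd
code). It assumes a plain socketpair in place of the imsg pipe, and it
plants ENOBUFS in errno by hand to stand in for a stale value left by some
earlier call. The helper names are hypothetical. read(2) reports EOF by
returning 0, which is not an error, so errno carries no information about
it; a log_warn()-style message that appends strerror(errno) then prints an
unrelated error string, while a log_warnx()-style message prints only the
text.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: read from one end of a socketpair after the other
 * end has been closed.  Returns what read(2) returned (0 on EOF, -1 on
 * error) and stores the errno value observed afterwards in *saved_errno.
 */
static ssize_t
read_after_peer_close(int *saved_errno)
{
	int sv[2];
	char buf[16];
	ssize_t n;

	assert(saved_errno != NULL);
	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1) {
		*saved_errno = errno;
		return (-1);
	}
	close(sv[1]);		/* the peer goes away, like a crashed SE */

	errno = ENOBUFS;	/* assumption: simulate a stale errno value */
	n = read(sv[0], buf, sizeof(buf));
	*saved_errno = errno;	/* EOF is not an error; this value is not meaningful */
	close(sv[0]);
	return (n);
}

/* Builds a message the way a log_warn()-style call would: text plus strerror. */
static void
format_like_log_warn(char *out, size_t len, const char *msg, int err)
{
	snprintf(out, len, "%s: %s", msg, strerror(err));
}

/* Builds a message the way a log_warnx()-style call would: text only. */
static void
format_like_log_warnx(char *out, size_t len, const char *msg)
{
	snprintf(out, len, "%s", msg);
}
```

Calling read_after_peer_close() and passing the saved errno to
format_like_log_warn() yields a "poll fd: <some error text>" line that
says nothing about why the connection went away, which is why the n == 0
branch is better served by the errno-free variant.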
 
> Nov 25 13:41:41 route-server bgpd[22856]: startup
> Nov 25 13:41:41 route-server bgpd[22856]: rereading config
> Nov 25 13:41:41 route-server bgpd[30006]: route decision engine ready
> Nov 25 13:43:34 route-server bgpd[30006]: RDE reconfigured
> 
> .. many many prefixes
> 
> Nov 25 13:45:45 route-server bgpd[30006]: handle_pollfd: poll fd: No buffer space available
> Nov 25 13:45:45 route-server bgpd[30006]: RDE: Lost connection to SE
> Nov 25 13:45:46 route-server bgpd[30006]: handle_pollfd: poll fd: No buffer space available
> Nov 25 13:45:46 route-server bgpd[30006]: RDE: Lost connection to SE control
> Nov 25 13:45:46 route-server bgpd[22856]: handle_pollfd: poll fd: Invalid argument
> Nov 25 13:45:46 route-server bgpd[22856]: main: Lost connection to SE
> Nov 25 13:45:46 route-server bgpd[22856]: Lost child: session engine terminated; signal 11
> Nov 25 13:45:46 route-server bgpd[30006]: route decision engine exiting
> 
> 
> 
> Thanks, Thorleif
> 
> 
> -- 
> Thorleif Wiik, CTO
> thorleif.w...@bcix.de
> 
>  Tel: +49 160 90378641
> 
> BCIX Management GmbH / BCIX e.V.
> Stromstrasse 5
> 10555 Berlin - Germany
> 
> http://www.bcix.de/
> https://twitter.com/bcix <http://twitter.com/bcix>
> https://www.facebook.com/BCIX.Internet.Exchange
> 

-- 
:wq Claudio


Index: bgpd.c
===================================================================
RCS file: /cvs/src/usr.sbin/bgpd/bgpd.c,v
retrieving revision 1.182
diff -u -p -r1.182 bgpd.c
--- bgpd.c      20 Nov 2015 23:26:08 -0000      1.182
+++ bgpd.c      25 Nov 2015 20:47:34 -0000
@@ -903,21 +903,21 @@ handle_pollfd(struct pollfd *pfd, struct
 
        if (pfd->revents & POLLOUT)
                if (msgbuf_write(&i->w) <= 0 && errno != EAGAIN) {
-                       log_warn("handle_pollfd: msgbuf_write error");
+                       log_warn("imsg write error");
                        close(i->fd);
                        i->fd = -1;
                        return (-1);
                }
 
        if (pfd->revents & POLLIN) {
-               if ((n = imsg_read(i)) == -1) {
-                       log_warn("handle_pollfd: imsg_read error");
+               if ((n = imsg_read(i)) == -1 && errno != EAGAIN) {
+                       log_warn("imsg read error");
                        close(i->fd);
                        i->fd = -1;
                        return (-1);
                }
-               if (n == 0) { /* connection closed */
-                       log_warn("handle_pollfd: poll fd");
+               if (n == 0) {
+                       log_warnx("peer closed imsg connection");
                        close(i->fd);
                        i->fd = -1;
                        return (-1);
