I've been taking a look on the machine. The EAGAIN is not from the
SE<>RDE socketpair but from the TCP socket; I think there was some
confusion between ktrace from one run and fstat from another before.

When pushing routes out to the neighbour things stop progressing:

Neighbor                   AS    MsgRcvd    MsgSent  OutQ Up/Down  State/PrfRcvd
91.232.181.40           50419         23      74541 73637 00:00:22      1

Proto   Recv-Q Send-Q Recv-W Send-W Cgst-W  Local Address          Foreign Address        (state)
tcp          0  20124  17028  14080   1548  91.232.181.2.179       91.232.181.40.52546    ESTABLISHED

It looks like there is an MTU mismatch: tgreer's router is set to
1600 and the MSS observed in the SYNs from both sides matches this, but
large packets (e.g. ping -Ds 1520) don't actually make it through.
I am pretty sure this is why the TCP session isn't progressing, though
even so it's distinctly sub-optimal for bgpd to hang because of it. ;)

Attaching to a "stuck" SE we have this:

(gdb) bt
#0  0x00000316b86042ba in sendmsg () at <stdin>:2
#1  0x00000316b2ab430f in msgbuf_write (msgbuf=0x316a9654508) at /usr/src/lib/libutil/imsg-buffer.c:261
#2  0x00000314a8e0a104 in change_state (peer=0x316a9654000, state=STATE_IDLE, event=EVNT_RCVD_NOTIFICATION)
    at /usr/src/usr.sbin/bgpd/session.c:938
#3  0x00000314a8e09f0b in bgp_fsm (peer=0x316a9654000, event=EVNT_RCVD_NOTIFICATION)
    at /usr/src/usr.sbin/bgpd/session.c:880
#4  0x00000314a8e0c4e0 in session_process_msg (p=0x316a9654000) at /usr/src/usr.sbin/bgpd/session.c:1868
#5  0x00000314a8e09279 in session_main (pipe_m2s=0x7f7ffffee5c0, pipe_s2r=0x7f7ffffee5a0,
    pipe_m2r=0x7f7ffffee5b0, pipe_s2rctl=0x7f7ffffee590) at /usr/src/usr.sbin/bgpd/session.c:572
#6  0x00000314a8e062a2 in main (argc=0, argv=0x7f7ffffee670) at /usr/src/usr.sbin/bgpd/bgpd.c:214

So the code in change_state() says this:

                /*
                 * try to write out what's buffered (maybe a notification),
                 * don't bother if it fails
                 */
                if (peer->state >= STATE_OPENSENT && peer->wbuf.queued)
                        msgbuf_write(&peer->wbuf);

The old imsg code in bgpd did this:

-       if ((n = sendmsg(msgbuf->fd, &msg, 0)) == -1) {
-               if (errno == EAGAIN || errno == ENOBUFS ||
-                   errno == EINTR)     /* try later */
-                       return (0);
-               else
-                       return (-1);
-       }

Whereas the new code since it was split into libutil...

again:
        if ((n = sendmsg(msgbuf->fd, &msg, 0)) == -1) {
                if (errno == EAGAIN || errno == EINTR)
                        goto again;
                if (errno == ENOBUFS)
                        errno = EAGAIN;
                return (-1);
        }

...retries immediately on EAGAIN, so while the TCP socket stays full
the SE spins in sendmsg(), causing the hang.
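
For illustration only, here is a rough sketch of the "try later"
behaviour the old bgpd code had, pulled out into a standalone helper.
write_once() and demo_full_socket() are made-up names, not anything in
libutil or bgpd, and the AF_UNIX socketpair with a peer that never reads
is only a stand-in for the stalled TCP session. I'm not proposing this
as the fix, just showing that a single attempt returning 0 on EAGAIN
lets the caller carry on instead of looping:

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of libutil): attempt one sendmsg() and
 * report "would block" to the caller instead of retrying.
 * Returns bytes written, 0 if the caller should try later, -1 on error.
 */
static ssize_t
write_once(int fd, const void *buf, size_t len)
{
	struct iovec iov;
	struct msghdr msg;
	ssize_t n;

	iov.iov_base = (void *)buf;
	iov.iov_len = len;
	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	if ((n = sendmsg(fd, &msg, 0)) == -1) {
		if (errno == EAGAIN || errno == ENOBUFS || errno == EINTR)
			return (0);	/* try later, as the old code did */
		return (-1);
	}
	return (n);
}

/*
 * Fill a non-blocking socketpair whose peer never reads, then stop at
 * the first write that does not make progress.
 * Returns the result of that last write_once(), or -2 on setup failure.
 */
static int
demo_full_socket(void)
{
	char chunk[4096];
	int sv[2], flags, i;
	ssize_t n = -2;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
		return (-2);
	if ((flags = fcntl(sv[0], F_GETFL)) == -1 ||
	    fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1) {
		close(sv[0]);
		close(sv[1]);
		return (-2);
	}
	memset(chunk, 'x', sizeof(chunk));
	/* bounded loop: exits as soon as the socket buffer is full */
	for (i = 0; i < 100000; i++) {
		n = write_once(sv[0], chunk, sizeof(chunk));
		if (n <= 0)
			break;
	}
	close(sv[0]);
	close(sv[1]);
	return ((int)n);
}
```

Once the socket buffer is full, write_once() hands back 0 and the caller
returns to its event loop, which is the case where the "goto again"
version keeps calling sendmsg().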
