On Tue, May 29, 2012 at 04:21:12PM +0000, Matt Hamilton wrote:
> I will happily supply what I can. Just let me know how.
Hello,

I've never used bgpd personally, but perhaps I can help you get a backtrace. There are two possible ways to get one.

1. Make bgpd dump core

Recompile bgpd with debugging symbols (CFLAGS+=-g, LDFLAGS+=-g) and install that. Check the home directory of the _bgpd user and make that directory writable for the _bgpd user. If, after another crash, a bgpd.core file pops up, you've got it. You can test this by sending bgpd a SIGABRT; if it doesn't core, something is wrong, see #2. Then run 'gdb /usr/sbin/bgpd bgpd.core' and type 'backtrace' within gdb; type 'quit' to exit gdb. Keep the bgpd.core file around by saving it to another location, as it will be overwritten with each subsequent segfault.

2. Attach gdb to the process and wait

Recompile bgpd with debugging symbols (CFLAGS+=-g, LDFLAGS+=-g) and install that. su to root, start a tmux session, and from within tmux attach to the bgpd process: "gdb /usr/sbin/bgpd <pid of bgpd>". Once you're attached, bgpd will temporarily stop running; just type "continue" (and make sure you don't set any breakpoints). You can now wait until bgpd crashes on signal 11. gdb will drop back to its command line, and you can type 'backtrace' within gdb; type 'quit' to exit. When you get to it after the crash, attach to the tmux session with "tmux att -d" and you will have the gdb command line in front of you. Even better than just a backtrace is moving up and down the stack to see where the program crashed; Google for gdb commands.

3. Ask someone else who may have better ideas.

> Although as you said in another post
> it is hard to replicate. All I seem to be able to see is that this happens
> during some period of network instability. It seems that there is a
> ripple affect that something happens and that then causes a bgpd
> process to die which then propagates more changes to iBGP peers
> and they then sometimes die as well.
>
> -Matt

Cheers,
-peter
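For reference, the two approaches above might look roughly like the following shell session. This is a sketch, not a tested recipe: the source path assumes a stock OpenBSD source tree, the _bgpd home directory and pid selection are assumptions, and bgpd actually runs as several processes (parent, session engine, route decision engine), so you may need to pick the pid of the one that is crashing.

```shell
# Method 1: rebuild bgpd with debug symbols so the core is usable.
# Path assumes the standard OpenBSD source layout.
cd /usr/src/usr.sbin/bgpd
make clean
make CFLAGS+=-g LDFLAGS+=-g
make install

# Let the _bgpd user write its core file into its home directory
# (adjust the path to the actual _bgpd home directory on your system).
chmod u+w ~_bgpd

# Verify that coredumps work: force an abort and look for bgpd.core.
pkill -ABRT bgpd
ls ~_bgpd/bgpd.core

# Inspect the core, then save it elsewhere before the next crash
# overwrites it.
gdb /usr/sbin/bgpd ~_bgpd/bgpd.core
# (gdb) backtrace
# (gdb) quit
cp ~_bgpd/bgpd.core /root/bgpd.core.saved

# Method 2: attach gdb inside tmux and wait for the SIGSEGV.
tmux new-session
pgrep -l bgpd                  # list bgpd pids; pick the crashing one
gdb /usr/sbin/bgpd <pid of bgpd>
# (gdb) continue               # don't set breakpoints; let it run
# ... detach from tmux and wait for the crash ...
# tmux att -d                  # reattach: gdb is sitting at its prompt
# (gdb) backtrace
```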