Thank you for the information on debugging, although I am not sure that I will know what to do with the output once I get it. I'm also not positive that the program is actually "crashing" either.
Here is what is going on network-wise, which I think may be contributing to the problem. I can align the crash with the entry for one LDP neighbor disappearing from the ARP table:

Host                 Ethernet Address    Netif  Expire   Flags
100.92.64.68         00:b7:71:03:32:95   hvn0   17m46s

When that expire counter hits zero, I am not able to ping the neighbor, and then my LDP session drops and the process quits with signal 10.

I ran a packet capture of this and saw no ARP request go out for some time after the counter hit zero. By the time one did go out it was too late and LDP had already stopped.

I have verified the TCAM in each switch in between, and both my hvn0 interface MAC and the ASR920 MAC are listed with the correct ports. So I don't believe that the underlying network is causing the issue.

I've just configured a static, permanent ARP entry, so we'll see what happens with that, but that's more of a workaround than a fix. I may also configure a new VM on -current just to see if that makes any difference.

On Sun, Mar 3, 2019 at 6:02 AM Stuart Henderson <s...@spacehopper.org> wrote:
>
> On 2019-03-03, Henry Bonath <he...@thebonaths.com> wrote:
> > To elaborate, after enabling ldpd with -vvvv I have observed the
> > following in the log output:
> >
> > Mar 3 00:58:37 mpls-gw ldpd[77048]: nbr_fsm: event SESSION CLOSE
> > resulted in action CLOSE SESSION and changing state for lsr-id
> > 100.92.64.68 from OPERATIONAL to PRESENT
> > Mar 3 00:58:37 mpls-gw ldpd[77048]: session_close: closing session
> > with lsr-id 100.92.64.68
> > Mar 3 00:58:37 mpls-gw ldpd[9902]: label decision engine exiting
> > Mar 3 00:58:37 mpls-gw ldpd[54245]: kernel routing table decoupled
> > Mar 3 00:58:37 mpls-gw ldpd[54245]: waiting for children to terminate
> > Mar 3 00:58:37 mpls-gw ldpd[54245]: ldp engine terminated; signal 10
> > Mar 3 00:58:37 mpls-gw ldpd[54245]: terminating
>
> A backtrace (preferably from a copy of ldpd built with debug symbols)
> would likely be helpful.
>
> If you don't already have the source tree checked out you can fetch
> just ldpd:
>
> $ cvs -d anon...@anoncvs.openbsd.org:/cvs get -P -r OPENBSD_6_4 src/usr.sbin/ldpd
> $ cd src/usr.sbin/ldpd
> $ make obj && make clean && make DEBUG=-g
>
> Enable coredumps for priv-dropped processes, as shown in sysctl(8):
>
> # mkdir -m 700 /var/crash/ldpd
> # sysctl kern.nosuidcoredump=3
>
> Then:
>
> # obj/ldpd -vvvvvd
> <wait for crash>
> # ls /var/crash/ldpd
> 12345.core
> # gdb obj/ldpd /var/crash/ldpd/12345.core
> (gdb) bt full
>
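
To line up the ARP expiry with the ldpd log timestamps, a small watcher along these lines might help. This is only a rough sketch: the function names expire_to_seconds and watch_neighbor are made up for illustration, and it assumes arp(8) prints the Expire value in the fourth column in a form like 17m46s, as in the table output quoted in this message. Anything that is not an h/m/s duration (for example a permanent entry) is reported as -1.

```sh
#!/bin/sh
# Hypothetical helpers, not part of ldpd or arp(8).

# Convert an Expire value such as "17m46s" into seconds.
# Prints -1 for anything that is not a plain h/m/s duration.
expire_to_seconds() {
    printf '%s\n' "$1" | awk '
        !/^[0-9hms]+$/ { print -1; exit }
        {
            total = 0; rest = $0
            # Consume one "<number><unit>" chunk at a time.
            while (match(rest, /^[0-9]+[hms]/)) {
                n = substr(rest, 1, RLENGTH - 1) + 0
                u = substr(rest, RLENGTH, 1)
                total += n * (u == "h" ? 3600 : (u == "m" ? 60 : 1))
                rest = substr(rest, RLENGTH + 1)
            }
            # Leftover text means the value was not a clean duration.
            print (rest == "" ? total : -1)
        }'
}

# Log the neighbor's ARP expiry every 5 seconds with a syslog-style
# timestamp. Assumes the fourth column of the arp output is Expire.
watch_neighbor() {
    host=$1
    while :; do
        expire=$(arp -n "$host" 2>/dev/null |
            awk -v h="$host" '$1 == h { print $4 }')
        echo "$(date '+%b %e %H:%M:%S') $host" \
            "expire=${expire:-missing}" \
            "secs=$(expire_to_seconds "$expire")"
        sleep 5
    done
}

# Example (runs until interrupted):
#   watch_neighbor 100.92.64.68 >> /tmp/arp-watch.log
```

Running this next to ldpd -vvvv should show whether the session close always follows the entry going missing, and by how many seconds.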