Thank you for the information on debugging, although I am not sure
that I will know what to do with the output once I get it.
I'm also not positive that the program is actually "crashing" either.

Here is what is going on network-wise, which I think may be
contributing to the problem.

I can align the crash with the entry for one LDP neighbor disappearing
from the ARP table:
Host                                 Ethernet Address    Netif Expire    Flags
100.92.64.68                         00:b7:71:03:32:95   hvn0  17m46s
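
To line the countdown up against the ldpd log timestamps, the Expire
column can be pulled out of the table with a small helper. This is a
minimal sketch: the function name arp_expire is hypothetical, and it
assumes the column layout shown above (Host, Ethernet Address, Netif,
Expire) as printed by "arp -an".

```shell
# Print the Expire column for one host from arp-style output on stdin.
# Assumes the layout shown above: Host, Ethernet Address, Netif, Expire.
# Prints nothing if the host has no entry in the table.
arp_expire() {
    awk -v host="$1" '$1 == host { print $4 }'
}

# Possible use on the live system (not run here): log the countdown
# once a second with a syslog-style timestamp.
# while sleep 1; do
#     printf '%s %s\n' "$(date '+%b %e %H:%M:%S')" \
#         "$(arp -an | arp_expire 100.92.64.68)"
# done
```

An empty result in that log would mark the moment the entry vanished,
which can then be compared with the session_close line from ldpd.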

When that expire counter hits zero, I am not able to ping the
neighbor, and then my LDP session drops and the process quits with
signal 10.
I ran a packet capture of this: no ARP request went out for some time
after the counter hit zero, and by the time one did it was too late
and LDP had already stopped.

I have verified the TCAM in each switch in between, and both my hvn0
interface MAC and the ASR920 MAC are listed on the correct ports.
So I don't believe the underlying network is causing the issue.

I've just configured a static, permanent ARP entry, so we'll see what
happens with that, but that's more of a workaround than a fix.
I may also configure a new VM on -current just to see if that makes
any difference.
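
For the static entry mentioned above, one way to avoid mistyping the
MAC address is to generate the command from the existing table entry
and review it before running it as root. This is a sketch under
assumptions: the helper name arp_static_cmd is hypothetical, and the
"arp -s host ether_addr permanent" form is taken from arp(8) on
OpenBSD, so check the manual page for the release in use.

```shell
# Read arp-style output on stdin and print (not run) the command that
# would pin the given host's current MAC address as a permanent entry.
arp_static_cmd() {
    awk -v host="$1" '$1 == host {
        printf "arp -s %s %s permanent\n", $1, $2
    }'
}

# Possible use on the live system (not run here):
# arp -an | arp_static_cmd 100.92.64.68
```

The helper only prints the command, so nothing changes in the ARP table
until the printed line is run by hand.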



On Sun, Mar 3, 2019 at 6:02 AM Stuart Henderson <s...@spacehopper.org> wrote:
>
> On 2019-03-03, Henry Bonath <he...@thebonaths.com> wrote:
> > To elaborate, after enabling ldpd with -vvvv I have observed the
> > following in the log output:
> >
> > Mar  3 00:58:37 mpls-gw ldpd[77048]: nbr_fsm: event SESSION CLOSE
> > resulted in action CLOSE SESSION and changing state for lsr-id
> > 100.92.64.68 from OPERATIONAL to PRESENT
> > Mar  3 00:58:37 mpls-gw ldpd[77048]: session_close: closing session
> > with lsr-id 100.92.64.68
> > Mar  3 00:58:37 mpls-gw ldpd[9902]: label decision engine exiting
> > Mar  3 00:58:37 mpls-gw ldpd[54245]: kernel routing table decoupled
> > Mar  3 00:58:37 mpls-gw ldpd[54245]: waiting for children to terminate
> > Mar  3 00:58:37 mpls-gw ldpd[54245]: ldp engine terminated; signal 10
> > Mar  3 00:58:37 mpls-gw ldpd[54245]: terminating
>
> A backtrace (preferably from a copy of ldpd built with debug symbols)
> would likely be helpful.
>
> If you don't already have the source tree checked out you can fetch
> just ldpd:
>
> $ cvs -d anon...@anoncvs.openbsd.org:/cvs get -P -r OPENBSD_6_4 src/usr.sbin/ldpd
> $ cd src/usr.sbin/ldpd
> $ make obj && make clean && make DEBUG=-g
>
> Enable coredumps for priv-dropped processes, as shown in sysctl(8):
>
> # mkdir -m 700 /var/crash/ldpd
> # sysctl kern.nosuidcoredump=3
>
> Then:
>
> # obj/ldpd -vvvvvd
> <wait for crash>
> # ls /var/crash/ldpd
> 12345.core
> # gdb obj/ldpd /var/crash/ldpd/12345.core
> (gdb) bt full
>
>
