Hi Tobias,

On Fri, Apr 03, 2020 at 08:39:30AM +0000, Tobias Urdin wrote:
> Hello,
>
> We've seen an issue where if you perform an ospfctl reload and have a faulty
> configuration, for example an interface that doesn't exist, it dies (which is
> fair in itself) but the seq num for the database never catches up with the DR
> until the adjacency timer expires over and over again, can take up to 30
> minutes before it's back.
>
> I produce a failure with a faulty interface.
>
> Apr 3 10:03:46 router1 ospfd[36062]: fatal in rde: rde_nbr_new: unknown interface
> Apr 3 10:03:46 router1 ospfd[19043]: ospf engine exiting
> Apr 3 10:03:46 router1 ospfd[67917]: kernel routing table decoupled
> Apr 3 10:03:46 router1 ospfd[67917]: terminating

Can you tell us how this failure can be reproduced? ospfd is supposed to log
that a config reload failed and carry on with its old config.

> Upon startup we then get stuck in this loop of trying to get back.
>
> Apr 3 10:04:15 router1 ospfd[91965]: startup
> Apr 3 10:06:22 router1 ospfd[19699]: nbr_adj_timer: failed to form adjacency with x.x.x.1 on interface vmx0
> Apr 3 10:06:42 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27a9fd66 his 27a99b25
> Apr 3 10:08:22 router1 ospfd[19699]: nbr_adj_timer: failed to form adjacency with x.x.x.1 on interface vmx0
> Apr 3 10:09:17 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa6475 his 27a9fd69
> Apr 3 10:10:22 router1 ospfd[19699]: nbr_adj_timer: failed to form adjacency with x.x.x.1 on interface vmx0
> Apr 3 10:11:02 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:22 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:27 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:32 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:37 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:42 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:11:47 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27aa9109 his 27aa6476
> Apr 3 10:12:22 router1 ospfd[19699]: nbr_adj_timer: failed to form adjacency with x.x.x.1 on interface vmx0
> Apr 3 10:12:51 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27ab558d his 27aa910b
> Apr 3 10:12:51 router1 ospfd[19699]: recv_db_description: neighbor ID x.x.x.1: invalid seq num, mine 27ab558d his 27aa910b

Can you share a pcap file with the OSPF packets during this situation?

> It's like it cannot match the database with the DR until the
> DEFAULT_ADJ_TMOUT (120sec) timeout occurs and it starts all over again.
>
> Anybody seen this before? Should probably note that the DR in the other end
> is not a device running OpenOSPFD.

What device / software version is on the other end?

Thank you,
Remi