On Thu, 2019-10-03 at 19:04 +0200, Ondrej Zajicek wrote: > CAUTION: This email originated from outside of the organization. Do not click > links or open attachments unless you recognize the sender and know the > content is safe. > > > On Thu, Oct 03, 2019 at 06:36:05PM +0200, Ondrej Zajicek wrote: > > > me.dr is 0 because ifa->drip is 0, but should it have declared itself > > > as an eligible DR? > > > > Oh, you meant DR IP, not neghbor IP. I think that it is correct - the > > process should start with the node idea of DR IP, which is initially > > zero (RFC 2328 9.4. the first paragraph). > > Just reviewed the election code and it seems consistent to the algorithm > in RFC 2328. If no electable neighbor is there and me.dr / me.bdr is > initially zero, then it should elect itself as BDR in the first round > (also elect itself as DR by the ndr==NULL condition), and elect itself > as DR in the second election. > > Could you add some debug output to check the value of DR and BDR (both > ID and IP) after first and possibly second round? >
Collected some more logs when the state machine becomes stuck. It appears as the eth1 has been DR when this stuck state happens. Note that interface transitions from Down to DROther in one step and then maintains that state even after prio is changed from 0 to 5. 2019-10-04 19:34:17.443 <TRACE> ospfv2_1 < interface eth1 changes link 2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Interface eth1 changed state from DR to Loopback 2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Removing interface eth1 (10.210.138.68/30) from area 2.2.2.2 2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Full to Down 2019-10-04 19:34:17.443 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 removed ... 2019-10-04 19:34:17.443 <TRACE> kernel1 < interface eth1 changes link 2019-10-04 19:34:17.443 <TRACE> kernel2 < interface eth1 changes link 2019-10-04 19:34:17.443 <TRACE> direct1 < interface eth1 changes link ... 2019-10-04 19:34:22.562 <TRACE> ospfv2_1: Interface eth1 changed state from Loopback to Waiting 2019-10-04 19:34:22.611 <TRACE> ospfv2_1: HELLO packet received from nbr 10.210.138.70 on eth1 2019-10-04 19:34:22.611 <TRACE> ospfv2_1: New neighbor 10.210.138.70 on eth1, IP address 10.210.138.70 2019-10-04 19:34:22.611 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Down to Init 2019-10-04 19:34:22.611 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Init to 2-Way ... 2019-10-04 19:34:32.564 <TRACE> ospfv2_1: HELLO packet sent via eth1 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: DBDES packet received from nbr 10.210.138.70 on eth1 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: length 32 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: router 10.210.138.70 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: mtu 1500 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: imms I M MS 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: ddseq 948166447 2019-10-04 19:34:32.570 <TRACE> ospfv2_1: DBDES packet ignored - lesser state than ExStart ... 2019-10-04 19:34:47.370 <TRACE> ospfv2_1: HELLO packet sent via eth1 2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Interface eth1 changed state from Waiting to Down 2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Removing interface eth1 (10.210.138.68/30) from area 2.2.2.2 2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from 2-Way to Down 2019-10-04 19:34:47.370 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 removed ... 2019-10-04 19:35:40.066 <TRACE> ospfv2_1: Interface eth1 changed state from Down to DROther 2019-10-04 19:35:40.066 <TRACE> ospfv2_1: HELLO packet sent via eth1 2019-10-04 19:35:40.076 <TRACE> ospfv2_1: HELLO packet received from nbr 10.210.138.70 on eth1 2019-10-04 19:35:40.076 <TRACE> ospfv2_1: New neighbor 10.210.138.70 on eth1, IP address 10.210.138.70 2019-10-04 19:35:40.076 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Down to Init 2019-10-04 19:35:40.076 <TRACE> ospfv2_1: Neighbor 10.210.138.70 on eth1 changed state from Init to 2-Way 2019-10-04 19:35:40.109 <INFO> Reconfiguring 2019-10-04 19:35:40.109 <TRACE> ospfv2_1: Changing priority of eth1 from 0 to 5 2019-10-04 19:35:40.109 <TRACE> ospfv2_1: Reconfigured ... 2019-10-04 19:35:50.066 <TRACE> ospfv2_1: HELLO packet sent via eth1 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: DBDES packet received from nbr 10.210.138.70 on eth1 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: length 32 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: router 10.210.138.70 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: mtu 1500 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: imms I M MS 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: ddseq 948166575 2019-10-04 19:35:50.072 <TRACE> ospfv2_1: DBDES packet ignored - lesser state than ExStart Next some more logs collected with LOCAL_DEBUG. The final call of ospf_dr_election results in both DR and BDR 0. From that point, no moreelection attempts are done. ospfv2_1: Neighbor? on iface eth1 ospfv2_1: Iface eth1 can_do_adj=0 OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69) Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived SM on eth1. Event is 'WaitTimer' (B)DR election. 1: ndr 00000000, bdr bfbb8100 2: ndr bfbb8100, bdr 00000000 DR=10.210.138.69, BDR=0.0.0.0 Neighbor state machine for 10.210.138.70 on eth1, event AdjOK? ospfv2_1: Iface eth1 can_do_adj=1 OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69) Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived Neighbor state machine for 10.210.138.70 on eth1, event NegotiationDone OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69) Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived Neighbor state machine for 10.210.138.70 on eth1, event ExchangeDone ... OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.6) Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived Deleting LSA (Type: 2001 Id: 10.210.138.69 Rt: 10.210.138.69) from lsrtl for neighbor 10.210.138.70 Deleting LSA (Type: 2002 Id: 10.210.138.69 Rt: 10.210.138.69) from lsrtl for neighbor 10.210.138.70 SM on eth1. Event is 'LoopInd' Neighbor state machine for 10.210.138.70 on eth1, event KillNbr SM on eth1. Event is 'NeighborChange' OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 10.210.138.69) OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5) SM on eth1. Event is 'UnloopInd' SM on eth1. Event is 'InterfaceUp' OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5) Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived Neighbor state machine for 10.210.138.70 on eth1, event 2-WayReceived SM on eth1. Event is 'NeighborChange' ospfv2_1: Neighbor? on iface eth1 ospfv2_1: Iface eth1 can_do_adj=0 SM on eth1. Event is 'InterfaceDown' Neighbor state machine for 10.210.138.70 on eth1, event KillNbr SM on eth1. Event is 'NeighborChange' SM on eth1. Event is 'InterfaceUp' OSPF: RX hook called (iface eth1, src 10.210.138.70, dst 224.0.0.5) Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high Allocating OSPF hash of order 6: 64 hash_entries, 0 low, 256 high Neighbor state machine for 10.210.138.70 on eth1, event HelloReceived Neighbor state machine for 10.210.138.70 on eth1, event 2-WayReceived SM on eth1. Event is 'NeighborChange' (B)DR election. 1: ndr 00000000, bdr 00000000 2: ndr 00000000, bdr 00000000 DR=0.0.0.0, BDR=0.0.0.0 ospfv2_1: Iface eth1 can_do_adj=0 > -- > Elen sila lumenn' omentielvo > > Ondrej 'Santiago' Zajicek (email: santi...@crfreenet.org) > OpenPGP encrypted e-mails preferred (KeyID 0x11DEADC3, > wwwkeys.pgp.net) > "To err is human -- to blame it on a computer is even more so."