On 08/12/2016 13:56, Joe Holden wrote:
Hi guys,
I've just updated a couple of boxes to the Dec 7th snapshot and I'm
seeing some bizarre behaviour on one box, on one specific interface.
The box in question is an OSPF and BGP speaker. A couple of minutes
after boot, once the OSPF and BGP tables have loaded, the following
messages appear:
Dec 8 06:33:03 edge-pe-2 /bsd: arp_rtrequest: bad gateway value: em0
Dec 8 06:33:03 edge-pe-2 last message repeated 2 times
Dec 8 06:33:04 edge-pe-2 /bsd: arpresolve: X.X.X.X: incorrect arp information
Then some seconds later:
Dec 8 06:41:41 edge-pe-2 /bsd: arpresolve: unresolved and rt_expire == 0
At this point the arp entry for the neighbour in question has been
updated so that the lladdr is all zeros and the interface is simply '?'
according to arp -n.
The box it is paired with, which has a pretty much identical config,
doesn't exhibit the same problem, and this only occurs on the single em0
interface (the box has about 6 active interfaces in total, a mix of em
and ix).
I should clarify that the neighbour in question isn't a CARP address,
but rather the box this interface is directly connected to.
OpenBSD 6.0-current (GENERIC.MP) #19: Wed Dec 7 12:07:13 MST 2016
bu...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
I don't see any odd behaviour on the wire: according to pcap, the who-has
and its associated reply are each seen once, as expected, with the correct
lladdr, but at some point the entry gets overwritten as described above.
The previous kernel was about 2 months old, which leaves a large number
of commits to check through. I can't see anything that might cause this
from a quick look, so I was hoping someone might have an idea.
For now I've had to add a static arp entry marked permanent to prevent
it misbehaving, but that has stopped working at least once so far.
My ability to debug this is also limited, as the box is part of a live
network where the problem obviously causes disruption, and I can't
recreate it in a lab with identical configurations.
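Since poking at the live box is limited, one passive option is to pull
the relevant kernel messages out of syslog and line their timestamps up
against routing events. Below is a minimal sketch in Python, assuming
the log lines look exactly like the excerpts above; the function name
find_arp_events and the regular expression are illustrative only and
not part of any OpenBSD tooling.

```python
import re

# Message prefixes copied from the kernel log excerpts in this report.
ARP_PREFIXES = ("arp_rtrequest: bad gateway value", "arpresolve:")

# Assumed syslog layout: "Mon D HH:MM:SS host /bsd: message".
LINE_RE = re.compile(
    r"^(\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+(\S+)\s+/bsd:\s+(.*)$"
)


def find_arp_events(lines):
    """Return (timestamp, host, message) for each kernel ARP message."""
    events = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            # Not a kernel line (e.g. "last message repeated N times").
            continue
        timestamp, host, message = match.groups()
        if message.startswith(ARP_PREFIXES):
            events.append((timestamp, host, message))
    return events


if __name__ == "__main__":
    sample = [
        "Dec 8 06:33:03 edge-pe-2 /bsd: arp_rtrequest: bad gateway value: em0",
        "Dec 8 06:33:03 edge-pe-2 last message repeated 2 times",
        "Dec 8 06:41:41 edge-pe-2 /bsd: arpresolve: unresolved and rt_expire == 0",
    ]
    for timestamp, host, message in find_arp_events(sample):
        print(timestamp, host, message)
```

This only reads text it is handed, so it can be run against a copy of
the log on another machine without touching the router.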
Any pointers appreciated!
Cheers