On 2020-04-13, Claudio Jeker <cje...@diehard.n-r-g.com> wrote:
> On Mon, Apr 13, 2020 at 02:08:31PM +0200, Remi Locherer wrote:
>> On Mon, Apr 13, 2020 at 12:05:10PM +0100, Richard Chivers wrote:
>> > On Mon, 13 Apr 2020, 10:18 Remi Locherer, <remi.loche...@relo.ch> wrote:
>> > >
>> > > On Mon, Apr 13, 2020 at 08:38:31AM +0100, Richard Chivers wrote:
>> > > > We have been having a strange issue, whereby OSPF stops updating
>> > > > properly.
>> > > >
>> > > > We can see an entry for an ip route in the database but it is not in
>> > > > the kernel routing table, and when it is the DR, other routers then
>> > > > do not have the route at all.
>> > > >
>> > > > We are seeing this across multiple boxes. We have 10+ ospf speakers,
>> > > > and seem to see the issue at different times.
>> > > >
>> > > > The problem starts with:
>> > > >
>> > > > ospfd[6960]: recv_db_description: neighbor ID x.x.x.x: seq num mismatch, bad flags
>> > >
>> > > The neighbor sent a db desc with the master flag set differently than
>> > > what this ospfd instance recorded before for that particular neighbor.
>> > >
>> > > See 2nd last item on page 100 of RFC 2328:
>> > > https://tools.ietf.org/html/rfc2328#page-100
>> > 
>> > 
>> > Thanks. Should the routers then just recover from this scenario, even if
>> > it was caused by lost packets, a CPU pause, etc.?
>> 
>> I think so. But it may take quite a while. It might also be a bug in ospfd
>> or in another implementation.

On my 6.6/current boxes it seems to recover fairly quickly from this (30
seconds or so). I've definitely seen it take a long time in the past though.
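
For reference, the flag check behind the "seq num mismatch, bad flags"
message can be sketched roughly as below. This is a minimal illustration of
the first two Exchange-state checks in RFC 2328 section 10.6, not ospfd's
actual code: the function name dd_flags_mismatch and its arguments are made
up for the example, and only the bit values come from the RFC (A.3.3).

```python
# Flag bits of an OSPF Database Description packet (RFC 2328, A.3.3).
DD_MS = 0x01  # Master/Slave bit: set when the sender claims to be master
DD_M = 0x02   # More bit: more DD packets follow
DD_I = 0x04   # Init bit: first packet of the DD exchange


def dd_flags_mismatch(pkt_flags, local_is_master):
    """Return a reason string if the flags of a DD packet received in
    Exchange state should raise SeqNumberMismatch, else None.

    Hypothetical helper for illustration only (not taken from ospfd).
    """
    sender_claims_master = bool(pkt_flags & DD_MS)
    # Exactly one side may be master. If both sides claim the same role,
    # the MS-bit disagrees with what we recorded for this neighbor.
    if sender_claims_master == local_is_master:
        return "MS-bit inconsistent with master/slave state"
    # The I-bit is only valid at the very start of the exchange.
    if pkt_flags & DD_I:
        return "unexpected I-bit"
    return None
```

A neighbor that restarts its side of the exchange (for example after lost
packets) would typically send a packet with I and MS set again, which is the
kind of input this check rejects.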

> Since these issues happen with 5.8 and 6.4 ospfd, I would suggest updating
> to at least 6.6 (especially the 5.8). IIRC there was some issue with ospfd
> neighbor selection that caused trouble when sessions flapped. This was
> fixed some time ago but I doubt 5.8 has that fix in.

That one was fixed in 6.3.

If you also run bgpd, be aware that the version in the 6.6 release has
crashes. They are fixed in syspatches (and of course in snapshots), but one
of the crashes happens at startup with some configurations, and it's hard to
run syspatch if you have no routing ;) So either be ready to cope with that
in case you run into it (e.g. pre-download the syspatch directory and make
sure you have console access), or consider skipping 6.6 (go straight to a
-current snapshot).

