We have encountered an issue resulting from commit 2724680bceee ("neigh: Keep 
neighbour cache entries if number of them is small enough."), which allows 
stale entries to remain in the neigh table indefinitely if the total number of 
entries is less than gc_thresh1.

This issue arises if:
- a stale entry has existed for a long time, so it has a sufficiently old 
neigh->confirmed value
- the neighbour itself has since changed MAC address
- we then try to ping the neighbour

When we ping the neighbour, the entry moves into NUD_DELAY as expected. But 
then, within neigh_timer_handler(), the jiffies comparison 
time_before_eq(now, neigh->confirmed + NEIGH_VAR(neigh->parms, 
DELAY_PROBE_TIME)) incorrectly returns true, and the entry is erroneously 
moved to NUD_REACHABLE. The entry then stays stuck in this state, even though 
the neighbour is not actually reachable at the cached MAC address.
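
To illustrate the comparison above, here is a minimal userspace sketch. It 
assumes that the kernel's time_before_eq(a, b) reduces to a signed test on 
(b - a), and it models a 32-bit jiffies counter with uint32_t. The names 
model_time_before_eq() and delay_check_passes() are hypothetical helpers for 
this example, not kernel functions, and the 1250-jiffy delay used in the note 
below assumes a 5 second delay_first_probe_time at HZ=250.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model (assumption): time_before_eq(a, b) is true when the
 * signed difference (b - a) is non-negative. With a 32-bit counter the
 * result is only meaningful while a and b are less than 2^31 ticks apart.
 * The unsigned-to-signed conversion relies on two's complement behaviour,
 * which common compilers provide.
 */
static bool model_time_before_eq(uint32_t a, uint32_t b)
{
    return (int32_t)(uint32_t)(b - a) >= 0;
}

/*
 * Hypothetical helper mirroring the NUD_DELAY check described above:
 * "was the entry confirmed within the last delay_probe_time jiffies?"
 */
static bool delay_check_passes(uint32_t now, uint32_t confirmed,
                               uint32_t delay_probe_time)
{
    return model_time_before_eq(now, (uint32_t)(confirmed + delay_probe_time));
}
```

With confirmed = 0 and a delay of 1250 jiffies, delay_check_passes() is true 
at now = 1000, false at now = 10000, and true again at 
now = 0x80000000 + 2000, which models the stale entry being treated as 
recently confirmed.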

The necessary age of neigh->confirmed for this to occur depends on the 
platform. It occurs after approximately 100 days on a 32-bit platform with 
HZ=250.
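
The 100-day figure is consistent with half the range of a 32-bit counter: a 
signed 32-bit difference can only order two timestamps that are less than 
2^31 jiffies apart. A small sketch of that arithmetic follows; 
days_until_misorder() is a hypothetical helper name, and the calculation 
assumes a 32-bit jiffies counter.

```c
/*
 * Age, in days, beyond which a signed 32-bit jiffies comparison can no
 * longer order two timestamps correctly (2^31 ticks at the given HZ).
 */
static double days_until_misorder(unsigned int hz)
{
    const double half_range = 2147483648.0; /* 2^31 jiffies */
    const double seconds_per_day = 86400.0;

    return half_range / (double)hz / seconds_per_day;
}
```

For HZ=250 this gives roughly 99.4 days, and for HZ=1000 roughly 24.9 days.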

We have worked around this by setting gc_thresh1 = 0, which effectively undoes 
commit 2724680bceee.

I would like to know if anyone else has observed this or has an alternative 
solution.

Kind regards,
Ash
