From: Jakub Kicinski <jakub.kicin...@netronome.com>
Date: Fri, 20 Jul 2018 21:14:39 -0700

> After device is stopped we reset the rings by moving all free buffers
> to positions [0, cnt - 2], and clear the position cnt - 1 in the ring.
> We then proceed to clear the read/write pointers.  This means that if
> we try to reset the ring again the code will assume that the next to
> fill buffer is at position 0 and swap it with cnt - 1.  Since we
> previously cleared position cnt - 1 it will lead to leaking the first
> buffer and leaving ring in a bad state.
> 
> This scenario can only happen if FW communication fails, in which case
> the ring will never be used again, so the fact it's in a bad state will
> not be noticed.  Buffer leak is the only problem.  Don't try to move
> buffers in the ring if the read/write pointers indicate the ring was
> never used or have already been reset.
> 
> nfp_net_clear_config_and_disable() is now fully idempotent.
> 
> Found by code inspection, FW communication failures are very rare,
> and reconfiguring a live device is not common either, so it's unlikely
> anyone has ever noticed the leak.
> 
> Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com>
> Reviewed-by: Dirk van der Merwe <dirk.vanderme...@netronome.com>

Applied.

> This is arguably net material but IMHO the risk of me missing something
> this could break is higher than the error actually occurring, and a
> page leak on a FW communication error doesn't seem like it's worth
> it at -rc6 time..  I'm happy to respin if I'm wrong!

Agreed, net-next is more appropriate for this.

Thanks.

Reply via email to