From: Jakub Kicinski <jakub.kicin...@netronome.com> Date: Fri, 20 Jul 2018 21:14:39 -0700
> After device is stopped we reset the rings by moving all free buffers > to positions [0, cnt - 2], and clear the position cnt - 1 in the ring. > We then proceed to clear the read/write pointers. This means that if > we try to reset the ring again the code will assume that the next to > fill buffer is at position 0 and swap it with cnt - 1. Since we > previously cleared position cnt - 1 it will lead to leaking the first > buffer and leaving ring in a bad state. > > This scenario can only happen if FW communication fails, in which case > the ring will never be used again, so the fact it's in a bad state will > not be noticed. Buffer leak is the only problem. Don't try to move > buffers in the ring if the read/write pointers indicate the ring was > never used or have already been reset. > > nfp_net_clear_config_and_disable() is now fully idempotent. > > Found by code inspection, FW communication failures are very rare, > and reconfiguring a live device is not common either, so it's unlikely > anyone has ever noticed the leak. > > Signed-off-by: Jakub Kicinski <jakub.kicin...@netronome.com> > Reviewed-by: Dirk van der Merwe <dirk.vanderme...@netronome.com> Applied. > This is arguably net material but IMHO the risk of me missing something > this could break is higher than the error actually occurring, and a > page leak on a FW communication error doesn't seem like it's worth > it at -rc6 time.. I'm happy to respin if I'm wrong! Agreed, net-next is more appropriate for this. Thanks.