On Tue, Mar 27, 2018 at 06:21:54PM -0700, David Ahern wrote:
> I wanted to revisit how resource overload is handled for hardware offload
> of FIB entries and rules. At the moment, the in-kernel fib notifier can
> tell a driver about a route or rule add, replace, and delete, but the
> notifier can not affect the action. Specifically, in the case of mlxsw
> if a route or rule add is going to overflow the ASIC resources the only
> recourse is to abort hardware offload. Aborting offload is akin to taking
> down the switch as the path from data plane to the control plane simply
> can not support the traffic bandwidth of the front panel ports. Further,
> the current state of FIB notifiers is inconsistent with other resources
> where a driver can affect a user request - e.g., enslavement of a port
> into a bridge or a VRF.
>
> As a result of the work done over the past 3+ years, I believe we are
> at a point where we can bring consistency to the stack and offloads,
> and reliably allow the FIB notifiers to fail a request, pushing an error
> along with a suitable error message back to the user. Rather than
> aborting offload when the switch is out of resources, userspace is simply
> prevented from adding more routes and has a clear indication of why.

Nice work, David. Ran various tests and didn't see any regressions. I know you already know this, but for the record, we plan to add accounting to KVD hash resources, which will eventually allow us to return errors when resources are exceeded.
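
To make the accounting idea concrete, here is a rough userspace sketch of what "fail the request instead of aborting offload" could look like. It is not mlxsw code and not the FIB notifier API: demo_extack, demo_hw_table, demo_route_add and demo_route_del are hypothetical names made up for illustration, the table capacity is arbitrary, and -ENOSPC is only an assumed choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for an extended-ack style error message. */
struct demo_extack {
	const char *msg;
};

/* Hypothetical model of a fixed-size hardware table. */
struct demo_hw_table {
	unsigned int capacity;	/* entries the device can hold */
	unsigned int used;	/* entries currently accounted for */
};

/*
 * Account for one new route. When the table is full, reject the
 * request with an error and a message rather than aborting offload.
 */
int demo_route_add(struct demo_hw_table *t, struct demo_extack *extack)
{
	if (t->used >= t->capacity) {
		extack->msg = "hardware route table is full";
		return -ENOSPC;
	}
	t->used++;
	return 0;
}

/* Release the accounting for one previously added route. */
void demo_route_del(struct demo_hw_table *t)
{
	assert(t->used > 0);
	t->used--;
}
```

With a capacity of two, the third demo_route_add() call would return an error and set the message, and the first two routes stay accounted for, which is the behaviour the cover letter asks for: userspace is told why the add failed and existing offloaded state is left alone.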