On 9 May 2018 at 14:06, Joe Stringer <j...@wand.net.nz> wrote:
> This series proposes a new helper for the BPF API which allows BPF programs to
> perform lookups for sockets in a network namespace. This would allow programs
> to determine early on in processing whether the stack is expecting to receive
> the packet, and perform some action (eg drop, forward somewhere) based on this
> information.
>
> The series is structured roughly into:
> * Misc refactor
> * Add the socket pointer type
> * Add reference tracking to ensure that socket references are freed
> * Extend the BPF API to add sk_lookup() / sk_release() functions
> * Add tests/documentation
>
> The helper proposed in this series includes a parameter for a tuple which must
> be filled in by the caller to determine the socket to look up. The simplest
> case would be filling with the contents of the packet, ie mapping the packet's
> 5-tuple into the parameter. In common cases, it may alternatively be useful to
> reverse the direction of the tuple and perform a lookup, to find the socket
> that initiates this connection; and if the BPF program ever performs a form of
> IP address translation, it may further be useful to be able to look up
> arbitrary tuples that are not based upon the packet, but instead based on state
> held in BPF maps or hardcoded in the BPF program.
>
> Currently, access into the socket's fields are limited to those which are
> otherwise already accessible, and are restricted to read-only access.
>
> A few open points:
> * Currently, the lookup interface only returns either a valid socket or a NULL
>   pointer. This means that if there is any kind of issue with the tuple, such
>   as it provides an unsupported protocol number, or the socket can't be found,
>   then we are unable to differentiate these cases from one another. One natural
>   approach to improve this could be to return an ERR_PTR from the
>   bpf_sk_lookup() helper. This would be more complicated but maybe it's
>   worthwhile.

This suggestion would add a lot of complexity, and there aren't many
legitimately different error cases. There's:
* Unsupported socket type
* Cannot find netns
* Tuple argument is the wrong size
* Can't find socket

If we split the helpers into protocol-specific types, the first one would be
addressed. The last one is addressed by returning NULL. It seems like a
reasonable compromise to me to return NULL in the middle two cases as well,
and rely on the BPF writer to provide valid arguments.

> * No ordering is defined between sockets. If the tuple could find multiple
>   sockets, then it will arbitrarily return one. It is up to the caller to
>   handle this. If we wish to handle this more reliably in future, we could
>   encode an ordering preference in the flags field.

This doesn't need to be addressed in this series; there is scope for
addressing these cases when the use case arises.

> * Currently this helper is only defined for TC hook point, but it should also
>   be valid at XDP and perhaps some other hooks.

It's easy to add support for XDP on demand; the initial implementation doesn't
need it.
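
To make the NULL-return compromise concrete, here is a rough userspace sketch
in plain C. It is not kernel or BPF code, and every name in it (toy_tuple,
toy_sock, toy_sock_add, toy_lookup_tcp, the fixed-size table and the netns
limit) is hypothetical and made up for illustration; none of it comes from the
series. The lookup is protocol-specific, so "unsupported socket type" cannot
occur, and the remaining three cases all collapse into NULL:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for a lookup tuple and a socket. */
struct toy_tuple {
    uint32_t saddr;
    uint32_t daddr;
    uint16_t sport;
    uint16_t dport;
};

struct toy_sock {
    int in_use;
    int netns_id;
    struct toy_tuple tuple;
};

#define TOY_MAX_SOCKS 8
#define TOY_MAX_NETNS 2   /* assumption: only netns ids 0 and 1 exist */

static struct toy_sock toy_socks[TOY_MAX_SOCKS];

/* Register a socket in the toy table; returns 0 on success, -1 if full. */
static int toy_sock_add(int netns_id, const struct toy_tuple *t)
{
    for (size_t i = 0; i < TOY_MAX_SOCKS; i++) {
        if (!toy_socks[i].in_use) {
            toy_socks[i].in_use = 1;
            toy_socks[i].netns_id = netns_id;
            toy_socks[i].tuple = *t;
            return 0;
        }
    }
    return -1;
}

static int toy_tuple_equal(const struct toy_tuple *a, const struct toy_tuple *b)
{
    return a->saddr == b->saddr && a->daddr == b->daddr &&
           a->sport == b->sport && a->dport == b->dport;
}

/*
 * TCP-only lookup. Wrong tuple size, unknown netns and "no such socket"
 * are deliberately indistinguishable: all three return NULL.
 */
static const struct toy_sock *toy_lookup_tcp(const void *tuple,
                                             size_t tuple_size, int netns_id)
{
    const struct toy_tuple *t = tuple;

    assert(tuple != NULL);  /* precondition: caller passes a real tuple */

    if (tuple_size != sizeof(struct toy_tuple))
        return NULL;        /* tuple argument is the wrong size */
    if (netns_id < 0 || netns_id >= TOY_MAX_NETNS)
        return NULL;        /* cannot find netns */

    for (size_t i = 0; i < TOY_MAX_SOCKS; i++) {
        if (toy_socks[i].in_use && toy_socks[i].netns_id == netns_id &&
            toy_tuple_equal(&toy_socks[i].tuple, t))
            return &toy_socks[i];
    }
    return NULL;            /* can't find socket */
}
```

A caller only ever needs a single NULL check on the result, which is the
simplicity being argued for here. The cost is that a bad size or netns looks
the same as a miss, so it is on the caller to pass valid arguments.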