On Fri, Sep 28, 2018 at 08:44:57AM -0700, dsah...@kernel.org wrote:
> From: David Ahern <dsah...@gmail.com>
>
> There are many use cases where a user wants to influence what is
> returned in a dump for some rtnetlink command: one is wanting data
> for a different namespace than the one in which the request is received,
> and another is limiting the amount of data returned in the dump to a
> specific set of interest to userspace, reducing the cpu overhead of
> both kernel and userspace. Unfortunately, the kernel has historically
> not been strict with checking for the proper header or checking the
> values passed in the header. This lenient implementation has allowed
> iproute2 and other packages to pass any struct or data in the dump
> request as long as the family is the first byte. For example, ifinfomsg
> struct is used by iproute2 for all generic dump requests - links,
> addresses, routes and rules when it is really only valid for link
> requests.
>
> There is one example where the kernel deals with the wrong struct: link
> dumps after VF support was added. Older iproute2 was sending rtgenmsg as
> the header instead of ifinfomsg so a patch was added to try and detect
> old userspace vs new:
> e5eca6d41f53 ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")
>
> The latest example is Christian's patch set wanting to return addresses for
> a target namespace. It guesses the header struct is an ifaddrmsg and if it
> guesses wrong a netlink warning is generated in the kernel log on every
> address dump which is unacceptable.
>
> Another example where the kernel is a bit lenient is route dumps: iproute2
> can send a request with either an ifinfomsg or an rtmsg as the header
> struct, yet the kernel always treats the header as an rtmsg (see
> inet_dump_fib and rtm_flags check).
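
To make the "family is the first byte" point concrete, here is a small
userspace sketch (not taken from the patch set). It uses the uapi headers
to check at compile time that the four header structs mentioned above all
start with the family byte, even though their sizes and remaining fields
differ. The helper example_family_of() is a hypothetical name of mine; it
only mimics a lenient handler that reads the family and ignores the rest.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/rtnetlink.h>   /* struct rtgenmsg, struct ifinfomsg, struct rtmsg */
#include <linux/if_addr.h>     /* struct ifaddrmsg */

/* All four dump headers begin with the address family byte. */
static_assert(offsetof(struct rtgenmsg, rtgen_family) == 0, "rtgenmsg family first");
static_assert(offsetof(struct ifinfomsg, ifi_family) == 0, "ifinfomsg family first");
static_assert(offsetof(struct ifaddrmsg, ifa_family) == 0, "ifaddrmsg family first");
static_assert(offsetof(struct rtmsg, rtm_family) == 0, "rtmsg family first");

/*
 * Hypothetical helper: read only the first byte of whatever header was
 * sent, the way a lenient dump handler effectively does. It cannot tell
 * which struct the sender actually meant.
 */
unsigned char example_family_of(const void *hdr)
{
	unsigned char family;

	memcpy(&family, hdr, sizeof(family));
	return family;
}
```

Since only the first byte is common, anything a handler reads beyond it is
a guess about which struct userspace sent.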
>
> How to resolve the problem of not breaking old userspace yet be able to
> move forward with new features such as kernel side filtering which are
> crucial for efficient operation at high scale?
>
> This patch set addresses the problem by adding a new netlink flag,
> NLM_F_DUMP_PROPER_HDR, that userspace can set to say "I have a clue, and
> I am sending the right header struct" and that the struct fields and any
> attributes after it should be used for filtering the data returned in the
> dump.
>
> Kernel side, the dump handlers are updated to check every field in the
> header struct and all attributes passed. Only ones where filtering is
> implemented are allowed to be set. Any other values cause the dump to fail
> with EINVAL. If the new flag is honored by the kernel and the dump contents
> adjusted by any data passed in the request, the dump handler sets the
> NLM_F_DUMP_FILTERED flag in the netlink message header.
>
> This is an RFC set with the address handlers updated. If the approach is
> acceptable, then I will do the same to the other rtnetlink dump handlers.
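
For illustration, the checking described above might look roughly like the
following for an address dump. This is a userspace sketch under stated
assumptions, not the code from the patches: EXAMPLE_NLM_F_DUMP_PROPER_HDR
uses a placeholder bit value, struct example_addr_filter and
example_check_addr_dump() are hypothetical names, and treating ifa_index
as the only filterable header field is my assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/if_addr.h>

/* Placeholder value only; the real bit is whatever the patch set defines. */
#define EXAMPLE_NLM_F_DUMP_PROPER_HDR 0x80

#ifndef NLM_F_DUMP_FILTERED
#define NLM_F_DUMP_FILTERED 0x20	/* fallback for older uapi headers */
#endif

/* Hypothetical filter state extracted from the request header. */
struct example_addr_filter {
	int ifindex;	/* 0 means "no device filter" */
};

/*
 * Validate the header of an address dump request.
 * Returns 0 on success or -EINVAL if the sender claimed a proper header
 * but set a field for which no filtering is implemented.
 */
int example_check_addr_dump(uint16_t req_flags, const struct ifaddrmsg *ifm,
			    struct example_addr_filter *filter,
			    uint16_t *reply_flags)
{
	assert(ifm && filter && reply_flags);

	filter->ifindex = 0;
	*reply_flags = 0;

	/* Legacy userspace: keep the old lenient behaviour, trust nothing. */
	if (!(req_flags & EXAMPLE_NLM_F_DUMP_PROPER_HDR))
		return 0;

	/* Assumed: none of these fields has a filter, so they must be 0. */
	if (ifm->ifa_prefixlen || ifm->ifa_flags || ifm->ifa_scope)
		return -EINVAL;

	/* Assumed: a non-zero index restricts the dump to one device. */
	if (ifm->ifa_index) {
		filter->ifindex = (int)ifm->ifa_index;
		*reply_flags |= NLM_F_DUMP_FILTERED;
	}
	return 0;
}
```

The point of the sketch is the split: without the flag nothing beyond the
family is interpreted, and with it every field is either used as a filter
or has to be zero.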

I like the idea and I think this might be a good solution to this problem.
If we can agree on this approach instead of mine, I'm all for it!

Thanks, David!
Christian

>
>
> David Ahern (5):
>   net/netlink: Pass extack to dump callbacks
>   net/ipv6: Refactor address dump to push inet6_fill_args to
>     in6_dump_addrs
>   netlink: introduce NLM_F_DUMP_PROPER_HDR flag
>   net/ipv4: Update inet_dump_ifaddr to support NLM_F_DUMP_PROPER_HDR
>   net/ipv6: Update inet6_dump_addr to support NLM_F_DUMP_PROPER_HDR
>
>  include/linux/netlink.h      |   2 +
>  include/uapi/linux/netlink.h |   1 +
>  net/core/rtnetlink.c         |   1 +
>  net/ipv4/devinet.c           |  52 +++++++++++++++++-----
>  net/ipv6/addrconf.c          | 101 +++++++++++++++++++++++++++++--------------
>  net/netlink/af_netlink.c     |   1 +
>  6 files changed, 114 insertions(+), 44 deletions(-)
>
> --
> 2.11.0
>