From: Christian Brauner <christ...@brauner.io>
Date: Mon, 8 Oct 2018 13:04:13 +0200

> On Sun, Oct 07, 2018 at 08:16:21PM -0700, David Ahern wrote:
>> From: David Ahern <dsah...@gmail.com>
>>
>> There are many use cases where a user wants to influence what is
>> returned in a dump for some rtnetlink command: one is wanting data
>> for a different namespace than the one in which the request is
>> received and another is limiting the amount of data returned in the
>> dump to a specific set of interest to userspace, reducing the cpu
>> overhead of both kernel and userspace. Unfortunately, the kernel has
>> historically not been strict with checking for the proper header or
>> checking the values passed in the header. This lenient implementation
>> has allowed iproute2 and other packages to pass any struct or data in
>> the dump request as long as the family is the first byte. For example,
>> ifinfomsg struct is used by iproute2 for all generic dump requests -
>> links, addresses, routes and rules when it is really only valid for
>> link requests.
>>
>> There is one example where the kernel deals with the wrong struct: link
>> dumps after VF support was added. Older iproute2 was sending rtgenmsg as
>> the header instead of ifinfomsg so a patch was added to try and detect
>> old userspace vs new:
>> e5eca6d41f53 ("rtnetlink: fix userspace API breakage for iproute2 < v3.9.0")
>>
>> The latest example is Christian's patch set wanting to return addresses for
>> a target namespace. It guesses the header struct is an ifaddrmsg and if it
>> guesses wrong a netlink warning is generated in the kernel log on every
>> address dump which is unacceptable.
>>
>> Another example where the kernel is a bit lenient is route dumps: iproute2
>> can send a request with either an ifinfomsg or an rtmsg as the header
>> struct, yet the kernel always treats the header as an rtmsg (see
>> inet_dump_fib and rtm_flags check). The header inconsistency impacts the
>> ability to add kernel side filters for route dumps - a necessary feature
>> for scale setups with 100k+ routes.
>>
>> How to resolve the problem of not breaking old userspace yet be able to
>> move forward with new features such as kernel side filtering which are
>> crucial for efficient operation at high scale?
>>
>> This patch set addresses the problem by adding a new socket flag,
>> NETLINK_DUMP_STRICT_CHK, that userspace can use with setsockopt to
>> request strict checking of headers and attributes on dump requests and
>> hence unlock the ability to use kernel side filters as they are added.
 ...
> At this point it's all nits so it's got my ACK but keener eyes than mine
> might see other issues.
>
> Acked-by: Christian Brauner <christ...@brauner.io>

Series applied, thanks everyone.

Please be on the lookout for userspace regressions from this patch set.

Thanks.
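
For anyone checking their own tools against the new behaviour, here is a
rough userspace sketch of what the cover letter describes: opt in to strict
checking with setsockopt, then send an address dump whose header really is
an ifaddrmsg rather than an ifinfomsg. This is not taken from the patches.
The helper names and struct addr_dump_req are made up for illustration, the
fallback value 12 for the option is an assumption, and my understanding is
that some headers spell the option NETLINK_GET_STRICT_CHK, so check your
kernel's <linux/netlink.h> before relying on any of it.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/if_addr.h>

/* Assumption: installed headers may not carry the name used in this
 * series. Fall back to the other spelling if present, else to 12,
 * which is a guess that must be verified against <linux/netlink.h>. */
#ifndef NETLINK_DUMP_STRICT_CHK
# ifdef NETLINK_GET_STRICT_CHK
#  define NETLINK_DUMP_STRICT_CHK NETLINK_GET_STRICT_CHK
# else
#  define NETLINK_DUMP_STRICT_CHK 12
# endif
#endif

#ifndef SOL_NETLINK
#define SOL_NETLINK 270
#endif

/* Hypothetical helper: ask the kernel to strictly validate dump
 * requests on this netlink socket. Returns 0 or a negative errno. */
int enable_strict_dump_check(int fd)
{
	int one = 1;

	if (setsockopt(fd, SOL_NETLINK, NETLINK_DUMP_STRICT_CHK,
		       &one, sizeof(one)) < 0)
		return -errno;
	return 0;
}

/* Hypothetical request layout: an address dump carries an ifaddrmsg
 * right after the netlink header, not an ifinfomsg. */
struct addr_dump_req {
	struct nlmsghdr nlh;
	struct ifaddrmsg ifa;
};

/* Fill in an RTM_GETADDR dump request. Every header field other than
 * the family is left zeroed. */
void build_addr_dump_req(struct addr_dump_req *req, unsigned char family)
{
	memset(req, 0, sizeof(*req));
	req->nlh.nlmsg_len = NLMSG_LENGTH(sizeof(struct ifaddrmsg));
	req->nlh.nlmsg_type = RTM_GETADDR;
	req->nlh.nlmsg_flags = NLM_F_REQUEST | NLM_F_DUMP;
	req->ifa.ifa_family = family;
}
```

A caller would open socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE), call
enable_strict_dump_check() on it, and treat a negative return as "kernel
does not support the option" and fall back to the old lenient behaviour
before sending the request built above.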