On Thu, Feb 14, 2019 at 4:39 PM Willem de Bruijn
<willemdebruijn.ker...@gmail.com> wrote:
>
> On Thu, Feb 14, 2019 at 3:15 PM Cong Wang <xiyou.wangc...@gmail.com> wrote:
> >
> > Hello,
> >
> > While looking into the busy polling in Linux kernel, three questions
> > come into my mind:
> >
> > 1. In the document[1], it claims sysctl.net.busy_poll depends on
> > either SO_BUSY_POLL or sysctl.net.busy_read. However, from the code in
> > ep_set_busy_poll_napi_id(), I don't see such a dependency. It simply
> > checks sysctl_net_busy_poll and sk->sk_napi_id, but sk->sk_napi_id is
> > always set as long as we enable CONFIG_NET_RX_BUSY_POLL. So what I am
> > missing here?
>
> That documentation refers to sock_poll. This does call sk_busy_loop
> individually on each socket in the pollset and thus respects those values.
> Epoll was added later, after both sock_poll and that documentation.

Ah, yeah, this explains my confusion. I thought busy_poll refers to all
polling related syscalls, that is select()/poll()/epoll(), it looks like
epoll() is so special here. Probably we need some clarification in
net.txt.

>
> > 2. Why there is no socket option for sysctl.net.busy_poll? Clearly
> > sysctl_net_busy_poll is global and SO_BUSY_POLL only works for
> > sysctl.net.busy_read.
>
> I guess because of how sock_poll works. In that case it is not needed.
> The poll duration applies more to the pollset than any of the
> individual sockets, too.

Good point, it's probably like struct eventpoll vs. struct epitem.

The reason why I am looking for a per-socket tuning is to minimize
the impact of setting busy_poll. I don't know if it is possible to
somehow make this per-socket via epoll interfaces, perhaps fundamentally
it is impossible?

>
> > 3. How is SO_INCOMING_NAPI_ID supposed to be used? I can't find any
> > useful documents online. Any example or more detailed doc?
>
> From the commit message of 6d4339028b35 ("net: Introduce
> SO_INCOMING_NAPI_ID") it sounds like a sharding mechanism that
> maintains flow affinity by sharding based on rxqueue (assuming that
> something like RSS was used to ensure flow affinity in the first
> place).

That commit message is the only thing I can find too. I kinda need a
formal documentation in man page and hopefully an example too.

Thanks for your explanations!
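
For what it is worth, here is a minimal sketch of how user space might read SO_INCOMING_NAPI_ID and turn it into a worker index, following the sharding idea from that commit message. It is not taken from the kernel tree or any man page. The helper names get_incoming_napi_id() and shard_for_napi_id() are hypothetical, the modulo mapping is only one possible policy, and the fallback value 56 is an assumption that holds for asm-generic/socket.h but not for every architecture.

```c
#include <assert.h>
#include <sys/socket.h>

/* Assumption: older libc headers may not define this option. 56 is the
 * asm-generic value; a few architectures use a different number. */
#ifndef SO_INCOMING_NAPI_ID
#define SO_INCOMING_NAPI_ID 56
#endif

/* Hypothetical helper: read the NAPI ID the kernel recorded for the
 * last packet received on fd.
 * Returns 0 on success, -1 on error (errno is set by getsockopt()).
 * On error *napi_id is left untouched. An ID of 0 means the kernel
 * has no NAPI ID for this socket. */
static int get_incoming_napi_id(int fd, unsigned int *napi_id)
{
    unsigned int id = 0;
    socklen_t len = sizeof(id);

    if (getsockopt(fd, SOL_SOCKET, SO_INCOMING_NAPI_ID, &id, &len) < 0)
        return -1;

    *napi_id = id;
    return 0;
}

/* Hypothetical policy: map a NAPI ID onto one of nworkers threads so
 * that all sockets fed by the same rx queue land on the same worker.
 * A real application might keep an explicit NAPI-ID-to-worker table. */
static unsigned int shard_for_napi_id(unsigned int napi_id,
                                      unsigned int nworkers)
{
    assert(nworkers > 0);
    return napi_id % nworkers;
}
```

A plausible use would be to call get_incoming_napi_id() on a freshly accepted connection once data has arrived, then hand the fd to the worker chosen by shard_for_napi_id(), so that each worker's epoll set only contains sockets from one rx queue.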