> > > 2. Why there is no socket option for sysctl.net.busy_poll? Clearly
> > > sysctl_net_busy_poll is global and SO_BUSY_POLL only works for
> > > sysctl.net.busy_read.
> >
> > I guess because of how sock_poll works. In that case it is not needed.
> > The poll duration applies more to the pollset than any of the
> > individual sockets, too.
> >
> Good point, it's probably like struct eventpoll vs. struct epitem.
>
> The reason why I am looking for a per-socket tuning is to minimize
> the impact of setting busy_poll. I don't know if it is possible to somehow
> make this per-socket via epoll interfaces, perhaps fundamentally
> it is impossible?

I think it may be possible. The way busy_read and busy_poll work in
sock_poll is that the sum of all (per-socket tunable) busy_read
durations on the sockets in the pollset is roughly bounded by the
(global) busy_poll.

The epoll implementation is restricted in the sense that it polls on
only one napi_id at a time. Alongside setting ep->napi_id in
ep_set_busy_poll_napi_id, we could also set a new ep field that takes
the min of the global busy_poll and sk->sk_ll_usec.

Though I guess you want to be able to poll on a given pollset without
setting the global sysctl_net_busy_poll at all? That would be a useful
feature both for epoll and poll/select. But it definitely requires
refining net_busy_loop_on() to optionally take some state derived from
the sockets in the (e)pollset.
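
To make the "new ep field" idea concrete, here is a rough userspace
model of it, not a patch. Everything prefixed model_ is hypothetical,
including the busy_poll_usecs field; the structs only mirror the names
mentioned above (ep->napi_id, sk->sk_ll_usec), and the real
ep_set_busy_poll_napi_id has conditions this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins, not the kernel's definitions. */
struct model_sock {
	unsigned int napi_id;	/* models sk->sk_napi_id */
	unsigned int ll_usec;	/* models sk->sk_ll_usec (set via SO_BUSY_POLL) */
};

struct model_eventpoll {
	unsigned int napi_id;		/* models ep->napi_id */
	unsigned int busy_poll_usecs;	/* hypothetical new per-ep budget */
};

static unsigned int min_uint(unsigned int a, unsigned int b)
{
	return a < b ? a : b;
}

/*
 * Models the suggestion: when the ep records the socket's napi_id,
 * also record min(global busy_poll, per-socket sk_ll_usec).
 */
static void model_ep_set_busy_poll(struct model_eventpoll *ep,
				   const struct model_sock *sk,
				   unsigned int global_busy_poll)
{
	assert(ep != NULL && sk != NULL);
	ep->napi_id = sk->napi_id;
	ep->busy_poll_usecs = min_uint(global_busy_poll, sk->ll_usec);
}

/*
 * Models a refined net_busy_loop_on() that looks at state derived
 * from the pollset instead of only at the global sysctl.
 */
static int model_busy_loop_on(const struct model_eventpoll *ep)
{
	assert(ep != NULL);
	return ep->busy_poll_usecs != 0;
}
```

Note that with the global value at 0 the min is 0 as well, so this
variant still needs the sysctl to be set. Polling without touching the
global setting would need a different rule than a plain min.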