On Sat, May 9, 2020 at 8:32 AM Cholerae Hu <cholerae...@gmail.com> wrote:
>
> I'm maintaining a highly-loaded proxy-like service, which serves huge amount
> of small rpc requests every day. Yesterday I profiled it, and found that
> runtime.netpoll took 8.5% cpu(runtime.mcall took 20% cpu).
>
> There is only one global epoll fd in runtime, but every P will call netpoll.
> Inside kernel, a fd list, a rbtree and a lock will be associated to one epoll
> fd, so concurrent netpoll calls from many Ps may result in lock contention
> and low cache locality I guess.
>
> Can we do the same optimization of timer to netpoller, to make epoll fd per
> P, let each P polls on its own epoll fd first and steals ready fds from other
> Ps if it has no work to do?
If epoll contention really is a problem, then I think it would be
simpler to avoid contention in the runtime package by calling netpoll
less often. While we could theoretically have a different epoll FD
per P, I think the stealing requirements would be painful to
implement.

In any case the first step is to prove whether kernel contention on
the epoll descriptor is a problem.

Ian