Thank you for the patch and your quick attention to this issue. Here are the results after a few reloads, with 8 threads on a 16-core machine. Both the draining process and the new process have the patches applied.

New process:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.24    0.432348          16     26917           epoll_wait
  3.60    0.016158          16      1023         7 recvfrom
  0.12    0.000524          31        17           sendto
  0.04    0.000190           0      1126         3 write
  0.01    0.000036           1        70        23 read
  0.00    0.000000           0        21           close
  0.00    0.000000           0         7           socket
  0.00    0.000000           0         7         7 connect
  0.00    0.000000           0        13           sendmsg
  0.00    0.000000           0        17           setsockopt
  0.00    0.000000           0         7           fcntl
  0.00    0.000000           0         9           epoll_ctl
  0.00    0.000000           0         5         5 accept4
------ ----------- ----------- --------- --------- ----------------
100.00    0.449256                 29239        45 total

Draining process:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.26    0.379045          16     23424           epoll_wait
 13.94    0.067539           7      9877         4 recvfrom
  7.80    0.037764           4     10471         6 write
  0.00    0.000007           0        29        10 read
  0.00    0.000000           0         9           close
  0.00    0.000000           0         5           sendto
  0.00    0.000000           0         3           shutdown
  0.00    0.000000           0        20           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.484355                 43838        20 total

I ran this a few times while both processes were live and the numbers weren't significantly different. The new process still has a remarkably high proportion of epoll_wait.

On 1/15/18 7:48 AM, Christopher Faulet wrote:
> On 12/01/2018 at 18:51, Willy Tarreau wrote:
>> On Fri, Jan 12, 2018 at 11:06:32AM -0600, Samuel Reed wrote:
>>> On 1.8-git, similar results on the new process:
>>>
>>> % time     seconds  usecs/call     calls    errors syscall
>>> ------ ----------- ----------- --------- --------- ----------------
>>>  93.75    0.265450          15     17805           epoll_wait
>>>   4.85    0.013730          49       283           write
>>>   1.40    0.003960          15       266        12 recvfrom
>>>   0.01    0.000018           0        42        12 read
>>>   0.00    0.000000           0        28           close
>>>   0.00    0.000000           0        12           socket
>>>   0.00    0.000000           0        12        12 connect
>>>   0.00    0.000000           0        19         1 sendto
>>>   0.00    0.000000           0        12           sendmsg
>>>   0.00    0.000000           0         6           shutdown
>>>   0.00    0.000000           0        35           setsockopt
>>>   0.00    0.000000           0         7           getsockopt
>>>   0.00    0.000000           0        12           fcntl
>>>   0.00    0.000000           0        13           epoll_ctl
>>>   0.00    0.000000           0         2         2 accept4
>>> ------ ----------- ----------- --------- --------- ----------------
>>> 100.00    0.283158                 18554        39 total
>>>
>>> Cursory look through the strace output looks the same, with the same
>>> three types as in the last email, including the cascade.
>>
>> OK thank you for testing. On Monday we'll study this with Christopher.
>>
>
> Hi Samuel,
>
> Here are 2 patches that may solve your problem. Idea is to set the
> poller timeout to 0 for a specific thread only when some processing
> are expected for this thread. The job was already done for tasks and
> applets using bitfields. Now, we do the same for the FDs.
>
> For now, we don't know if it aims your problem, but it should avoid a
> thread to loop for nothing. Could you check if it works please ?
>
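
For anyone following the thread, here is a rough sketch of the idea described in the quoted message, as I understand it: a thread passes a timeout of 0 to the poller only when its own bit is set in one of the per-thread bitfields (tasks, applets, and now FDs). All names below (thread_mask_t, active_tasks_mask, active_applets_mask, fd_cache_mask, poller_timeout) are hypothetical and chosen for illustration only; they are not taken from the patches or from the HAProxy source.

```c
/* Illustrative sketch only. Every identifier here is hypothetical and
 * does not come from the HAProxy source or from the patches.
 */
typedef unsigned long thread_mask_t;

/* One bit per thread: set when that thread has work pending. */
thread_mask_t active_tasks_mask;    /* threads with runnable tasks   */
thread_mask_t active_applets_mask;  /* threads with active applets   */
thread_mask_t fd_cache_mask;        /* threads with FDs to process   */

/* Return the timeout (in ms) a thread should pass to the poller.
 * tid_bit is the calling thread's bit; next_wakeup_ms is the delay
 * until its next timer would expire.
 */
int poller_timeout(thread_mask_t tid_bit, int next_wakeup_ms)
{
    thread_mask_t pending = active_tasks_mask
                          | active_applets_mask
                          | fd_cache_mask;

    /* Do not block in the poller if this thread has work waiting. */
    if (pending & tid_bit)
        return 0;

    /* Otherwise sleep until the next timer instead of spinning. */
    return next_wakeup_ms;
}
```

With fd_cache_mask set to 1UL << 2, poller_timeout(1UL << 2, 1000) returns 0 while poller_timeout(1UL << 5, 1000) returns 1000, so only the thread that actually has an FD to process skips the wait. If a single shared flag were used instead of a per-thread mask, every thread would poll with a zero timeout, which would be consistent with the epoll_wait counts above, though that is only my guess at the cause.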

