On Mon, Jan 15, 2018 at 08:14:40AM -0600, Samuel Reed wrote:
> Thank you for the patch and your quick attention to this issue. Results
> after a few reloads, 8 threads on 16 core machine, both draining and new
> process have patches.
>
> New process:
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  96.24    0.432348          16     26917           epoll_wait
>   3.60    0.016158          16      1023         7 recvfrom
>   0.12    0.000524          31        17           sendto
>   0.04    0.000190           0      1126         3 write
>   0.01    0.000036           1        70        23 read
>   0.00    0.000000           0        21           close
>   0.00    0.000000           0         7           socket
>   0.00    0.000000           0         7         7 connect
>   0.00    0.000000           0        13           sendmsg
>   0.00    0.000000           0        17           setsockopt
>   0.00    0.000000           0         7           fcntl
>   0.00    0.000000           0         9           epoll_ctl
>   0.00    0.000000           0         5         5 accept4
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.449256                 29239        45 total
>
>
> Draining process:
>
> % time     seconds  usecs/call     calls    errors syscall
> ------ ----------- ----------- --------- --------- ----------------
>  78.26    0.379045          16     23424           epoll_wait
>  13.94    0.067539           7      9877         4 recvfrom
>   7.80    0.037764           4     10471         6 write
>   0.00    0.000007           0        29        10 read
>   0.00    0.000000           0         9           close
>   0.00    0.000000           0         5           sendto
>   0.00    0.000000           0         3           shutdown
>   0.00    0.000000           0        20           epoll_ctl
> ------ ----------- ----------- --------- --------- ----------------
> 100.00    0.484355                 43838        20 total
>
>
> I ran this a few times while both processes were live and the numbers
> weren't significantly different. The new process still has a remarkably
> high proportion of epoll_wait.

Thank you, Samuel, for the test. It's sad, but it may indicate something
completely different. Christopher, I'm at least willing to integrate your
fix to rule out this corner case in the future.

Among the possible differences between an old and a new process, we can
enumerate only a few things, for example the peers, which work
differently for new and old processes. Do you use peers in your config?

It is also possible that we pass an FD corresponding to a more or less
closed listener or something like this. Do you reload with -x to pass FDs
across processes? Do you use master-worker?

I'm just trying to rule out a number of hypotheses. An anonymized version
of your config will definitely help here, I'm afraid.

Thanks!
Willy
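
For anyone wanting to compare the two summaries quoted above more directly, here is a minimal sketch that counts how many epoll_wait calls each process makes per I/O syscall. The helper names (`parse_strace_summary`, `epoll_per_io`) and the choice of which syscalls count as I/O are assumptions made for illustration; they are not part of strace or HAProxy. The sample rows are copied from the quoted tables.

```python
import re

# Rows copied from the two `strace -c` summaries quoted in this thread.
NEW_PROCESS = """
 96.24    0.432348          16     26917           epoll_wait
  3.60    0.016158          16      1023         7 recvfrom
  0.12    0.000524          31        17           sendto
  0.04    0.000190           0      1126         3 write
  0.01    0.000036           1        70        23 read
  0.00    0.000000           0        13           sendmsg
100.00    0.449256                 29239        45 total
"""

DRAINING_PROCESS = """
 78.26    0.379045          16     23424           epoll_wait
 13.94    0.067539           7      9877         4 recvfrom
  7.80    0.037764           4     10471         6 write
  0.00    0.000007           0        29        10 read
  0.00    0.000000           0         5           sendto
100.00    0.484355                 43838        20 total
"""

# Assumption: these are the syscalls treated as "real I/O" for the ratio.
IO_SYSCALLS = ("recvfrom", "read", "write", "sendto", "sendmsg")


def parse_strace_summary(text):
    """Return {syscall: call count} from the data rows of a `strace -c` table."""
    calls = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows have 5 fields (no errors) or 6 fields (with errors):
        # % time, seconds, usecs/call, calls, [errors], syscall.
        if len(fields) not in (5, 6):
            continue
        if not re.fullmatch(r"\d+\.\d+", fields[0]) or fields[-1] == "total":
            continue
        calls[fields[-1]] = int(fields[3])
    return calls


def epoll_per_io(calls):
    """Number of epoll_wait calls per I/O syscall (hypothetical metric)."""
    io_calls = sum(calls.get(name, 0) for name in IO_SYSCALLS)
    return calls["epoll_wait"] / io_calls


if __name__ == "__main__":
    for label, table in (("new", NEW_PROCESS), ("draining", DRAINING_PROCESS)):
        ratio = epoll_per_io(parse_strace_summary(table))
        print(f"{label}: {ratio:.2f} epoll_wait calls per I/O call")
```

With the quoted numbers, the new process wakes up from epoll_wait roughly an order of magnitude more often per I/O call than the draining one. That is consistent with the "remarkably high proportion of epoll_wait" noted above, although the ratio alone does not say which FD is causing the wakeups.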

