Re: High load average under 1.8 with multiple draining processes

Samuel Reed Mon, 15 Jan 2018 06:15:52 -0800

Thank you for the patches and your quick attention to this issue. Here are
the results after a few reloads, with 8 threads on a 16-core machine. Both
the draining process and the new process are running with the patches applied.


New process:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.24    0.432348          16     26917           epoll_wait
  3.60    0.016158          16      1023         7 recvfrom
  0.12    0.000524          31        17           sendto
  0.04    0.000190           0      1126         3 write
  0.01    0.000036           1        70        23 read
  0.00    0.000000           0        21           close
  0.00    0.000000           0         7           socket
  0.00    0.000000           0         7         7 connect
  0.00    0.000000           0        13           sendmsg
  0.00    0.000000           0        17           setsockopt
  0.00    0.000000           0         7           fcntl
  0.00    0.000000           0         9           epoll_ctl
  0.00    0.000000           0         5         5 accept4
------ ----------- ----------- --------- --------- ----------------
100.00    0.449256                 29239        45 total


Draining process:

% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 78.26    0.379045          16     23424           epoll_wait
 13.94    0.067539           7      9877         4 recvfrom
  7.80    0.037764           4     10471         6 write
  0.00    0.000007           0        29        10 read
  0.00    0.000000           0         9           close
  0.00    0.000000           0         5           sendto
  0.00    0.000000           0         3           shutdown
  0.00    0.000000           0        20           epoll_ctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.484355                 43838        20 total


I ran this a few times while both processes were live and the numbers
weren't significantly different. The new process still spends a remarkably
high proportion of its syscall time in epoll_wait (about 96% above).


On 1/15/18 7:48 AM, Christopher Faulet wrote:
> On 12/01/2018 at 18:51, Willy Tarreau wrote:
>> On Fri, Jan 12, 2018 at 11:06:32AM -0600, Samuel Reed wrote:
>>> On 1.8-git, similar results on the new process:
>>>
>>> % time     seconds  usecs/call     calls    errors syscall
>>> ------ ----------- ----------- --------- --------- ----------------
>>>   93.75    0.265450          15     17805           epoll_wait
>>>    4.85    0.013730          49       283           write
>>>    1.40    0.003960          15       266        12 recvfrom
>>>    0.01    0.000018           0        42        12 read
>>>    0.00    0.000000           0        28           close
>>>    0.00    0.000000           0        12           socket
>>>    0.00    0.000000           0        12        12 connect
>>>    0.00    0.000000           0        19         1 sendto
>>>    0.00    0.000000           0        12           sendmsg
>>>    0.00    0.000000           0         6           shutdown
>>>    0.00    0.000000           0        35           setsockopt
>>>    0.00    0.000000           0         7           getsockopt
>>>    0.00    0.000000           0        12           fcntl
>>>    0.00    0.000000           0        13           epoll_ctl
>>>    0.00    0.000000           0         2         2 accept4
>>> ------ ----------- ----------- --------- --------- ----------------
>>> 100.00    0.283158                 18554        39 total
>>>
>>> A cursory look through the strace output shows the same picture, with
>>> the same three types as in the last email, including the cascade.
>>
>> OK thank you for testing. On Monday we'll study this with Christopher.
>>
>
> Hi Samuel,
>
> Here are 2 patches that may solve your problem. The idea is to set the
> poller timeout to 0 for a specific thread only when some processing
> is expected for this thread. This was already done for tasks and
> applets using bitfields. Now we do the same for the FDs.
>
> For now, we don't know whether it addresses your problem, but it should
> keep a thread from looping for nothing. Could you check whether it works,
> please?
>
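
For anyone following the thread, here is how I understand the idea behind
the patches. This is only a minimal sketch and not the actual HAProxy code:
all names below (pending_task_mask, pending_fd_mask, mark_fd_work,
clear_fd_work, poll_timeout_ms) are made up for illustration, and I am
assuming at most 64 threads so that a single 64-bit word can hold one bit
per thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical per-thread "work pending" bitfields: bit N belongs to thread N. */
static _Atomic uint64_t pending_task_mask; /* tasks/applets waiting to run */
static _Atomic uint64_t pending_fd_mask;   /* FDs with events left to process */

/* Return the bit that represents thread `tid` (assumes tid < 64). */
static uint64_t thread_bit(unsigned tid)
{
    assert(tid < 64);
    return UINT64_C(1) << tid;
}

/* Record that thread `tid` has FD processing pending. */
static void mark_fd_work(unsigned tid)
{
    atomic_fetch_or(&pending_fd_mask, thread_bit(tid));
}

/* Record that thread `tid` has no FD processing left. */
static void clear_fd_work(unsigned tid)
{
    atomic_fetch_and(&pending_fd_mask, ~thread_bit(tid));
}

/*
 * Pick the poller timeout for thread `tid`: 0 (do not block) only when
 * this thread's own bit is set in one of the masks, otherwise the delay
 * until its next timer, passed in as `next_timer_ms`.
 */
static int poll_timeout_ms(unsigned tid, int next_timer_ms)
{
    uint64_t pending = atomic_load(&pending_task_mask) |
                       atomic_load(&pending_fd_mask);

    return (pending & thread_bit(tid)) ? 0 : next_timer_ms;
}
```

With this shape, a thread whose bit is clear would block in epoll_wait until
its next timer instead of returning immediately because some other thread
has work pending, which should show up as far fewer epoll_wait calls.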
