Hey again,

On 10/29/2010 07:23 PM, Wietse Venema wrote:
> The main loop in the master is as follows:
> 
> forever {
>     set an alarm for 1000s
>     do an EPOLL_WAIT for up to 500s and handle any child process
>       events, or short-term timer requests that are implemented
>       around the EPOLL_WAIT timer.
>     respond to sighup (the sighup flag is set by a signal handler)
>     respond to sigchld (the sigchld flag is set by a signal handler)
> }
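
To check that I read the loop correctly, here is a minimal sketch of it in Python. This is not the actual Postfix master code: the names `run_once`, `flags`, and `install_handlers` are made up for illustration, and what the real master does when the alarm fires is an assumption on my part. `selectors.DefaultSelector` uses epoll on Linux.

```python
import selectors
import signal

ALARM_SECONDS = 1000  # watchdog alarm, as in the quoted loop
WAIT_SECONDS = 500    # upper bound for a single event wait

# Flags set by the signal handlers and inspected by the loop.
flags = {"sighup": False, "sigchld": False}


def _on_sighup(signum, frame):
    flags["sighup"] = True


def _on_sigchld(signum, frame):
    flags["sigchld"] = True


def _on_alarm(signum, frame):
    # Assumption: treat a fired alarm as "the event wait is stuck".
    raise RuntimeError("alarm fired: event wait did not return in time")


def install_handlers():
    """Install the handlers (must be called from the main thread)."""
    signal.signal(signal.SIGHUP, _on_sighup)
    signal.signal(signal.SIGCHLD, _on_sigchld)
    signal.signal(signal.SIGALRM, _on_alarm)


def run_once(selector, wait_seconds=WAIT_SECONDS, alarm_seconds=ALARM_SECONDS):
    """One iteration of the loop; returns (event count, handled flags)."""
    # Re-arming the alarm replaces any alarm that is still pending.
    signal.alarm(alarm_seconds)

    # Wait for I/O events, bounded by wait_seconds. Note: Python retries
    # the wait after a signal handler returns normally, so the flags
    # below are only checked once the wait itself has returned.
    events = selector.select(timeout=wait_seconds)
    for key, _mask in events:
        key.data(key.fileobj)  # key.data holds the registered callback

    handled = []
    for name in ("sighup", "sigchld"):
        if flags[name]:
            flags[name] = False
            handled.append(name)
    return len(events), handled
```

If the sketch is right, a trace of a healthy master should show the event wait returning at least every 500s, and the alarm being re-armed on every pass.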

Just now one machine had the issue again. I checked and saw that we
were down to just two smtpd processes and, even though master was still
bound to port 25, no new connections were accepted. I tried to telnet to
it, but the connection was not accepted and eventually timed out.

How does the timer issue relate to the master process no longer
accepting TCP/IP connections on port 25?


> It would be worthwhile to see what strace reports when you leave
> it running. If strace reports nothing in 500s then EPOLL_WAIT is
> not working. If strace reports nothing after 1000s then the alarm
> timer is also not working.

I'll try to gather some strace data for you. I assume the strace should
be of the master process? Could you give me a hint about which options
you would like me to use?


On 10/29/2010 07:04 PM, Wietse Venema wrote:
> VMware has an entire KB article on problems with delivering timer
> interrupts to guest machines, and the hoops that they are jumping
> through to avoid poor performance. See
> http://tech.groups.yahoo.com/group/postfix-users/message/269786

Thanks for the hint. I have already printed that article to read over
the weekend.





Thanks for your help,


Christian








