Jeroen van Aart wrote:
> Wietse Venema wrote:
>> It's unproductive to kill off Postfix under overload. At the very
>> least you should increase your 35-second deadline.
>
> Yes I did increase it to 120 seconds. I understand just killing and
> restarting postfix is not a solution.
>
> As a test I switched the monitor to sending an alert and not restart.
> The mail.info logs show nothing very useful. The, possibly,
> noteworthy things around the time postfix quit are:
>
> Aug 15 02:55:06 prod101 postfix/master[9402]: warning: process /usr/lib/postfix/qmgr pid 9582 exit status 1

This is the real problem.
Search the logs just before this entry for a qmgr[9582] line matching (error|fatal|panic). As Wietse noted in the past (http://marc.info/?l=postfix-users&m=98096446416645&w=2), such a process should log the reason before it exits with status 1.
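
If grepping by hand is awkward, the search can be sketched in a few lines of Python. This is only an illustration: the function name find_daemon_failures is made up here, the first sample line is a hypothetical stand-in for whatever qmgr actually logged, and the log path in the closing comment is an assumption based on the "mail.info" file mentioned above.

```python
import re

def find_daemon_failures(lines, daemon="qmgr", pid=9582):
    """Return log lines where postfix/<daemon>[<pid>] logged error, fatal, or panic."""
    pattern = re.compile(
        r"postfix/%s\[%d\]: (error|fatal|panic):" % (re.escape(daemon), pid)
    )
    return [line.rstrip("\n") for line in lines if pattern.search(line)]

# Sample input. The first line is HYPOTHETICAL (the real qmgr message is
# unknown); the second is the master warning quoted in this thread.
sample = [
    "Aug 15 02:55:05 prod101 postfix/qmgr[9582]: fatal: (hypothetical message)",
    "Aug 15 02:55:06 prod101 postfix/master[9402]: warning: process "
    "/usr/lib/postfix/qmgr pid 9582 exit status 1",
]

matches = find_daemon_failures(sample)
for line in matches:
    print(line)

# Against a real log (path is an assumption, adjust for your syslog setup):
#   with open("/var/log/mail.info") as fh:
#       for line in find_daemon_failures(fh):
#           print(line)
```

Only the qmgr line is reported. The master warning is skipped because it comes from master[9402] and merely relays the exit status.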