On 17 Aug 2016, at 11:34, Zhang Huangbin wrote:

Dear all,

I have a problem with my own Postfix policy server (written in Python). Postfix usually works fine with it, but sometimes it logs errors like this:

Aug 17 08:32:52 mail1 postfix/smtpd[24298]: warning: problem talking to server 127.0.0.1:1234: Connection reset by peer

"Connection reset by peer" means exactly what it says. That's your policy server unilaterally closing an existing connection from a Postfix smtpd process with PID 24298. It is also possible that your policy server died in some unexpected way and the OS closed the connection.

Aug 17 08:34:05 mail1 postfix/smtpd[24771]: warning: problem talking to server 127.0.0.1:1234: Connection timed out

That's a new Postfix smtpd process with PID 24771 trying and failing to open a new connection to the policy service. How long it will wait for the connection to succeed is controlled by smtpd_policy_service_timeout. Your config shows that to be 1000s, so that smtpd process tried to open that connection 16 minutes 40 seconds earlier, i.e. at 08:17:25.
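To make that arithmetic easy to check, here is a small Python sketch. The year is an assumption taken from the date of this message (syslog lines do not carry one), and the 1000s value is the timeout quoted from your config.

```python
from datetime import datetime, timedelta

# Timestamp of the "Connection timed out" warning.
# The year is assumed from the date of this thread; syslog omits it.
warning_logged = datetime(2016, 8, 17, 8, 34, 5)

# smtpd_policy_service_timeout as quoted from the poster's config.
timeout = timedelta(seconds=1000)

# The connection attempt must have started one full timeout earlier.
connect_started = warning_logged - timeout

print(timeout)                 # 0:16:40
print(connect_started.time())  # 08:17:25
```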

At the time Postfix raised these errors, my policy server was still working and properly processing requests (I checked its log file).

If the policy server did not log something at or shortly before 'Aug 17 08:32:52', and possibly even before 08:17:25, then it needs to do more extensive logging. The Postfix smtpd process with PID 24298 had an open connection to 127.0.0.1:1234 that was reset at that time. On Linux it is possible for the "OOM Killer" facility to kill a process under extreme memory pressure, and to configure the system to respawn a daemon, so it can *look* like a daemon stays up because it has been respawned shortly after being killed. Of course the PID of the policy server would change in that sort of circumstance.
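One cheap way to make a respawn visible is to have the policy server log its own PID and start time whenever it starts. The following is only a sketch; the function name startup_banner and the log format are made up for illustration and are not taken from your code.

```python
import logging
import os
import time


def startup_banner(pid=None, started=None):
    """Build a one-line startup record carrying the PID and start time.

    Both arguments are optional and only exist so the function can be
    exercised with fixed values; by default the live values are used.
    """
    pid = os.getpid() if pid is None else pid
    started = time.time() if started is None else started
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(started))
    return "policy server starting: pid=%d started=%s" % (pid, stamp)


if __name__ == "__main__":
    # %(process)d puts the PID on every later log line as well.
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(process)d %(message)s",
    )
    logging.info(startup_banner())
```

If two such startup lines with different PIDs show up in the log around 08:17 or 08:32, the server was restarted even though it looked like it stayed up.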

I don't know how to reproduce this issue, other than to wait for it (it happens randomly, mostly when the server is busy). Do you have any ideas or hints about how I can debug this issue, on the Postfix side, on my policy server side, or both?

Make the policy server log its actions in detail, at least until you can figure this out.
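As a rough illustration of what "in detail" could mean, the sketch below logs every request and every reply, plus a traceback if the decision logic fails. It assumes the usual policy delegation format of name=value lines ended by an empty line; the names log, parse_request, decide and handle_request are hypothetical, and "dunno" merely stands in for your real policy logic.

```python
import logging

log = logging.getLogger("policy")  # hypothetical logger name


def parse_request(raw):
    """Parse the 'name=value' lines of one policy request into a dict."""
    attrs = {}
    for line in raw.decode("utf-8", "replace").splitlines():
        if "=" in line:
            name, value = line.split("=", 1)
            attrs[name] = value
    return attrs


def decide(attrs):
    """Placeholder for the real policy decision."""
    return "dunno"


def handle_request(raw, peer):
    """Log one request and its reply, and return the reply bytes."""
    attrs = parse_request(raw)
    log.info(
        "request from %s: instance=%s client=%s sender=%s recipient=%s",
        peer,
        attrs.get("instance"),
        attrs.get("client_address"),
        attrs.get("sender"),
        attrs.get("recipient"),
    )
    try:
        action = decide(attrs)
    except Exception:
        # Record the full traceback instead of silently dropping the connection.
        log.exception("policy logic failed for %s", peer)
        action = "dunno"
    log.info("reply to %s: action=%s", peer, action)
    return ("action=%s\n\n" % action).encode()
```

Logging when each connection is accepted and closed, in the same style, would let you match the server's view against the smtpd PIDs and timestamps in the Postfix warnings.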
