> On Aug 19, 2016, at 10:10 AM, Richard James Salts <post...@spectralmud.org> wrote:
>
> It sounds like similar behaviour to what postfix is logging, so at least you
> have a way to replicate it now. Try checking netstat -antp | grep :7777 and
> see what state all the tcp sockets are in. If you're seeing a lot in SYN
> state it means that your python process has been too busy to process the
> information from the kernel. If you're seeing a lot in TIME_WAIT it might be
> that the rate of connections is too high and you're running out of
> 127.0.0.1:source port -> 127.0.0.1:7777 combinations. This obviously won't
> solve the problem but will give you an idea of what's happening.

On the production server, my policy server was originally running under the standard Python interpreter. After the discussion in this thread, I thought it was too slow to process those requests, so I now run it with PyPy to (hopefully) get better performance. Unfortunately, the issue is still the same.

While I was seeing lots of "Connection timed out" and "Connection reset by peer" errors, I ran this command repeatedly:

    netstat -antp | grep :7777 | awk '{print $6}' | sort | uniq -c | sort -nr

The output of three successive runs:

     45 ESTABLISHED
     38 SYN_SENT
     12 SYN_RECV
      1 LISTEN
      1 FIN_WAIT2
      1 CLOSE_WAIT
    -----------------------
     56 SYN_SENT
     44 ESTABLISHED
     10 SYN_RECV
      5 TIME_WAIT
      1 LISTEN
      1 FIN_WAIT2
      1 CLOSE_WAIT
    ------------------------
     48 SYN_SENT
     44 ESTABLISHED
     10 SYN_RECV
      4 TIME_WAIT
      1 LISTEN
      1 FIN_WAIT2
      1 CLOSE_WAIT
    ----------------------------
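
In case it is useful for logging these counts from a script rather than by hand, here is a minimal Python sketch of the same tally. The function names (`count_tcp_states`, `print_snapshot`) are hypothetical, not from any library, and the parsing assumes the usual Linux net-tools column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State). Unlike `grep :7777`, it matches only addresses that end in `:7777`, so the numbers could differ slightly from the pipeline above.

```python
import subprocess
from collections import Counter


def count_tcp_states(netstat_output, port=7777):
    """Tally TCP states for sockets whose local or remote address ends in :port.

    Assumes Linux net-tools layout: Proto Recv-Q Send-Q Local Foreign State ...
    """
    suffix = ":%d" % port
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip banner/header rows and anything that is not a TCP socket line.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if local.endswith(suffix) or remote.endswith(suffix):
            counts[state] += 1
    return counts


def print_snapshot(port=7777):
    """Run `netstat -ant` once and print the states, most frequent first."""
    out = subprocess.check_output(["netstat", "-ant"]).decode()
    for state, n in count_tcp_states(out, port).most_common():
        print("%7d %s" % (n, state))
```

Calling `print_snapshot()` in a loop with a short sleep would give a running record like the samples above. As a hedged reading of such output: a SYN_SENT count that stays high on loopback would be consistent with the listener not accepting connections fast enough, but that is an assumption to verify, not something the counts prove on their own.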