Nigel Kukard wrote: >I'm unable to reproduce anything similar to this on our side, we have >some Policyd boxes in production which handle a very large number of >mails per day and maintain about 500 connections to a single MySQL >server. I've seen maybe one of those errors in Policyd. Can you maybe >shove Policyd into debug mode and see if it appears any query is getting >stuck? possibly may be an idea to tcpdump -w the traffic out and see if >there are any PSH's or repeated packet transmissions which may indicate >packet loss?
OK, loads of PSHs, but as I understand it, those are normal and the result of the application forcing any data to be sent. I still see no errors in the policyd logging. Looking through the packet capture (taken on the backend, and I suddenly realise it might have been useful to capture both ends at once), I reckon there are (using a bit of grep/sed/sort/uniq) 187 duplicated lines - ie everything except packet number and timestamp the same) from a capture of 8.4k packets. Just searching by port number from a list of duplicate packets, I did find this in the logs : 172.16.1.112 is the backend, 172.16.0.85 is one of the mail servers 8118 207.641643 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539235 TSER=40961195 8119 207.842502 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539286 TSER=40961195 8120 208.250402 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539388 TSER=40961195 8121 209.066162 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539592 TSER=40961195 8122 210.697905 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515540000 TSER=40961195 8123 213.961077 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515540816 TSER=40961195 8124 220.487679 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515542448 TSER=40961195 Which does seem a bit strange ! All the lines containing 53304 are : 5769 76.216021 172.16.0.85 -> 172.16.1.112 TCP 74 74 53304 > 10031 [SYN] Seq=0 Win=5840 Len=0 MSS=1460 SACK_PERM=1 TSV=515506372 TSER=0 WS=6 5770 76.216052 172.16.1.112 -> 172.16.0.85 TCP 74 74 10031 > 53304 [SYN, ACK] Seq=0 Ack=1 Win=5792 Len=0 MSS=1460 SACK_PERM=1 TSV=40961146 TSER=515506372 WS=6 5771 76.216310 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] Seq=1 Ack=1 Win=5888 Len=0 TSV=515506372 TSER=40961146 5772 76.216500 172.16.0.85 -> 172.16.1.112 TCP 609 609 53304 > 10031 [PSH, ACK] Seq=1 Ack=1 Win=5888 Len=543 TSV=515506372 TSER=40961146 5773 76.216518 172.16.1.112 -> 172.16.0.85 TCP 66 66 10031 > 53304 [ACK] Seq=1 Ack=544 Win=6912 Len=0 TSV=40961146 TSER=515506372 5781 76.413098 172.16.1.112 -> 172.16.0.85 TCP 80 80 10031 > 53304 [PSH, ACK] Seq=1 Ack=544 Win=6912 Len=14 TSV=40961195 TSER=515506372 5782 76.413345 172.16.1.112 -> 172.16.0.85 TCP 66 66 10031 > 53304 [FIN, ACK] Seq=15 Ack=544 Win=6912 Len=0 TSV=40961195 TSER=515506372 5783 76.413443 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] Seq=544 Ack=15 Win=5888 Len=0 TSV=515506421 TSER=40961195 5787 76.451696 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515506431 TSER=40961195 8118 207.641643 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539235 TSER=40961195 8119 207.842502 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539286 TSER=40961195 8120 208.250402 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539388 TSER=40961195 8121 209.066162 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515539592 TSER=40961195 8122 210.697905 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515540000 TSER=40961195 8123 213.961077 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515540816 TSER=40961195 8124 220.487679 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515542448 TSER=40961195 8344 233.539700 172.16.0.85 -> 172.16.1.112 TCP 66 66 53304 > 10031 [FIN, ACK] Seq=544 Ack=16 Win=5888 Len=0 TSV=515545712 TSER=40961195 Now I don't know if it's relevant at all, but as I read that, the backend is pushing remaining data and then closing the connection (5781, 5782), the mail server is ACKing the packets (5783, 5787), and then 5 minutes later the mail server is timing out and closing the connection. Seems odd since it's ACK'd a FIN packet earlier. If I transfer a 1/2G file with SCP, it just streams across between the machines (either way) at about 11.2MB/s which is about right for a 100mbps network connection. -- Simon Hobson Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed author Gladys Hobson. Novels - poetry - short stories - ideal as Christmas stocking fillers. Some available as e-books. _______________________________________________ Users mailing list [email protected] http://lists.policyd.org/mailman/listinfo/users
