Sean Durkin:
> Hi Wietse,
>
> On 11.09.2014 at 13:49, Wietse Venema wrote:
> > What is the distribution of DATA sizes before failure? In your
> > example I see numbers around 3kB, 9kB, 12kB.
>
> At the moment, I see these sizes:
>
> - always exactly 17511 bytes from smtp-out-127-*.amazon.com (today, seems to
>   be only 3 different hosts trying)
> - always exactly 49116 bytes from *.psi.cust-cluster.com (I've seen about 60
>   different hosts from there today)
> - always exactly 33290 bytes from mail18-*.srv2.de (about a dozen different
>   hosts)
>
> It seems those are always the same 3 messages being re-tried
> constantly (when I look at them in the incoming queue folder, it's
> the same recipient and sender and the same message-ID, as far as
> I can tell). I have problems only with messages from these clusters,
> everything else seems unaffected (at least I haven't seen any "lost
> connection" messages from any other hosts as far as my logfiles
> go back).
>
> Yesterday I had an additional message with exactly 17441 bytes on
> every try before failure from the Amazon-cluster. That one was
> finally delivered completely early this morning, and has since
> disappeared from the cycle.

That increases my suspicion of a data-dependent error - some marginal
cable/switch/router, perhaps some middle box with a memory bit error
that requires a power cycle to clear the problem. If the problem
is caused by a crosstalk defect, then only physical replacement will
solve it.

> Problem is that this box is a rented root server in a data center
> somewhere, so I don't have access to the hardware to try any of
> that. I can contact support, but they of course charge you for
> everything they do, and as long as I haven't ruled out that the
> reason is just some stupid configuration mistake on my part (or a
> routing/filtering issue at my hosting provider, or Amazon, or...),
> I don't want to start replacing hardware, obviously...

Try power cycling.

	Wietse
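
For anyone who wants to check the distribution of DATA sizes in their own logs, here is a minimal Python sketch. It assumes the log lines look like "lost connection after DATA (N bytes) from host[address]"; that pattern, the function name tally_data_sizes, and the log path in the usage note are assumptions for illustration, so adjust them to match what your system actually writes.

```python
import re
from collections import Counter, defaultdict

# Assumed log line shape; change the pattern if your maillog differs.
LOST_RE = re.compile(r"lost connection after DATA \((\d+) bytes\) from (\S+?)\[")

def tally_data_sizes(lines):
    """Return (sizes, hosts): how often each byte count was seen before a
    lost connection, and which client hostnames reported each byte count."""
    sizes = Counter()
    hosts = defaultdict(set)
    for line in lines:
        match = LOST_RE.search(line)
        if match is None:
            continue  # not a "lost connection after DATA" line
        nbytes = int(match.group(1))
        sizes[nbytes] += 1
        hosts[nbytes].add(match.group(2))
    return sizes, hosts
```

Typical use would be something like `sizes, hosts = tally_data_sizes(open("/var/log/mail.log"))` (the path is a guess and varies by system). A byte count that repeats exactly across many different hosts points at the same message failing at the same offset, which is what a data-dependent error would look like.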