Hi Wietse,

On 11.09.2014 at 13:49, Wietse Venema wrote:
> What is the distribution of DATA sizes before failure? In your
> example I see numbers around 3kB, 9kB, 12kB.

At the moment, I see these sizes:

- always exactly 17511 bytes from smtp-out-127-*.amazon.com (only 3 different
  hosts seem to be trying today)
- always exactly 49116 bytes from *.psi.cust-cluster.com (I've seen about 60
  different hosts from there today)
- always exactly 33290 bytes from mail18-*.srv2.de (about a dozen different
  hosts)
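
For reference, a minimal sketch of how such a tally could be pulled from the
mail log. It assumes the smtpd log lines look like "lost connection after DATA
(N bytes) from host[address]"; that format, the function name tally_lost_data
and the grouping by the last two DNS labels are assumptions for illustration,
not something taken from my actual setup.

```python
import re
from collections import Counter

# Assumed log format (not quoted in this message): Postfix smtpd lines such as
#   "... lost connection after DATA (17511 bytes) from host.example.com[192.0.2.1]"
LOST_RE = re.compile(
    r"lost connection after DATA \((\d+) bytes\) from (\S+?)\[([^\]]+)\]"
)


def tally_lost_data(lines):
    """Count (domain, size) pairs for connections lost during DATA."""
    counts = Counter()
    for line in lines:
        m = LOST_RE.search(line)
        if not m:
            continue  # not a "lost connection after DATA" line
        size, host = int(m.group(1)), m.group(2)
        # Fold cluster hosts together by keeping only the last two DNS labels.
        domain = ".".join(host.split(".")[-2:])
        counts[(domain, size)] += 1
    return counts
```

Fed with the lines of the mail log (the path depends on the syslog setup), a
size that repeats for one domain across many hosts points at the same message
being retried, rather than at a random cut-off point.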

It seems those are always the same 3 messages being re-tried constantly (when I 
look at them in the incoming queue folder, it's the same recipient and sender 
and the same message-ID, as far as I can tell). I have problems only with 
messages from these clusters, everything else seems unaffected (at least I 
haven't seen any "lost connection" messages from any other hosts as far as my 
logfiles go back).

Yesterday I had an additional message with exactly 17441 bytes on every try 
before failure from the Amazon-cluster. That one was finally delivered 
completely early this morning, and has since disappeared from the cycle.

FWIW, I have received a handful of messages from the Amazon-cluster that did 
not have any delays/problems yesterday and today, one of them even from one of 
the "problematic" hosts that can't deliver the other message.

> Some failures are triggered by packet content, and may be replaced
> only by replacing hardware that operates marginally. Does the problem
> go away when you
> 
> - Replace the server (either the network card or the whole box)
> 
> - Replace the cable that connects the server to the network switch
> 
> - Replace the network switch that the server is plugged into.
> 
> - Replace the cable that connects the switch to the router
> 
> - Replace the router
> 
> - And so on...
> 
> If you think this is a stupid idea, then you haven't been around
> long enough.

By no means do I think that's stupid. :)
I'm only doing this server stuff "for fun" in my spare time, but my real job is 
in microelectronics and hardware, so I've had my share of mysterious and 
seemingly unexplainable stuff (ISI, crosstalk, low-frequency jitter, ground 
bounce, ESD-induced phenomena, you know the drill...).

Problem is that this box is a rented root server in a data center somewhere, so 
I don't have access to the hardware to try any of that. I can contact support, 
but they of course charge you for everything they do, and as long as I haven't 
ruled out that the reason is just some stupid configuration mistake on my part 
(or a routing/filtering issue at my hosting provider, or Amazon, or...), I 
don't want to start replacing hardware, obviously...

Regards,
Sean
