Hi Sean,
> Meanwhile, I've managed to record a tcpdump of such a failed session.
> What exactly am I looking for there?
I remember a possibly similar situation back in 2008... the culprit was
a Cisco ASA firewall with outdated firmware that corrupted the TCP SACK
fields and thereby caused the remote side to send an RST.
Anyway, on our end the connection seemed to starve, just as you describe
it.
We detected that by comparing tcpdumps taken on both affected ends. Of
course, we were lucky that this happened with a business partner whose
competent IT people we could get hold of. They spotted the problem and
also temporarily switched the feature off on their side to prove that
it really was the cause.
A firmware upgrade on my client's firewall then fixed the issue.
With a server hosted somewhere and incoming connections from big
clusters, you might not be as lucky as that...
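As for what to look for in the dump: a rough sketch, not a finished tool.
It assumes both captures were printed as text with "tcpdump -S"
(absolute sequence numbers) and that the lines look like the usual
"Flags [R]" and "sack 1 {1461:2921}" output. The function names are made
up for illustration. It lists RST segments and reports SACK blocks that
appear on one end but not on the other. Any difference is only a hint to
inspect by hand, since some middleboxes rewrite sequence numbers on
purpose.

```python
import re

# Matches a tcpdump flag field containing R (RST), e.g. "Flags [R]" or "Flags [R.]"
RST_RE = re.compile(r"Flags \[[^\]]*R[^\]]*\]")
# Matches a SACK option such as "sack 2 {4381:5841}{1461:2921}" (assumed format)
SACK_RE = re.compile(r"sack \d+ ((?:\{\d+:\d+\}\s*)+)")
BLOCK_RE = re.compile(r"\{(\d+):(\d+)\}")


def summarize(lines):
    """Return (lines carrying an RST, list of (left, right) SACK blocks)."""
    rst_lines = []
    sack_blocks = []
    for line in lines:
        if RST_RE.search(line):
            rst_lines.append(line.strip())
        match = SACK_RE.search(line)
        if match:
            for left, right in BLOCK_RE.findall(match.group(1)):
                sack_blocks.append((int(left), int(right)))
    return rst_lines, sack_blocks


def sack_mismatch(local_lines, remote_lines):
    """Return SACK blocks seen in only one of the two captures, sorted."""
    _, local_blocks = summarize(local_lines)
    _, remote_blocks = summarize(remote_lines)
    return sorted(set(local_blocks) ^ set(remote_blocks))
```

Feed it the text output from each end, e.g.
sack_mismatch(open("local.txt"), open("remote.txt")), where the two file
names are placeholders for your own dumps.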
best regards,
-hannes