Ilpo Järvinen wrote:
On Thu, 12 Jul 2007, Rick Jones wrote:


One question is why the RTO gets so large that it limits failover?

If Linux TCP is working correctly, RTO should be srtt + 2*rttvar.

So either there is a huge srtt or variance, or something is going
wrong with RTT estimation.  Given some reasonable maximums of
srtt = 500 ms and rttvar = 250 ms, that would cause RTO to be 1 second.
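
For reference, the arithmetic in the example above can be checked with a short sketch. It assumes the formula exactly as quoted in the message (srtt + 2*rttvar); the function name rto_ms and the parameter k are made up for illustration and are not kernel identifiers. The multiplier is left adjustable because the standard computation in RFC 6298 uses 4*RTTVAR rather than 2.

```python
def rto_ms(srtt_ms, rttvar_ms, k=2):
    """Retransmission timeout in milliseconds: srtt + k * rttvar.

    k=2 follows the formula as quoted in the message; RFC 6298 uses k=4.
    """
    return srtt_ms + k * rttvar_ms


# The example values from the message: srtt = 500 ms, rttvar = 250 ms.
print(rto_ms(500, 250))
```

With the quoted values this gives 1000 ms, matching the 1 second figure in the message.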

I suspect that what is happening here is that a link goes down in a trunk
somewhere for some number of seconds, resulting in a given TCP segment being
retransmitted several times, with the doubling of the RTO each time.


But that's a back-off for the retransmissions, the doubling is temporary... Once you return to normal conditions, the accumulated backoff multiplier will be immediately cut back to normal. So you should then be back to 1 second (like in the example or whatever) again...

Fine, but so? I suspect the point of the patch is to provide a lower cap on the accumulated backoff so data starts flowing over the connection within that lower cap once the link is restored/failed-over.
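
A rough sketch of the effect being discussed, under stated assumptions: the timeout doubles on every retransmission of the same segment and is clamped to a maximum. The function backoff_schedule and the 8 second cap are hypothetical and only for illustration; 120 seconds is the value of TCP_RTO_MAX in Linux. This is not the patch itself.

```python
def backoff_schedule(base_rto_s, retries, cap_s):
    """Return the timeout (seconds) used before each successive retransmission.

    The timeout starts at base_rto_s, doubles after every retransmission,
    and is never allowed to exceed cap_s.
    """
    timeouts = []
    rto = base_rto_s
    for _ in range(retries):
        timeouts.append(min(rto, cap_s))
        rto *= 2
    return timeouts


# Base RTO of 1 second, eight retransmissions of the same segment.
print(backoff_schedule(1, 8, 120))  # clamped at 120 s (TCP_RTO_MAX)
print(backoff_schedule(1, 8, 8))    # clamped at a hypothetical 8 s cap
```

If the link comes back just after a retransmission was sent, nothing more is sent until the current timeout expires. After eight retransmissions that wait can be up to 120 seconds with the default clamp, but at most 8 seconds with the lower cap.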

rick jones
