Hi Andi,

Andi Kleen <[EMAIL PROTECTED]> writes:
> > Please note first that I want to address physical failures by
> > the failover-capable network devices, which are increasingly
> > becoming important as Xen-based VM systems are getting popular.
> > Reducing a single-point-of-failure (physical device) is vital on
> > such VM systems.
> 
> Just you typically still have lots of other single points of failures in 
> a single system, some of them quite less reliable than your typical
> NIC. But at least it gives impressive demos when pulling ethernet cables @)

Indeed :-)


> > If TCP retransmission misses the time frame between event #1 and
> > #3 in Background above (between 20 and 30sec since network
> > failure), a failure causes the system-level failover where the
> > network-device-level failover should be enough.
> 
> You should probably make sure that the device ends up returning the
> right NET_XMIT_* code for such drops to TCP, in particular
> NET_XMIT_DROP. This might require slight driver interface
> changes. Also right now it only affects the congestion window, I think, 
> it  might be reasonable to let it affect the timer backoff too.

Well, I don't think that would help.

Your suggestion, to use the NET_XMIT_* code returned from the
underlying layer, is acted on in tcp_transmit_skb.

But my problem is that tcp_transmit_skb is not called for a
certain period of time.  So I'm suggesting capping the RTO value
so that tcp_transmit_skb gets called more frequently.
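
To illustrate the timing argument, here is a rough user-space sketch
(not kernel code) of how exponential backoff can step over a short
window, and how a cap on the RTO changes that. The helper names
next_rto_ms and retransmits_in_window are hypothetical, and the
model is simplified: the RTO just doubles after every timeout and
time is measured from the last transmission before the failure.

```c
#include <assert.h>

/* Hypothetical helper: double the RTO after a timeout, clamped to a cap. */
static unsigned next_rto_ms(unsigned rto_ms, unsigned rto_max_ms)
{
    unsigned doubled = rto_ms * 2;

    return doubled > rto_max_ms ? rto_max_ms : doubled;
}

/*
 * Hypothetical helper: count the retransmissions that fire inside
 * [win_start_ms, win_end_ms), given an initial RTO and a cap.
 * Simplified model, assumed for illustration only.
 */
static int retransmits_in_window(unsigned rto_ms, unsigned rto_max_ms,
                                 unsigned win_start_ms, unsigned win_end_ms)
{
    unsigned t_ms = 0;   /* time of the next retransmission */
    int hits = 0;

    assert(rto_ms > 0 && rto_ms <= rto_max_ms);
    assert(win_start_ms <= win_end_ms);

    for (;;) {
        t_ms += rto_ms;              /* timer expires, segment is resent */
        if (t_ms >= win_end_ms)
            break;                   /* window already closed */
        if (t_ms >= win_start_ms)
            hits++;                  /* resent while the window is open */
        rto_ms = next_rto_ms(rto_ms, rto_max_ms);
    }
    return hits;
}
```

Under these assumptions, an initial RTO of 1000 ms with a 120000 ms
cap fires at 1, 3, 7, 15 and 31 sec, so
retransmits_in_window(1000, 120000, 20000, 30000) returns 0.  With
an assumed 4000 ms cap the timer also fires at 23 and 27 sec, and
the same call returns 2.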

Does it make sense, Andi?

Regards,

-- 
OBATA Noboru ([EMAIL PROTECTED])