On Sun, 2008-08-31 at 12:00 -0700, Richard Elling wrote:
>     2. The algorithm *must* be computationally efficient.
>        We are looking down the tunnel at I/O systems that can
>        deliver on the order of 5 Million iops.  We really won't
>        have many (any?) spare cycles to play with.

If you pick the constants carefully (powers of two) you can do the TCP
RTT + variance estimation using only a handful of shifts, adds, and
subtracts.
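
A minimal sketch of what that could look like, in the style of the classic TCP estimator with gains of 1/8 and 1/4. The struct and function names here are made up for illustration, and the time unit is whatever the caller measures in; nothing below is taken from ZFS or any existing driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical estimator state.  Both fields are stored pre-scaled so
 * the update needs no multiply or divide. */
struct rtt_est {
    int32_t srtt8;    /* smoothed RTT, scaled by 8 */
    int32_t rttvar4;  /* mean deviation, scaled by 4 */
};

/* Seed from the first measurement: srtt = sample, rttvar = sample / 2. */
static void rtt_init(struct rtt_est *e, int32_t sample)
{
    assert(sample >= 0);
    e->srtt8 = sample << 3;
    e->rttvar4 = (sample >> 1) << 2;
}

/* Fold in one new measurement:
 *   srtt   += (sample - srtt) / 8
 *   rttvar += (|sample - srtt| - rttvar) / 4
 * Because the fields are pre-scaled, each line is a shift plus an add. */
static void rtt_update(struct rtt_est *e, int32_t sample)
{
    int32_t err;

    assert(sample >= 0);
    err = sample - (e->srtt8 >> 3);
    e->srtt8 += err;
    if (err < 0)
        err = -err;
    err -= e->rttvar4 >> 2;
    e->rttvar4 += err;
}

/* Timeout = srtt + 4 * rttvar; rttvar4 already holds 4 * rttvar. */
static int32_t rtt_timeout(const struct rtt_est *e)
{
    return (e->srtt8 >> 3) + e->rttvar4;
}
```

The per-completion cost is one call to rtt_update(): two shifts, a few adds and subtracts, and one branch for the absolute value.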

> In both of these cases, the solutions imply multi-minute timeouts are
> required to maintain a stable system.  

Again, there are different uses for timeouts:
 1) how long should we wait on an ordinary request before deciding to
try "plan B" and go elsewhere (a la B_FAILFAST)
 2) how long should we wait (while trying all alternatives) before
declaring an overall failure and giving up.

The RTT estimation approach is really only suitable for the former,
where you have some alternatives available (retransmission in the case
of TCP; trying another disk in the case of mirrors, etc.).

When you've tried all the alternatives and nobody's responding, there's
no substitute for just retrying for a long time.
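
To make the two kinds of timeout concrete, here is a rough sketch of how a retry loop might keep them separate. Every name in it is hypothetical; the short timeout is assumed to come from something like an RTT estimate, and the long one is assumed to be a fixed, generous limit.

```c
#include <stdint.h>

/* Hypothetical outcomes for a request that has not completed yet. */
enum io_action { IO_WAIT, IO_TRY_ALTERNATIVE, IO_GIVE_UP };

/* request_elapsed:   time since the current attempt was issued
 * total_elapsed:     time since the first attempt was issued
 * failfast_timeout:  short, adaptive "plan B" timeout (use 1 above)
 * overall_timeout:   long limit before declaring failure (use 2 above)
 * alternatives_left: untried mirrors or paths remaining */
static enum io_action io_decide(int64_t request_elapsed,
                                int64_t total_elapsed,
                                int64_t failfast_timeout,
                                int64_t overall_timeout,
                                int alternatives_left)
{
    /* Only the long timeout ever produces a hard failure. */
    if (total_elapsed >= overall_timeout)
        return IO_GIVE_UP;

    /* The short timeout only matters while there is somewhere else to go. */
    if (request_elapsed >= failfast_timeout && alternatives_left > 0)
        return IO_TRY_ALTERNATIVE;

    /* Otherwise keep waiting (or retrying) until the long timeout fires. */
    return IO_WAIT;
}
```

Once alternatives_left reaches zero, the short timeout stops having any effect and only the long one remains.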

                                        - Bill


_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss