Re: [zfs-discuss] Availability: ZFS needs to handle disk removal / driver failure better

Richard Elling Fri, 29 Aug 2008 17:28:04 -0700

Miles Nordin wrote:
>>>>>> "re" == Richard Elling <[EMAIL PROTECTED]> writes:
>
>     re> if you use Ethernet switches in the interconnect, you need to
>     re> disable STP on the ports used for interconnects or risk
>     re> unnecessary cluster reconfigurations.
>
> RSTP/802.1w plus setting the ports connected to Solaris as ``edge'' is
> good enough, less risky for the WAN, and pretty ubiquitously supported
> with non-EOL switches.  The network guys will know this (assuming you
> have network guys) and do something like this:
>
> sw: can you disable STP for me?
>
> net: No?
>
> sw: <jumping up and down screaming>
>
> net: um,...i mean, Why?
>
> sw: [....]
>
> net: oh, that.  Ok, try it now.
>
> sw: thanks for disabling STP for me.
>
> net: i uh,.. whatever.  No problem!
>


Precisely, this is not a problem that is usually solved unilaterally.

>     re> Can we expect a similar attention to detail for ZFS
>     re> implementers?  I'm afraid not :-(.
>
> well....you weren't really ``expecting'' it of the sun cluster
> implementers.  You just ran into it by surprise in the form of an
> Issue.  

Rather, cluster implementers tend to RTFM. I know few ZFSers who
have RTFM, and do not expect many to do so... such is life.

> so, can you expect ZFS implementers to accept that running
> ZFS, iSCSI, FC-SW might teach them something about their LAN/SAN they
> didn't already know?  

No, I expect them to see a "problem" caused by network reconfiguration
and blame ZFS.  Indeed, this is what occasionally happens with Solaris
Cluster -- but only occasionally, and it is usually solved via RTFM.

> So far they seem receptive to arcane advice like
> ``make this config change in your SAN controller to let it use the
> NVRAM cache more aggressively, and stop using EMC PowerPath unless
> <blah>.''  so, Yes?
>   

I have no idea what you are trying to say here.

> I think you can also expect them to wait longer than 40 seconds before
> declaring a system is frozen and rebooting it, though.
>   

Current [s]sd driver timeouts are 60 seconds with 3-5 retries by default.
We've had those timeouts for many, many years now and do provide highly
available services on such systems.  The B_FAILFAST change did improve
the availability of systems and similar tricks have improved service
availability for Solaris Clusters.  Refer to Eric's post for more details
of this minefield.
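
As a rough illustration of what those defaults imply, the arithmetic can
be sketched as below.  This is a back-of-the-envelope bound, not a model
of the driver: it assumes every attempt runs to the full timeout and that
the retries are counted in addition to the first attempt.  The function
name is made up for the example.

```python
def worst_case_seconds(timeout_s: int, retries: int) -> int:
    """Upper bound on the time before an I/O is finally failed.

    Assumption: each attempt waits the full timeout, and the total
    number of attempts is the first try plus `retries`.
    """
    attempts = 1 + retries
    return timeout_s * attempts


# 60 second timeout with the low and high end of the 3-5 retry range
for retries in (3, 5):
    print(f"retries={retries}: up to {worst_case_seconds(60, retries)} s")
```

Under those assumptions a single stuck command can take several minutes
to be declared failed, which is well beyond the 40 seconds mentioned in
the quoted text.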

NB some bugids one should research before filing new bugs here are:
CR 4713686: sd/ssd driver should have an additional target specific timeout
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=4713686
CR 4500536 introduces B_FAILFAST
http://bugs.opensolaris.org/view_bug.do?bug_id=4500536

> ``Let's `patiently wait' forever because we think, based on our
> uncertainty, that FSPF might take several hours to converge'' is the
> alternative that strikes me as unreasonable.
>   

AFAICT, nobody is making such a proposal.  Did I miss a post?
 -- richard

_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
