Hi,
Can anyone confirm whether this is a known issue (perhaps bug 6667208) and
whether the fix is going to be pushed out to Solaris 10 anytime soon? I'm
getting badly beaten up over this weekly, essentially anytime we drop a
packet between our twenty-odd iscsi-backed zones and the filer.
Chris was kind enough to provide his synopsis here (thanks Chris):
http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSFailmodeProblem
Also, I really need a workaround in the meantime. Is someone out
there handy enough with the undocumented stuff to recommend a zdb
command or something that will pound the delinquent pool into submission
without crashing everything? Surely there's a pool hard-reset command
somewhere for the QA guys, right?
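For context, the documented route I know of looks roughly like this once a
pool wedges ('tank' below is just a stand-in for the real pool name); per
Chris's writeup these tend to simply hang once failmode=wait has tripped,
which is why I'm fishing for something stronger:

   # find out which pool is unhappy
   zpool status -x
   # (restore the iSCSI path to the filer here, by whatever means)
   # ask ZFS to clear the errors and resume I/O
   zpool clear tank
   # failing that, try forcing the pool out and back in
   zpool export -f tank
   zpool import tank

If there's a bigger hammer than these, that's exactly what I'm after.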
thx
jake
Chris Siebenmann wrote:
You write:
| Now I'd asked about this some months ago, but didn't get an answer so
| forgive me for asking again: What's the difference between wait and
| continue in my scenario? Will this allow the one faulted pool to fully
| fail and accept that it's broken, thereby allowing me to frob the iscsi
| initiator, re-import the pool and restart the zone? [...]
Our experience here in a similar iscsi-based environment is that
neither 'wait' nor 'continue' will enable the pool to recover, and that
frequently the entire system will eventually hang in a state where no
ZFS pools can be used and the system can't even be rebooted cleanly.
My primary testing has been on Solaris 10 update 6, and I wrote
up the results here:
http://utcc.utoronto.ca/~cks/space/blog/solaris/ZFSFailmodeProblem
I have recently been able to do preliminary testing on Solaris 10
update 8, and it appears to behave more or less the same.
I wish I had better news for you.
- cks
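PS for anyone else following along: the 'wait' / 'continue' business above is
the pool-level failmode property. Roughly, again with 'tank' as a stand-in
pool name:

   # show the current setting (wait is the default)
   zpool get failmode tank
   # change it; the documented values are wait, continue and panic
   zpool set failmode=continue tank

Per Chris's testing, though, neither wait nor continue actually gets the pool
back on its own once the iSCSI path drops.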