Neil,

Thanks for the quick suggestion. The hang still seems to happen even with
the "zpool set failmode=continue <pool>" option set.
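
For reference, a minimal sketch of the commands involved. It is not a
transcript of the actual session. The pool name "test" is taken from the
status output quoted below; the run helper and the DRY_RUN variable are
made up for illustration so that the script only prints the commands
unless DRY_RUN=0 is set. The final zpool clear is only expected to help
once the devices have been reattached.

```shell
#!/bin/sh
# Pool name is an assumption (taken from the quoted zpool status output).
POOL="${POOL:-test}"

# Hypothetical helper: print the command, and execute it only when
# DRY_RUN=0 (the default is to print without running anything).
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    fi
}

# Show the current failmode setting (wait, continue or panic).
run zpool get failmode "$POOL"

# Ask for I/O errors to be returned instead of blocking.
run zpool set failmode=continue "$POOL"

# Check whether the pool reports any problems.
run zpool status -x "$POOL"

# After the devices are back, clear the errors so I/O can resume.
run zpool clear "$POOL"
```

Running it as "DRY_RUN=0 sh script.sh" on the affected host would execute
the commands for real; by default it only lists them.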

Is there any other way to recover from the hang?

thanks and regards,
Karthik

 On 10/15/08 22:03, Neil Perrin wrote:
> Karthik,
>
> The pool failmode property as implemented governs the behaviour when all
> the devices needed are unavailable. The default behaviour is to wait
> (block) until the IO can continue - perhaps by re-enabling the device(s).
> The behaviour you expected can be achieved by
> "zpool set failmode=continue <pool>",
> as shown in the link you indicated below.
>
> Neil.
>
> On 10/15/08 22:38, Karthik Krishnamoorthy wrote:
>> Hello All,
>>
>>   Summary:
>>   ~~~~~~~~
>>   cp command for mirrored zfs hung when all the disks in the mirrored
>>   pool were unavailable.
>>
>>   Detailed description:
>>   ~~~~~~~~~~~~~~~~~~~~~
>>   The cp command (copying a 1GB file from NFS to ZFS) hung when all the
>>   disks in the mirrored pool (both c1t0d9 and c2t0d9) were physically
>>   removed.
>>
>>          NAME        STATE     READ WRITE CKSUM
>>          test        ONLINE      0     0     0
>>            mirror    ONLINE      0     0     0
>>              c1t0d9  ONLINE      0     0     0
>>              c2t0d9  ONLINE      0     0     0
>>
>>   We think that if all the disks in the pool are unavailable, the cp
>>   command should fail with an error (not hang).
>>
>>   Our request:
>>   ~~~~~~~~~~~~
>>   Please investigate the root cause of this issue.
>>  
>>   How to reproduce:
>>   ~~~~~~~~~~~~~~~~~
>>   1. create a zfs mirrored pool
>>   2. execute cp command from somewhere to the zfs mirrored pool.
>>   3. physically remove both disks while the cp command is running
>>     => the hang happens (the cp command never returns and we can't
>>        kill it)
>>
>> One engineer pointed me to this page
>> http://opensolaris.org/os/community/arc/caselog/2007/567/onepager/
>> and indicated that if all the mirrors are removed, ZFS enters a
>> hang-like state to prevent the kernel from going into a panic, and
>> that this type of feature would be an RFE.
>>
>> My questions are:
>>
>> Is there any documentation of the "mirror" configuration of ZFS that
>> explains what happens when the underlying drivers detect problems in
>> one of the mirror devices?
>>
>> It seems that the traditional view of a "mirror" or "RAID-1" would
>> expect the mirror to be able to proceed without interruption, and that
>> does not seem to be the case in ZFS.
>> What is the purpose of the mirror in ZFS? Is it more like an instant
>> backup? If so, what can the user do to recover when there is an
>> IO error on one of the devices?
>>
>>
>> Appreciate any pointers and help,
>>
>> Thanks and regards,
>> Karthik
>> _______________________________________________
>> zfs-discuss mailing list
>> zfs-discuss@opensolaris.org
>> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
