Eric Schrock wrote:
> On Tue, Dec 12, 2006 at 02:08:57PM -0500, James F. Hranicky wrote:
>> Sure, but that's what I want to avoid. The FMA agent should do this by
>> itself, but it's not, so I guess I'm just wondering why, or if there's
>> a good way to get it to do so. If this happens in the middle of the night I
>> don't want to have to run the commands by hand.
> 
> Yes, the FMA agent should do this.  Can you run 'fmdump -v' and see if
> the DE correctly identified the faulted devices?

Here you go:

# fmdump -v
TIME                 UUID                                 SUNW-MSG-ID
Nov 29 16:29:12.1947 e50198f2-2eb9-c58b-d7c5-87aaae5cb935 ZFS-8000-D3
  100%  fault.fs.zfs.device

        Problem in: zfs://pool=8e63f0b8e4263e71/vdev=9272c0973ecdb27c
           Affects: zfs://pool=8e63f0b8e4263e71/vdev=9272c0973ecdb27c
               FRU: -

Nov 30 10:31:48.8844 1a44a780-05c0-cb6e-d44f-f1d8999f40e5 ZFS-8000-D3
  100%  fault.fs.zfs.device

        Problem in: zfs://pool=51f1caf6cad1aa2f/vdev=769276842b0efd54
           Affects: zfs://pool=51f1caf6cad1aa2f/vdev=769276842b0efd54
               FRU: -

Dec 11 14:04:57.8803 c46d21e0-200d-43a1-e5db-ae9c9ebf3482 ZFS-8000-D3
  100%  fault.fs.zfs.device

        Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=52070de44ec80c15
           Affects: zfs://pool=2646e20c1cb0a9d0/vdev=52070de44ec80c15
               FRU: -

Dec 11 14:42:32.1271 1319464e-7a8c-e65b-962e-db386e90f7f2 ZFS-8000-D3
  100%  fault.fs.zfs.device

        Problem in: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
           Affects: zfs://pool=2646e20c1cb0a9d0/vdev=724c128cdbc17745
               FRU: -

I'm not really sure what it means.
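
For reference, the pool and vdev identifiers in the "Problem in:" lines above can be pulled out mechanically. The following is only a minimal sketch, assuming the `fmdump -v` text layout shown above; the function name `faulted_vdevs` is hypothetical, and the decimal conversion is included only on the assumption that other tools may print the same GUIDs in decimal rather than hex.

```python
import re

# Matches "Problem in:" lines in the fmdump -v layout shown above.
# The pool and vdev GUIDs appear there as lowercase hex strings.
PROBLEM_RE = re.compile(
    r"Problem in:\s*zfs://pool=([0-9a-f]+)/vdev=([0-9a-f]+)"
)


def faulted_vdevs(fmdump_text):
    """Return unique (pool_guid, vdev_guid) integer pairs, in order of appearance."""
    seen = []
    for match in PROBLEM_RE.finditer(fmdump_text):
        pair = (int(match.group(1), 16), int(match.group(2), 16))
        if pair not in seen:
            seen.append(pair)
    return seen


def describe(pairs):
    """Format each pair with both hex and decimal forms of the GUIDs."""
    return [
        f"pool={pool:x} ({pool}) vdev={vdev:x} ({vdev})"
        for pool, vdev in pairs
    ]
```

Feeding it the saved output of `fmdump -v` and printing `describe(faulted_vdevs(text))` gives one line per distinct pool/vdev pair, which may help when comparing against whatever other tools report.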

>> For instance, the zpool command hanging or the system hanging trying to
>> reboot normally.
> 
> If the SCSI commands hang forever, then there is nothing that ZFS can
> do, as a single write will never return.  The more likely case is that
> the commands are continually timing out with very long response times,
> and ZFS will continue to talk to them forever.  The future FMA
> integration I mentioned will solve this problem.  In the meantime, you
> should be able to 'zpool offline' the affected devices by hand.

Well, as long as I know which device is affected :-> If "zpool status"
doesn't return it may be difficult to figure out.

Do you know if the SATA controllers in a Thumper can better handle this
problem?

> There is also associated work going on to better handle asynchronous
> response times across devices.  Currently, a single slow device will slow
> the entire pool to a crawl.

Do you have an idea as to when this might be available?

Thanks for all your input,
Jim
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss