Hi Todd,
Sorry for the delay in responding; I've been head down rewriting
a utility for the last few days.


Todd H. Poole wrote:
> Howdy James,
> 
> While responding to halstead's post (see below), I had to restart several
> times to complete some testing. I'm not sure if that's important to these
> commands or not, but I just wanted to put it out there anyway.
> 
>> A few commands that you could provide the output from
>> include:
>>
>>
>> (these two show any FMA-related telemetry)
>> fmadm faulty
>> fmdump -v
> 
> This is the output from both commands:
> 
> [EMAIL PROTECTED]:~# fmadm faulty
> --------------- ------------------------------------  -------------- ---------
> TIME            EVENT-ID                              MSG-ID         SEVERITY
> --------------- ------------------------------------  -------------- ---------
> Aug 27 01:07:08 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a  ZFS-8000-FD    Major
> 
> Fault class : fault.fs.zfs.vdev.io
> Description : The number of I/O errors associated with a ZFS device exceeded
>               acceptable levels.  Refer to http://sun.com/msg/ZFS-8000-FD
>               for more information.
> Response    : The device has been offlined and marked as faulted.  An attempt
>               will be made to activate a hot spare if available.
> Impact      : Fault tolerance of the pool may be compromised.
> Action      : Run 'zpool status -x' and replace the bad device.
>
> [EMAIL PROTECTED]:~# fmdump -v
> TIME                 UUID                                 SUNW-MSG-ID
> Aug 27 01:07:08.2040 0d9c30f1-b2c7-66b6-f58d-9c6bcb95392a ZFS-8000-FD
>  100%  fault.fs.zfs.vdev.io
> 
>        Problem in: zfs://pool=mediapool/vdev=bfaa3595c0bf719
>           Affects: zfs://pool=mediapool/vdev=bfaa3595c0bf719
>               FRU: -
>          Location: -


In other emails in this thread you've mentioned the desire to
get an email (or some sort of notification) when Problems Happen(tm)
in your system, and the FMA framework is how we achieve that
in OpenSolaris.



# fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-transport           1.0     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
fabric-xlate             1.0     active  Fabric Ereport Translater
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
io-retire                2.0     active  I/O Retire Agent
snmp-trapgen             1.0     active  SNMP Trap Generation Agent
sysevent-transport       1.0     active  SysEvent Transport Agent
syslog-msgs              1.0     active  Syslog Messaging Agent
zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
zfs-retire               1.0     active  ZFS Retire Agent


You'll notice that we've got an SNMP agent there... and you
can acquire a copy of the FMA MIB from the Fault Management
community pages (http://opensolaris.org/os/community/fm and
http://opensolaris.org/os/community/fm/mib/).
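
If you don't want to set up an SNMP trap receiver straight away, one
crude stopgap is to poll fmadm from cron and mail yourself whatever it
reports. The following is only a sketch under a few assumptions: that
'fmadm faulty' prints nothing when no faults are outstanding, that it
is run with enough privilege to query fmd, and that mailx is able to
deliver mail on the box. The function name and the FAULT_CMD/MAIL_CMD
variables are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: mail the fault report if there is anything to report.
# FAULT_CMD and MAIL_CMD are illustrative knobs, not part of FMA itself.
FAULT_CMD=${FAULT_CMD:-"fmadm faulty"}
MAIL_CMD=${MAIL_CMD:-"mailx"}

notify_if_faulty() {
    recipient=$1
    # Capture whatever the fault-listing command prints.
    report=$($FAULT_CMD 2>&1)
    # Assumption: empty output means no outstanding faults.
    if [ -n "$report" ]; then
        printf '%s\n' "$report" |
            $MAIL_CMD -s "FMA fault on $(uname -n)" "$recipient"
        return 0
    fi
    return 1
}

# Example (e.g. from a root cron job):
#   notify_if_faulty admin@example.com
```

This only tells you that something is faulted at polling time; the
SNMP route reports events as they are diagnosed.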




>> (this shows your storage controllers and what's
>> connected to them)
>> cfgadm -lav
> 
> This is the output from cfgadm -lav
> 
> [EMAIL PROTECTED]:~# cfgadm -lav
> Ap_Id                          Receptacle   Occupant     Condition  Information
> When         Type         Busy     Phys_Id
> usb2/1                         empty        unconfigured ok
> unavailable  unknown      n        /devices/[EMAIL PROTECTED],0/pci1458,[EMAIL PROTECTED]:1
> usb2/2                         connected    configured   ok
> Mfg: Microsoft  Product: Microsoft 3-Button Mouse with IntelliEye(TM)
> NConfigs: 1  Config: 0  <no cfg str descr>
> unavailable  usb-mouse    n        /devices/[EMAIL PROTECTED],0/pci1458,[EMAIL PROTECTED]:2
> usb3/1                         empty        unconfigured ok
[snip]
> usb7/2                         empty        unconfigured ok
> unavailable  unknown      n        /devices/[EMAIL PROTECTED],0/pci1458,[EMAIL PROTECTED],1:2
> 
> You'll notice that the only thing listed is my USB mouse... is that expected?

Yup. That's one of the artefacts of the cfgadm architecture. cfgadm(1M)
works by using plugins - USB, FC, SCSI, SATA, PCI hotplug, InfiniBand...
but not IDE.

I think you were also wondering how to tell which controller
instances your disks were using in IDE mode. There are two basic
ways of finding out:

/usr/bin/iostat -En

and

/usr/sbin/format

Your IDE disks will attach using the cmdk driver and show up like this:

c1d0
c1d1
c2d0
c2d1

In AHCI/SATA mode they'd show up as

c1t0d0
c1t1d0
c1t2d0
c1t3d0

or something similar, depending on how the BIOS and the actual
controllers sort themselves out.
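
As a rough illustration of the naming difference, a small shell
function can sort device names into the two styles. The function name
and the labels it prints are my own invention, and the patterns are
deliberately loose (they only look for the presence or absence of the
"t" target field), so treat it as a sketch rather than a reliable
detector.

```shell
#!/bin/sh
# classify_disk NAME: guess the naming style from a cXdY or cXtYdZ name.
# Hypothetical helper; the labels are illustrative only.
classify_disk() {
    case $1 in
        # controller / target / disk, e.g. c1t0d0
        c[0-9]*t[0-9]*d[0-9]*) echo "target-addressed" ;;
        # controller / disk only, e.g. c1d0 (cmdk, IDE mode)
        c[0-9]*d[0-9]*)        echo "cmdk" ;;
        *)                     echo "unknown" ;;
    esac
}

# Try it on the example names from above.
for disk in c1d0 c2d1 c1t0d0 c1t3d0; do
    printf '%s\t%s\n' "$disk" "$(classify_disk "$disk")"
done
```

You could feed it the device names you see in the iostat or format
output instead of the hard-coded list.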


>> You'll also find messages in /var/adm/messages which
>> might prove
>> useful to review.
> 
> If you really want, I can list the output from /var/adm/messages, but it
> doesn't seem to add anything new to what I've already copied and pasted.

No need - you've got them if you need them.

[snip]

>> http://docs.sun.com/app/docs/coll/40.17 (manpages)
>> http://docs.sun.com/app/docs/coll/47.23 (system admin collection)
>> http://docs.sun.com/app/docs/doc/817-2271 ZFS admin guide
>> http://docs.sun.com/app/docs/doc/819-2723 devices + filesystems guide
> 
> Oohh... Thank you. Good Links. I'm bookmarking these for future reading.
> They'll definitely be helpful if we end up choosing to deploy OpenSolaris
> + ZFS for our media servers.

There's a heap of info there; getting started with it can be
like trying to drink from a fire hose :)


Best regards,
James C. McPherson
--
Senior Kernel Software Engineer, Solaris
Sun Microsystems
http://blogs.sun.com/jmcp       http://www.jmcp.homeunix.com/blog