On 03/22/21 06:46 AM, Tim Mooney via openindiana-discuss wrote:

When I boot my OI workstation (updated to 3/21/2021), I get the message

    NOTICE: One or more I/O devices have been retired

All the documentation I've found says to look at the output
from prtconf and 'fmadm faulty' to identify which device is the problem.

However, on my workstation:

# prtconf | egrep -i retire
# fmadm faulty
# fmadm faulty -a -v
# svcs -a | egrep -i fm
STATE          STIME    FMRI
disabled       20:56:19 svc:/system/fm/notify-params:default
online         20:56:49 svc:/system/fmd:default

# fmadm config
MODULE                   VERSION STATUS  DESCRIPTION
cpumem-retire            1.1     active  CPU/Memory Retire Agent
disk-lights              1.0     active  Disk Lights Agent
disk-transport           1.1     active  Disk Transport Agent
eft                      1.16    active  eft diagnosis engine
ext-event-transport      0.2     active  External FM event transport
fabric-xlate             1.0     active  Fabric Ereport Translater
fmd-self-diagnosis       1.0     active  Fault Manager Self-Diagnosis
io-retire                2.0     active  I/O Retire Agent
sensor-transport         1.1     active  Sensor Transport Agent
ses-log-transport        1.0     active  SES Log Transport Agent
software-diagnosis       0.1     active  Software Diagnosis engine
software-response        0.1     active  Software Response Agent
sysevent-transport       1.0     active  SysEvent Transport Agent
syslog-msgs              1.1     active  Syslog Messaging Agent
zfs-diagnosis            1.0     active  ZFS Diagnosis Engine
zfs-retire               1.0     active  ZFS Retire Agent

# uname -v
illumos-88a8a2ff32



Any suggestions for what I should do to identify the source of this issue?

Thanks,

Tim

Hi!

On my system, prtconf does show the retired devices,
so the commands you are using seem right.

Maybe you could try the "device driver utility" (ddu) ...

$ prtconf|grep reti
        pci8086,a114 (retired)
            pci8086,15da (retired)
                pci8086,15da (retired)
                pci8086,15da (retired)
                pci8086,15da (retired)
                    pci1028,7b1 (retired)

$ dmesg|grep reti
Mar 22 06:59:52 dell6510 genunix: [ID 888150 kern.warning] WARNING: Device not found in device tree. Skipping device unretire: /pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0/storage@3/disk@0,0

$ grep reti /var/adm/messages
Mar 22 06:59:41 dell6510 genunix: [ID 751201 kern.notice] NOTICE: One or more I/O devices have been retired
Mar 22 06:59:52 dell6510 genunix: [ID 888150 kern.warning] WARNING: Device not found in device tree. Skipping device unretire: /pci@0,0/pci8086,a114@1c,4/pci8086,15da@0/pci8086,15da@2/pci1028,7b1@0/storage@3/disk@0,0

$ sudo fmadm faulty
Password:
--------------- ------------------------------------ -------------- ---------
TIME            EVENT-ID                             MSG-ID         SEVERITY
--------------- ------------------------------------ -------------- ---------
Jan 12 17:23:25 9e4acbcd-4015-c8a6-f81e-ff479d7690cf PCIEX-8000-DJ  Major

Host        : dell6510
Platform    : Precision-7720    Chassis_id  : 49JT5M2
Product_sn  :

Fault class : fault.io.pciex.device-noresp max 18%
              fault.io.pciex.device-interr max 18%
              fault.io.pciex.bus-noresp max 9%
Affects     : dev:////pci@0,0/pci8086,a114/pci8086,15da/pci8086,15da/pci1028,7b1@0
              dev:////pci@0,0/pci8086,a114/pci8086,15da@0
              dev:////pci@0,0/pci8086,a114@1c,4
                  faulted and taken out of service
FRU         : "MB" (hc://:product-id=Precision-7720:server-id=dell6510:chassis-id=49JT5M2/motherboard=0) max 18%
                  faulty

Description : A problem has been detected on one of the specified devices or on
              one of the specified connecting buses.
              Refer to http://illumos.org/msg/PCIEX-8000-DJ for more
              information.

Response    : One or more device instances may be disabled

Impact      : Loss of services provided by the device instances associated with
              this fault

Action      : If a plug-in card is involved check for badly-seated cards or
              bent pins. Otherwise schedule a repair procedure to replace the
              affected device(s).  Use fmadm faulty to identify the devices or
              contact your illumos distribution team for support.
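
The checks above (prtconf and the system log) could be wrapped in two small filters, roughly as sketched below. This is only a sketch: the function names list_retired and unretire_skipped are made up for illustration, and the patterns assume the output looks like the samples shown in this message; other prtconf or log formats may need different expressions.

```shell
# Hypothetical helpers (names are illustrative, not part of illumos).
# Both read text on stdin so they can be fed from a command or a file.

# Print the node name of every prtconf line marked "(retired)".
# Assumes lines look like "        pci8086,a114 (retired)".
list_retired() {
    sed -n 's/^[[:space:]]*\(.*\) (retired)$/\1/p'
}

# Print the device path from each "Skipping device unretire:" log message.
# Assumes the path is the last thing on the log line.
unretire_skipped() {
    sed -n 's/.*Skipping device unretire: \(.*\)$/\1/p'
}
```

On the affected host these would be run as "prtconf | list_retired" and "unretire_skipped < /var/adm/messages". If both print nothing while the boot NOTICE still appears, that matches the situation described in the original question, and the filters will not help any further.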

