The OS booted up and the SAS controller was now detected and supported by
the mpt(4) driver:
---
mpt0: <LSILogic SAS Adapter> port 0xec00-0xecff mem 0xfc4fc000-0xfc4fffff,
0xfc4e0000-0xfc4effff irq 64 at device 8.0 on pci2
mpt0: Reserved 0x100 bytes for rid 0x10 type 4 at 0xec00
mpt0: Reserved 0x4000 bytes for rid 0x14 type 3 at 0xfc4fc000
mpt0: [GIANT-LOCKED]
mpt0: MPI Version=1.5.12.0
---

And the related errors showed up immediately, for the first time:
---
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
mpt0: mpt_cam_event: 0x12
mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).
mpt0: mpt_cam_event: MPI_EVENT_SAS_DEVICE_STATUS_CHANGE
mpt0: mpt_cam_event: MPI_EVENT_SAS_DEVICE_STATUS_CHANGE
mpt0: mpt_cam_event: 0x16
mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).
---

These are device arrival events.


When the bootstrap process reached the SCSI probe, there was
no activity on the screen for about five minutes, so I was forced to use
the power-off button. After rebooting, the same symptoms were evident,
so I rebooted the machine once again, this time in verbose mode.

This debug information was being printed on the screen, one character at a time,
at about 1 char/sec:

(probe8:mpt0:0:8:0): error 22

What's at target 8? It isn't happy for a variety of reasons. Oh, I see
from below: it's an SES instance that drops dead if given something at
lun 0.

(probe8:mpt0:0:8:0): Unretryable Error
---
pass0 at mpt0 bus 0 target 0 lun 0
pass0: <MAXTOR ATLAS15K2_073SAS BP00> Fixed Direct Access SCSI-5 device
As a workaround, I disabled the APICs (hint.apic.0.disabled),
and that ~15-minute delay at boot-up was gone. Fine.

(BTW, 7-CURRENT has the same problem, but without that huge delay)

Do you have APIC disabled for 7-CURRENT also?


Once I was logged in to the server, I proceeded to populate my ports tree
using portsnap(8). When I extracted the tarball (portsnap extract),
the following error message appeared repeatedly, at about 1 message per second:

mpt0: Unhandled Event Notify Frame. Event 0xe (ACK not required).

Queue Full events from the SAS firmware.
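
To get a feel for how often each unhandled event fires, the console messages can be tallied with a short script. This is only a sketch: it assumes the console output was saved as text in the format shown above, and the labels are simply the descriptions given in this thread (0x16 and 0x12 as device arrival, 0xe as Queue Full), not names taken from the driver source. The function name tally_unhandled_events is made up for illustration.

```python
import re
from collections import Counter

# Labels taken from the descriptions in this thread, not from the mpt(4) source.
THREAD_LABELS = {
    0x16: "device arrival (per this thread)",
    0x12: "device arrival (per this thread)",
    0x0E: "Queue Full (per this thread)",
}

# Matches lines such as:
#   mpt0: Unhandled Event Notify Frame. Event 0xe (ACK not required).
EVENT_RE = re.compile(r"Unhandled Event Notify Frame\. Event (0x[0-9a-fA-F]+)")


def tally_unhandled_events(lines):
    """Count unhandled event codes found in console log lines."""
    counts = Counter()
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "mpt0: mpt_cam_event: 0x16",
        "mpt0: Unhandled Event Notify Frame. Event 0x16 (ACK not required).",
        "mpt0: Unhandled Event Notify Frame. Event 0x12 (ACK not required).",
        "mpt0: Unhandled Event Notify Frame. Event 0xe (ACK not required).",
    ]
    for code, count in sorted(tally_unhandled_events(sample).items()):
        print(f"{code:#x}: {count} ({THREAD_LABELS.get(code, 'unknown')})")
```

Feeding it the saved output of a whole portsnap run would show whether the 0xe messages really dominate, as the one-per-second rate suggests.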


Once in a while, an error message like the one below showed up:
---
(da0:mpt0:0:0:0): WRITE(10). CDB: 2a 0 1 55 6f 5f 0 0 20 0
(da0:mpt0:0:0:0): CAM Status: SCSI Status Error
(da0:mpt0:0:0:0): SCSI Status: Check Condition
(da0:mpt0:0:0:0): UNIT ATTENTION asc:29,2
(da0:mpt0:0:0:0): Scsi bus reset occurred

Somebody is resetting the bus periodically. We (FreeBSD) aren't
volitionally doing this that I'm aware of here.
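
For anyone wanting to see which blocks the interrupted write was aimed at, the CDB bytes in the log above can be decoded by hand. The sketch below assumes the standard WRITE(10) layout (opcode 0x2a in byte 0, a big-endian logical block address in bytes 2-5, a big-endian transfer length in bytes 7-8); the helper name decode_write10 is hypothetical.

```python
def decode_write10(cdb_text):
    """Decode a WRITE(10) CDB printed as space-separated hex bytes.

    Returns (lba, blocks): the starting logical block address and the
    number of blocks to transfer.
    """
    cdb = [int(byte, 16) for byte in cdb_text.split()]
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(bytes(cdb[2:6]), "big")     # bytes 2..5
    blocks = int.from_bytes(bytes(cdb[7:9]), "big")  # bytes 7..8
    return lba, blocks


if __name__ == "__main__":
    # CDB copied from the da0 error message quoted above.
    lba, blocks = decode_write10("2a 0 1 55 6f 5f 0 0 20 0")
    print(f"LBA {lba}, {blocks} blocks")  # LBA 22376287, 32 blocks
```

So the command that drew the UNIT ATTENTION was a 32-block write; nothing about the command itself looks unusual, which fits the reading that the reset came from elsewhere.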

In order to perform those diagnostics, I had to install SUSE Linux
Enterprise Server 9, which was also shipped with this machine.

Which is a good way of saying that LSI-Logic support isn't very
evident on FreeBSD.


After reinstalling FreeBSD, I logged remotely into the server, via ssh,
and fetched the ports snapshot again and extracted once more.

Suddenly, the screen activity ceased and the network connection timed out.

Locally, on the server, there were a lot of mpt(4) errors and warnings:
---
(da0:mpt0:0:0:0): CAM Status 0x18
(da0:mpt0:0:0:0): Retrying Command
(... and about 500 more lines like those...)

Hmm.

---

And finally, those errors from mpt(4):

---
request 0xc4c4a080:44717 timed out for ccb 0xc4e41400 (req->ccb 0xc4e41400)
request 0xc4c4b430:44718 timed out for ccb 0xc4ca5800 (req->ccb 0xc4ca5800)
request 0xc4c4cd80:44719 timed out for ccb 0xc4c52800 (req->ccb 0xc4c52800)
(... and about 300 more lines like those ...)
---

which were followed by the same number of lines like these:
---
mpt0: completing timedout/aborted req 0xc4c4a080:44717
mpt0: completing timedout/aborted req 0xc4c4b430:44718
mpt0: completing timedout/aborted req 0xc4c4cd80:44719
---

and finishing with this line:
---
mpt0: Timedout requests already complete. Interrupts may not be functioning.
---


I've seen this on Supermicro EM64T in the past on 7-CURRENT, but that
went away about 3-4 weeks ago. It really seemed to me that this was
indeed an interrupt-related problem.
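
The post says every timeout line was followed by a matching completion line. One way to check that against a saved console log is to pair the request identifiers. This is a rough sketch that assumes the two message formats quoted above; unmatched_timeouts is a made-up helper name.

```python
import re

# request 0xc4c4a080:44717 timed out for ccb 0xc4e41400 (...)
TIMEOUT_RE = re.compile(r"request (0x[0-9a-f]+:\d+) timed out")
# mpt0: completing timedout/aborted req 0xc4c4a080:44717
COMPLETE_RE = re.compile(r"completing timedout/aborted req (0x[0-9a-f]+:\d+)")


def unmatched_timeouts(lines):
    """Return request ids that timed out but never show a completion line."""
    timed_out = set()
    completed = set()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            timed_out.add(match.group(1))
            continue
        match = COMPLETE_RE.search(line)
        if match:
            completed.add(match.group(1))
    return timed_out - completed


if __name__ == "__main__":
    sample = [
        "request 0xc4c4a080:44717 timed out for ccb 0xc4e41400 (req->ccb 0xc4e41400)",
        "request 0xc4c4b430:44718 timed out for ccb 0xc4ca5800 (req->ccb 0xc4ca5800)",
        "mpt0: completing timedout/aborted req 0xc4c4a080:44717",
        "mpt0: completing timedout/aborted req 0xc4c4b430:44718",
    ]
    print(unmatched_timeouts(sample))
```

An empty result would mean every timed-out request was later completed, consistent with the driver's own "Interrupts may not be functioning" message.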

Yup, sounds like a mess here.
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable