----- Original Message -----
From: "Atanas" <[EMAIL PROTECTED]>
To: "Mark Dotson" <[EMAIL PROTECTED]>
Cc: <freebsd-stable@freebsd.org>
Sent: Thursday, November 16, 2006 4:07 AM
Subject: Re: twa: Passthru request timed out! Resetting controller...
Mark Dotson said the following on 11/14/06 1:18 PM:
I've had continued problems with the 3ware series SATA cards and the Tyan
boards. Specifically, I have a "Tyan S5360-1U" and both a 9500S-4LP and
an 8506 series 3ware card.
In my case the first error is different, but the 'resetting' over and
over is VERY familiar. This could be triggered by a simple file copy
from one part of a container to another, degrading the unit and
triggering the resetting crap. Note that the drives are fine; I tested
that first thing.
Sep 8 11:59:23 localhost kernel: 3w-9xxx: scsi0: WARNING: (0x06:0x002C): Unit #1: Command (0x2a) timed out, resetting card.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=0.
Sep 8 11:59:41 localhost kernel: 3w-9xxx: scsi0: AEN: INFO (0x04:0x005E): Cache synchronized after power fail:unit=1.
I also found this problem to exist across platforms, not just FreeBSD.
For example, the excerpt above is from a CentOS box.
All tests were done with newest firmware for both card and mobo, and
using the newest drivers provided by 3ware.
Once I removed the card and drives from the Tyan system and stuck them in
pretty much ANY other system, they worked fantastically.
I don't have an answer for the "resetting problem" as of yet... 3ware and
Tyan (And my system vendor "Appro") are still trying to find my specific
problem and solve it. I believe they are currently doing the "replace
everything" method of troubleshooting.
Mark, thank you.
It's good to know that the resetting problem exists on other platforms too.
We already found out that replacing the entire box with an identical one
doesn't help, so unfortunately we'll have to start replacing components
with different brands or models.
I wouldn't like to touch the I/O subsystem (these are already loaded
production machines), so like you said, the safest bet would be to try
another motherboard.
However, I don't see many dual Opteron boards in 3ware's compatibility
list. The next one that comes to mind from that list is the Supermicro
H8DC8, but it looks more like a gamer's dream (high-end PCI-e graphics,
SLI, etc., but no on-board VGA) than a server board.
I'm quite surprised that the top Opteron-based motherboard manufacturer
makes 2 of the 5 boards marked as compatible in the 3ware motherboard
compatibility docs:
http://3ware.com/products/pdf/Motherboard_compatibility_list_9550SX_2006_06.pdf
and yet those boards perform so badly with 3ware cards.
I know what happens on this mailing list when somebody asks for good
SATA cards (Re: 3ware, 3ware, ...); I have replied that way myself too.
So are there any success stories with the 3ware 9550SX (SATA II) and dual
AMD Opteron server boards, or is it time to go back to Intel?
Regards,
Atanas
It's time to go with another SATA2 RAID controller card. I have an Areca 8
port PCI-X controller card (www.areca.com.tw).
Running it on a Tyan Thunder motherboard with dual AthlonMP and I've had no
issues with it yet.
I've got 8 drives on it in 2 volumes of 4 drives each. I'm getting what I
consider to be good read/write speeds to the array.
It also supports many things that 3ware did not at the time I bought it,
such as online volume expansion.
homer# dd if=/dev/zero of=test.file bs=65536 count=16384
16384+0 records in
16384+0 records out
1073741824 bytes transferred in 7.000588 secs (153378801 bytes/sec)
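
As a quick sanity check on the numbers above, the reported rate can be recomputed from the block size, block count, and elapsed time that dd printed. This is only a minimal Python sketch: the function name dd_throughput is made up for illustration, and the MiB/s conversion is an added convenience, not something dd reported.

```python
def dd_throughput(block_size, count, seconds):
    """Return (total_bytes, bytes_per_sec, mib_per_sec) for a dd run."""
    total_bytes = block_size * count          # bs * count from the dd command line
    bytes_per_sec = total_bytes / seconds     # same unit dd prints in parentheses
    mib_per_sec = bytes_per_sec / (1024 * 1024)
    return total_bytes, bytes_per_sec, mib_per_sec


# Values taken from the dd run quoted above.
total, bps, mibps = dd_throughput(65536, 16384, 7.000588)
print(f"{total} bytes, {bps:.0f} bytes/sec, {mibps:.1f} MiB/s")
```

The recomputed rate can differ from dd's own figure in the last few digits, because the elapsed time is printed rounded to six decimal places. Keep in mind this is a single sequential write, so it says little about random I/O.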
-Clay
Atanas wrote:
Has anyone experienced this:
twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20
twa0: INFO: (0x16: 0x1108): Resetting controller...:
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
...
twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=7
twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1
twa0: INFO: (0x16: 0x1107): Controller reset done!:
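
For anyone who wants to scan a saved log for these events, the messages above can be picked apart with a short script. What follows is a rough Python sketch under stated assumptions: the regular expression is derived only from the lines shown in this post (with the wrapped first line joined), and the field names code_major and code_minor are my own labels for the two hex values, not names taken from the twa driver.

```python
import re
from collections import Counter

# Matches lines shaped like:
#   twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0
# The layout is inferred from the log excerpt above, not from driver docs.
TWA_LINE = re.compile(
    r"^(?P<dev>twa\d+): (?P<severity>[A-Z]+): "
    r"\((?P<code_major>0x[0-9A-Fa-f]+): (?P<code_minor>0x[0-9A-Fa-f]+)\): "
    r"(?P<message>.*)$"
)


def parse_twa_line(line):
    """Return a dict of fields for a twa message, or None if it doesn't match."""
    m = TWA_LINE.match(line.strip())
    return m.groupdict() if m else None


# Sample lines copied from the excerpt above.
sample = [
    "twa0: ERROR: (0x05: 0x2018): Passthru request timed out!: request = 0xca839d20",
    "twa0: INFO: (0x16: 0x1108): Resetting controller...:",
    "twa0: INFO: (0x04: 0x005E): Cache synchronization completed: unit=0",
    "twa0: INFO: (0x04: 0x0001): Controller reset occurred: resets=1",
    "twa0: INFO: (0x16: 0x1107): Controller reset done!:",
]

events = [e for e in (parse_twa_line(l) for l in sample) if e is not None]
by_severity = Counter(e["severity"] for e in events)
errors = [e for e in events if e["severity"] == "ERROR"]

print(dict(by_severity))
for e in errors:
    print(e["dev"], e["code_major"], e["code_minor"], e["message"])
```

Lines that don't fit the pattern (such as the "..." placeholder) are simply skipped, so the same loop can be pointed at a whole log file to see how often the timeouts recur.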
This happens on 6.2-PRERELEASE i386 (and on 6.1 since its release) on a
number of machines with the following hardware configuration:
- Tyan K8SE 2892, 2 AMD Opteron 270 CPUs, 4GB RAM
- 3ware 9550SX-8LP, 8 500GB Seagate ST3500641AS SATA drives
(configured as 8 SINGLE DISK units, aka JBOD)
All hardware components, including the server chassis, are listed in the
3ware hardware compatibility lists. It doesn't seem to be a cabling or
power issue. The controller and hard drives are already flashed to the
latest firmware revisions. I tried turning off NCQ, but it didn't make
any difference. I tried also switching the kernel from PAE to non-PAE
(reducing the usable memory to 3GB), but it didn't help either.
I have other machines with similar I/O configurations (3ware), but
with Intel motherboards and running FreeBSD-5.5, and these have been
running fine for about a year. Now I'm thinking about swapping the drives
between a working Intel box and an AMD based box, to see which one the
controller timeouts follow.
The problem happens sporadically, about once a month, and is very hard
to reproduce. Sometimes it takes several weeks until the next crash
happens; sometimes it crashes again in just a few hours.
When the thing happens, the kernel sometimes panics (most likely due to
the inconsistent filesystem state caused by the controller reset),
sometimes it just hangs. It can be interrupted (I have a serial console),
but the only usable thing after that seems to be "call cpu_reset()",
followed by a full (and sometimes painfully long) filesystem check.
Here are the diffs against the default GENERIC and PAE kernel
configurations:
< cpu I486_CPU
< ident GENERIC
< options INET6 # IPv6 communications protocols
< options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
> options QUOTA
> options SMP # Symmetric MultiProcessor Kernel
> options BREAK_TO_DEBUGGER
> options DDB
> options KDB
> options KDB_UNATTENDED
> options IPFIREWALL
> options DUMMYNET
I'm attaching the dmesg.boot following the latest crash.
Regards,
Atanas
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "[EMAIL PROTECTED]"