I'm looking for any pointers or advice on what might have happened
to cause the following problem...

Setup:
Two X4500 / Sol 10 U5 iSCSI servers; four T1000 S10 U4 -> U5 Oracle
RAC DB heads as iSCSI clients.

iSCSI is set up using ZFS volumes with shareiscsi=on.  The (slightly
weird) part: we partitioned the disks to get the maximum number of
spindles available for "pseudo-RAID 10" performance zpools (500 GB
disks, 465 GB usable, partitioned as 115 GB for the "fast" DB, 345 GB
for the "archive" DB, and 5 GB for a "utility" slice used for the OCR
and VOTE partitions in RAC).  Disks on each server are laid out the
same way; the active zpool disks go into 7 "fast" pools (the "fast"
slice on target 1 of each SATA controller together in one pool,
target 2 on each in a second pool, and so on), 7 "archive" pools and
7 "utility" pools.  "fast" and "utility" are zpool pseudo-RAID 10,
"archive" is raidz.  Fixed-size ZFS volumes are built to the full
capacity of each pool.
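
For reference, roughly how one "fast" pool and its shared volume get
built.  This is a sketch only; the slice names, pool name and zvol
size below are illustrative rather than our exact layout:

# zpool create fast1 mirror c0t1d0s0 c1t1d0s0 mirror c4t1d0s0 c5t1d0s0
    ("pseudo-RAID 10": a stripe of mirrored "fast" slices)
# zfs create -V 330g fast1/fastvol
    (fixed-size zvol filling the pool)
# zfs set shareiscsi=on fast1/fastvol
    (exports the zvol through the bundled Solaris iSCSI target)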

The clients were S10U4 when we first spotted this, we upgraded them
all to S10U5 as soon as we noticed that, but the problem happened
again last week.  The X4500s have been S10U5 since they were installed.


Problem:
Both servers have experienced a failure mode which initially
manifested as an Oracle RAC crash and proved, via testing, to be
ignored iSCSI writes to the "fast" partitions.

Test case: 
(/tmp/zero is a 1 KB file full of zeros; its creation is sketched
below, after the transcript)
# dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
nÉçORCLDISK
FDATA_0008FDATAFDATA_0008ö*Én¨ö*íSô¼>Ú
ö*5|1+0 records in
1+0 records out
# dd of=/dev/rdsk/c2t42d0s6 if=/tmp/zero bs=1k count=1
1+0 records in
1+0 records out
# dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1
nÉçORCLDISK
FDATA_0008FDATAFDATA_0008ö*Én¨ö*íSô¼>Ú
ö*5|1+0 records in
1+0 records out
#
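
(For completeness, the test file and a more readable read-back check
look roughly like this, on the same device as above; od just dumps
the block as text instead of raw binary:

# dd if=/dev/zero of=/tmp/zero bs=1k count=1
# dd if=/dev/rdsk/c2t42d0s6 bs=1k count=1 | od -c | head

On a healthy LUN the od output goes to all zeros after the write; on
the broken volume it comes back unchanged.)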


Once this started happening, the same write behavior appeared
immediately on all clients, including new ones which had never
previously been connected to the iSCSI server.

We can write a block of all 0s, or all A's, to any iSCSI device other
than the problem one and read it back fine.  But the misbehaving one
consistently refuses to actually commit writes: it accepts the write
and returns success, yet every read comes back with the old data.
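
Something like the following runs that cross-check over a whole set
of LUNs (the device list here is illustrative):

# perl -e 'print "A" x 1024' > /tmp/aaa
# for d in c2t40d0s6 c2t41d0s6 c2t42d0s6 c2t43d0s6
> do
>   dd of=/dev/rdsk/$d if=/tmp/aaa bs=1k count=1
>   dd if=/dev/rdsk/$d bs=1k count=1 | od -c | head -2
> done

Every device except the misbehaving one reads the A's back.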

zpool status, zfs list, /var/adm/messages, and everything else we
look at on the servers say they're all happy and fine.  But obviously
there's something very wrong with the particular volume / pool that's
giving us problems.
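
(Specifically, checks along these lines; the iscsitadm line assumes
the bundled Solaris iSCSI target that shareiscsi uses:

# zpool status -xv             -> "all pools are healthy"
# zfs list -t volume           -> all volumes present, sizes as expected
# iscsitadm list target -v     -> targets online, sessions established
# tail -100 /var/adm/messages  -> no sd, iscsi or ZFS errors logged)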

A coworker fixed it the first time by running a manual resilver;
once that was underway, writes did the right thing again.  But that
was just a shot in the dark - we saw no errors and no clear reason to
resilver.
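
For anyone wondering what a "manual resilver" amounts to here: on
these mirrored pools one can be kicked off by detaching and
re-attaching one side of a mirror, or a scrub run instead.  With the
illustrative names from the sketch above:

# zpool scrub fast1
# zpool detach fast1 c1t1d0s0
# zpool attach fast1 c0t1d0s0 c1t1d0s0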

We saw it again, and it blew up the just-about-to-go-live database,
and we had to cut over to SAN storage to hit the deploy window.

It's happened on both of the X4500s we were using for iSCSI, so it's
not a hardware problem isolated to a single box.

I have preserved the second failed system in its broken state in case
someone has ideas for more diagnostics.

I have an open support ticket, but so far no hint at a solution.

Anyone on list have ideas?


Thanks....

-george william herbert
[EMAIL PROTECTED]
