Hi!

I have a problem with ZFS, most likely caused by the SATA PCI-X controllers.
I run OpenSolaris 2008.11 (snv_98) and my hardware is a Sun Netra X4200 M2
with 3 SiI3124 PCI-X cards, with 4 eSATA ports each, connected to three 1U
disk chassis which each hold 4 Seagate ES.2 SATA disks (500 and 750 GB),
for a total of 12 disks. Every disk has its own eSATA cable connected to a
port on the PCI-X cards.

The problem is that disk access seems to stop for a few seconds and then
continue. This happens every few seconds, and the end result is that
performance is terrible, to the point of being unusable.

The idea was to use this box to serve iSCSI to a Windows 2003 Server.
However, running IOmeter on the Windows box and watching Task Manager, I
noticed that the throughput pulses between 90% and 0% all the time.
Investigating further, I found that I get the same behavior during a
simple cp on the local host.
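
By the way, an easy way to watch this locally is to run a large cp in one
terminal and keep an eye on the pool in another:

zpool iostat -v zfsatan 1

During the hangs you should see the write bandwidth for the whole pool drop
to zero even though the cp is still running.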

/usr/X11/bin/scanpci gives me this information:

pci bus 0x0006 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
  Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

pci bus 0x0084 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
  Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

pci bus 0x0088 cardnum 0x01 function 0x00: vendor 0x1095 device 0x3124
  Silicon Image, Inc. SiI 3124 PCI-X Serial ATA Controller

c5*, c6* and c7* are the eSATA disks.
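
If you want to double-check which controller each target maps to, cfgadm
lists the sata ports and the disks attached to them (at least that is my
understanding of how the sata framework names things):

cfgadm -al | grep sata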

zpool create -f zfsatan \
  mirror c5t0d0 c5t1d0 mirror c5t2d0 c5t3d0 \
  mirror c6t0d0 c6t1d0 mirror c6t2d0 c6t3d0 \
  mirror c7t0d0 c7t1d0 mirror c7t2d0 c7t3d0
zfs create zfsatan/fs01
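
A quick sanity check of the layout afterwards; this should show six two-way
mirror vdevs with all 12 disks ONLINE:

zpool status zfsatan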

-bash-3.2# time dd if=/dev/zero bs=1024x1024x1024 count=8 of=/zfsatan/fs01/storfil
8+0 records in
8+0 records out

real    2m58.863s
user    0m0.001s
sys     0m10.636s

For this run that works out to 8192/178, i.e. around 46 MB/s... That is
really sucky speed for 12 drives. The speed varies between runs, though,
since the hang-ups seem to occur at random and last for a random amount
of time.
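
Breaking the math down further: ~46 MB/s spread across six mirror vdevs is
only about 7.7 MB/s per mirror, and a single ES.2 drive should manage
sequential writes several times that on its own, so the disks themselves
can hardly be the bottleneck.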

If you look at the output from iostat -cxn 1 below, you can see that the
first sample is okay, but in the second one several disks sit at 100 %w...
and they stay at 100 %w for a few seconds.
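
A variant that makes the stalls easier to spot:

iostat -xnz 1

The -z flag suppresses devices whose counters are all zero, so the idle
system disks drop out and the stuck ones stand out.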

     cpu
us sy wt id
  0 34  0 66
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0  400.9    0.0 49560.1 14.1  0.5   35.2    1.2  47  48 c5t0d0
    0.0  156.0    0.0 18327.1  4.6  0.2   29.5    1.1  17  18 c5t1d0
    0.0    7.0    0.0  132.0  2.7  0.0  386.0    4.9  56   2 c5t2d0
    0.0  293.0    0.0 36735.2 13.4  0.3   45.6    1.1  89  34 c5t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t1d0
    0.0  142.0    0.0 17409.8  4.9  0.2   34.9    1.4  20  20 c6t0d0
    0.0  350.0    0.0 44030.5 12.6  0.4   36.0    1.3  44  44 c6t1d0
    0.0  291.0    0.0 34599.7  9.6  0.3   33.1    1.2  34  35 c6t2d0
    0.0  334.0    0.0 40231.0 11.3  0.4   34.0    1.2  39  40 c6t3d0
    0.0  241.0    0.0 28210.0 18.1  0.3   75.0    1.1  77  27 c7t0d0
    0.0  317.0    0.0 38064.8 10.6  0.4   33.4    1.2  38  38 c7t1d0
    0.0  162.0    0.0 18455.7  4.5  0.2   27.6    1.1  18  18 c7t2d0
    0.0  162.0    0.0 18455.7  4.5  0.2   27.7    1.1  18  18 c7t3d0
     cpu
us sy wt id
  0 22  0 78
                    extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c3t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c4t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c0t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c5t1d0
    0.0    0.0    0.0    0.0  5.0  0.0    0.0    0.0 100   0 c5t2d0
    0.0    0.0    0.0    0.0  5.0  0.0    0.0    0.0 100   0 c5t3d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c8t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c6t3d0
    0.0    0.0    0.0    0.0 21.0  0.0    0.0    0.0 100   0 c7t0d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c7t1d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c7t2d0
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.0   0   0 c7t3d0
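
One thing I am thinking of trying during a stall (not sure yet if it proves
anything) is to watch whether the controllers' interrupt counters stop
moving, which would fit the lost-interrupt bug below:

intrstat 1

or, digging at the kernel directly:

echo ::interrupts | mdb -k

If the si3124 rows sit frozen while the disks show 100 %w, that would point
at the driver or the cards rather than at ZFS.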


Possibly related bugs:

Disk access stops for minutes with 100% blocking
http://bugs.opensolaris.org/view_bug.do?bug_id=6544624

si3124 driver loses interrupts.
http://bugs.opensolaris.org/view_bug.do?bug_id=6566207
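
A workaround I have seen suggested (untested by me, and I am not sure the
tunable names are exact for snv_98) is to throttle the queueing in
/etc/system, since the si3124 problems seem to be NCQ-related:

* limit the sata framework's per-device queue depth, effectively
* disabling NCQ (assuming this tunable exists in this build)
set sata:sata_max_queue_depth = 0x1

* limit the number of I/Os ZFS keeps outstanding per vdev
set zfs:zfs_vdev_max_pending = 10

and then reboot. If the stalls disappear with NCQ off, that pretty much
fingers the si3124 driver.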

Any ideas? Should I ditch the SiI3124 cards, as they seem to have a bad rep
on this mailing list? I have ordered a Sun SG-XPCI8SAS-E-Z, which is a SAS
PCI-X card, but it will cost a lot more money without adding any extra
benefit... except that it might actually work ;)

-J


-----------------------------------------------------
Janåke Rönnblom
Phone  : +46-910-699 180
Mobile : 070-397 07 43
URL    : http://www.ronnblom.se
-----------------------------------------------------
"Those who do not understand Unix are condemned to reinvent it,  
poorly." -- Henry Spencer


