On Dec 27, 2009, at 10:21 PM, Joe Little wrote:
I've had this happen to me too. I found some dtrace scripts at the
time that showed the file system was spending too much time finding
available 128k blocks (or thereabouts) because each disk was nearly
full, even though combined I still had 140GB left of my 3TB pool. The
SPA code, I believe, was spending too much time walking the available
pool for contiguous space for new writes, and this affected both read
and write performance dramatically (measured in kb/sec).
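For anyone who wants to reproduce that kind of measurement, a minimal
sketch (assuming the fbt provider can probe metaslab_alloc() in the
zfs module; this is not the original script) looks like:

  #!/usr/sbin/dtrace -s
  /* Rough timing of ZFS block allocation; metaslab_alloc() is where
     the SPA picks space for new writes. Run while the pool is busy. */
  fbt:zfs:metaslab_alloc:entry
  {
          self->ts = timestamp;
  }

  fbt:zfs:metaslab_alloc:return
  /self->ts/
  {
          @["metaslab_alloc latency (ns)"] = quantize(timestamp - self->ts);
          self->ts = 0;
  }

On a nearly full pool the histogram tends to skew toward much longer
allocation times than on a pool with plenty of free space.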
I was able to alleviate the pressure, so to speak, by adjusting the
recordsize for the pool down to 8k (32k is probably the better
recommendation), and from there I could start to clear out space.
Anything below 10% available space seems to make ZFS behave poorly,
and the lower you get, the worse it becomes. But the root cause was
metadata management on pools with less than 5-10% disk space left.
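For reference, recordsize only affects files written after the change,
so space still has to be cleared (or data rewritten) for it to help;
the commands involved are simply (dataset name taken from this thread):

  # zfs set recordsize=8k DATA      (or recordsize=32k)
  # zfs get recordsize DATA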
The better solution is to use b129 or later, where CR 6869229,
"zfs should switch to shiny new metaslabs more frequently," was
integrated.
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6869229
-- richard
In my case, I had lots of symlinks, lots of small files, and also
dozens of snapshots. My pool was a RAID10 equivalent (i.e., three
mirror sets striped).
On Sun, Dec 27, 2009 at 4:52 PM, Morten-Christian Bernson
<m...@uib.no> wrote:
Lately the zfs pool in my home server has degraded to a state where
it can be said it doesn't work at all. Read speed is slower than what
I can pull from the internet on my slow DSL line... This is compared
to just a short while ago, when I could read from it at over
50mb/sec over the network.
My setup:
Running the latest Solaris 10:
# uname -a
SunOS solssd01 5.10 Generic_142901-02 i86pc i386 i86pc
# zpool status DATA
  pool: DATA
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        DATA        ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            c2t5d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
            c2t2d0  ONLINE       0     0     0
        spares
          c0t2d0    AVAIL

errors: No known data errors
# zfs list -r DATA
NAME   USED  AVAIL  REFER  MOUNTPOINT
DATA  3,78T   229G  3,78T  /DATA
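Those numbers put the pool at roughly 94% full (229G free out of
about 4T usable), well inside the range where the allocation slowdown
described above kicks in. The pool-level view (which counts raw space
including parity, so the percentage reads slightly differently) is
also available from the CAP column of:

  # zpool list DATA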
All of the drives in this pool are 1.5TB Western Digital Green
drives. I am not seeing any error messages in /var/adm/messages,
and "fmdump -eV" shows no errors... However, I am seeing some
soft errors in "iostat -eEn":
---- errors ---
s/w h/w trn tot device
2 0 0 2 c0t0d0
1 0 0 1 c1t0d0
2 0 0 2 c2t1d0
151 0 0 151 c2t2d0
151 0 0 151 c2t3d0
153 0 0 153 c2t4d0
153 0 0 153 c2t5d0
2 0 0 2 c0t1d0
3 0 0 3 c0t2d0
0 0 0 0 solssd01:vold(pid531)
c0t0d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
Vendor: Sun Product: STK RAID INT Revision: V1.0 Serial No:
Size: 31.87GB <31866224128 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c1t0d0 Soft Errors: 1 Hard Errors: 0 Transport Errors: 0
Vendor: _NEC Product: DVD_RW ND-3500AG Revision: 2.16 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
c2t1d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: SAMSUNG HD753LJ Revision: 1113 Serial No:
Size: 750.16GB <750156373504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c2t2d0 Soft Errors: 151 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD15EADS-00R Revision: 0A01 Serial No:
Size: 1500.30GB <1500301909504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 151 Predictive Failure Analysis: 0
c2t3d0 Soft Errors: 151 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD15EADS-00R Revision: 0A01 Serial No:
Size: 1500.30GB <1500301909504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 151 Predictive Failure Analysis: 0
c2t4d0 Soft Errors: 153 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD15EADS-00R Revision: 0A01 Serial No:
Size: 1500.30GB <1500301909504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 153 Predictive Failure Analysis: 0
c2t5d0 Soft Errors: 153 Hard Errors: 0 Transport Errors: 0
Vendor: ATA Product: WDC WD15EADS-00R Revision: 0A01 Serial No:
Size: 1500.30GB <1500301909504 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 153 Predictive Failure Analysis: 0
c0t1d0 Soft Errors: 2 Hard Errors: 0 Transport Errors: 0
Vendor: Sun Product: STK RAID INT Revision: V1.0 Serial No:
Size: 31.87GB <31866224128 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 2 Predictive Failure Analysis: 0
c0t2d0 Soft Errors: 3 Hard Errors: 0 Transport Errors: 0
Vendor: Sun Product: STK RAID INT Revision: V1.0 Serial No:
Size: 1497.86GB <1497859358208 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 3 Predictive Failure Analysis: 0
I am curious why the "Illegal request" counter keeps going up. The
machine was rebooted ~11 hours ago, and the counter increases
whenever I try to use the pool...
The machine is quite powerful, and top shows no cpu load, no iowait
and plenty of available memory. The machine basically isn't doing
anything at the moment, yet it can still take several minutes to
copy a 300mb file from somewhere in the pool to /tmp/...
# top
last pid: 1383;  load avg: 0.01, 0.00, 0.00;  up 0+10:47:57  01:39:17
55 processes: 54 sleeping, 1 on cpu
CPU states: 99.0% idle, 0.0% user, 1.0% kernel, 0.0% iowait,
0.0% swap
Kernel: 193 ctxsw, 3 trap, 439 intr, 298 syscall, 3 flt
Memory: 8186M phys mem, 4699M free mem, 2048M total swap, 2048M
free swap
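An idle CPU combined with minutes-long copies usually points at the
disks rather than the host, so it is worth watching per-device
service times while one of the slow copies runs; a simple check with
stock Solaris iostat (5-second interval):

  # iostat -xn 5

If one member of the raidz shows asvc_t or %b values far above its
siblings, that drive is the likely culprit even though it reports no
hard errors.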
I thought I might have run into problems described here on the
forums with the ARC and fragmentation, but it doesn't seem so:
# echo "::arc"|mdb -k
hits = 490044
misses = 37004
demand_data_hits = 282392
demand_data_misses = 2113
demand_metadata_hits = 191757
demand_metadata_misses = 21034
prefetch_data_hits = 851
prefetch_data_misses = 10265
prefetch_metadata_hits = 15044
prefetch_metadata_misses = 3592
mru_hits = 73416
mru_ghost_hits = 16
mfu_hits = 401500
mfu_ghost_hits = 24
deleted = 1555
recycle_miss = 0
mutex_miss = 0
evict_skip = 1487
hash_elements = 37032
hash_elements_max = 37045
hash_collisions = 10094
hash_chains = 4365
hash_chain_max = 4
p = 3576 MB
c = 7154 MB
c_min = 894 MB
c_max = 7154 MB
size = 1797 MB
hdr_size = 8002680
data_size = 1866272256
other_size = 10519712
l2_hits = 0
l2_misses = 0
l2_feeds = 0
l2_rw_clash = 0
l2_read_bytes = 0
l2_write_bytes = 0
l2_writes_sent = 0
l2_writes_done = 0
l2_writes_error = 0
l2_writes_hdr_miss = 0
l2_evict_lock_retry = 0
l2_evict_reading = 0
l2_free_on_write = 0
l2_abort_lowmem = 0
l2_cksum_bad = 0
l2_io_error = 0
l2_size = 0
l2_hdr_size = 0
memory_throttle_count = 0
arc_no_grow = 0
arc_tempreserve = 0 MB
arc_meta_used = 372 MB
arc_meta_limit = 1788 MB
arc_meta_max = 372 MB
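For what it's worth, size = 1797 MB against a target c = 7154 MB
means the ARC is nowhere near its limit, which fits the "no memory
pressure" picture. The same counters can also be read without mdb via
kstat, e.g.:

  # kstat -p zfs:0:arcstats:size
  # kstat -p zfs:0:arcstats:c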
I then started a scrub, and it looks like it will take forever... It
used to finish in a few hours; now it estimates it will be done in
almost 700 hours:
scrub: scrub in progress for 4h43m, 0,68% done, 685h2m to go
Does anyone have any clue as to what is happening, and what I can
do? If a disk is failing without the OS noticing, it would be nice
to find a way to identify which drive it is and get it replaced
before it's too late.
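Two quick checks that can help flag a silently failing member, using
only stock Solaris tools (they may simply come back clean, but they
cost nothing to run):

  # fmadm faulty
  # zpool status -x

fmadm faulty lists anything the fault manager has already diagnosed,
and zpool status -x prints only pools that ZFS itself considers
unhealthy.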
All help is appreciated...
Yours sincerely,
Morten-Christian Bernson
--
This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss