On Mon, Mar 21, 2011 at 3:45 PM, Roy Sigurd Karlsbakk <r...@karlsbakk.net> wrote:
> > Our main backup storage server has 3x 8-drive raidz2 vdevs. We were
> > replacing the 500 GB drives in one vdev with 1 TB drives. The last 2
> > drives took just under 300 hours each. :( The first couple of drives
> > took approx. 150 hours each, and then it just started taking longer
> > and longer for each drive.
>
> That's strange indeed. I just replaced 21 drives (three raidz2 VDEVs
> of seven 2TB drives each) with 3TB ones, and resilver times were
> quite stable, until the last replace, which was a bit faster. Have
> you checked 'iostat -en'? If one (or more) of the drives is having
> I/O errors, that may slow down the whole pool.
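For anyone wanting to rule that out quickly, something like the
following should print only the devices that have accumulated errors.
This is a rough sketch: it assumes the usual 'iostat -en' layout of
two header lines followed by s/w, h/w, trn, tot and device columns,
so the total error count is field 4.

  # skip the two header lines, keep rows with a non-zero total count
  iostat -en | awk 'NR > 2 && $4 > 0'
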
We have production servers with 9 vdevs (mirrored) doing `zfs send`
daily to backup servers with 7 vdevs (each a 3-disk raidz1). Some
backup servers that receive datasets with lots of small files
(email/web) keep getting worse resilver times.

# zpool status
  pool: backup
 state: DEGRADED
status: One or more devices has been removed by the administrator.
        Sufficient replicas exist for the pool to continue functioning
        in a degraded state.
action: Online the device using 'zpool online' or replace the device
        with 'zpool replace'.
 scrub: resilver in progress for 646h13m, 100.00% done, 0h0m to go
config:

        NAME           STATE     READ WRITE CKSUM
        backup         DEGRADED     0     0     0
          raidz1-0     ONLINE       0     0     0
            c4t2d0     ONLINE       0     0     0
            c4t3d0     ONLINE       0     0     0
            c4t4d0     ONLINE       0     0     0
          raidz1-1     ONLINE       0     0     0
            c4t5d0     ONLINE       0     0     0
            c4t6d0     ONLINE       0     0     0
            c4t7d0     ONLINE       0     0     0
          raidz1-2     DEGRADED     0     0     0
            c4t8d0     ONLINE       0     0     0
            spare-1    DEGRADED     0     0  216M
              c4t9d0   REMOVED      0     0     0
              c4t1d0   ONLINE       0     0     0  874G resilvered
            c4t10d0    ONLINE       0     0     0
          raidz1-3     ONLINE       0     0     0
            c4t11d0    ONLINE       0     0     0
            c4t12d0    ONLINE       0     0     0
            c4t13d0    ONLINE       0     0     0
          raidz1-4     ONLINE       0     0     0
            c4t14d0    ONLINE       0     0     0
            c4t15d0    ONLINE       0     0     0
            c4t16d0    ONLINE       0     0     0
          raidz1-5     ONLINE       0     0     0
            c4t17d0    ONLINE       0     0     0
            c4t18d0    ONLINE       0     0     0
            c4t19d0    ONLINE       0     0     0
          raidz1-6     ONLINE       0     0     0
            c4t20d0    ONLINE       0     0     0
            c4t21d0    ONLINE       0     0     0
            c4t22d0    ONLINE       0     0     0
        spares
          c4t1d0       INUSE     currently in use

# zpool list backup
NAME     SIZE   USED  AVAIL    CAP  HEALTH    ALTROOT
backup  19.0T  18.7T   315G    98%  DEGRADED  -

Even though the pool is at 98% capacity, that is usually not a problem
when the production server is sending datasets that hold VM images
(large files, mostly sequential I/O). Here we seem to be clearly
maxing out the IOPS of the disks in the raidz1-2 vdev. It seems
logical to go back to mirrors for this kind of workload (lots of
small files, nothing sequential).

What I cannot explain is why c4t1d0 is doing lots of reads beyond the
expected ones. It seems to be holding back the resilver, while I
would expect only c4t8d0 and c4t10d0 to be reading. I do not
understand the ZFS internals that make this happen. Can anyone
explain it? The server is doing nothing but the resilver (not even
receiving new zfs sends).

By the way, since this is OpenSolaris 2009.06, there is a nasty bug:
if I enable fmd, it records billions of checksum errors until the
disk is full, so I have had to keep it disabled while the resilver is
running.
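For anyone hitting the same bug, the workaround is roughly the
following sketch (check the exact FMRI on your build with 'svcs fmd'
first):

  # temporarily stop fmd (-t means the change does not persist
  # across a reboot)
  svcadm disable -t svc:/system/fmd:default

  # once the resilver completes, bring fault management back
  svcadm enable svc:/system/fmd:default
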
Here is what the three disks in raidz1-2 look like during the
resilver:

# iostat -xn 1 | egrep '(c4t(8|10|1)d0|r/s)'
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   35.2   14.9  907.9  135.8  0.0  0.4    0.1    8.6   1  12 c4t1d0
   44.7    4.0  997.6   78.3  0.0  0.3    0.1    5.8   1  10 c4t8d0
   44.8    4.0  997.6   78.3  0.0  0.3    0.1    5.8   1  10 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   98.6   46.9 2628.2   52.7  0.0  1.3    0.2    8.6   2  39 c4t1d0
  146.5    0.0 2739.2    0.0  0.0  0.8    0.1    5.1   2  25 c4t8d0
  144.5    0.0 2805.9    0.0  0.0  0.7    0.1    5.1   2  26 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  108.6   45.7 2809.1   50.7  0.0  1.1    0.1    6.9   2  35 c4t1d0
  146.2    0.0 2624.2    0.0  0.0  0.3    0.1    2.3   1  18 c4t8d0
  149.2    0.0 2737.0    0.0  0.0  0.3    0.1    2.3   1  16 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  113.0   23.0 3226.9   28.0  0.0  1.2    0.1    8.9   2  40 c4t1d0
  159.0    0.0 3286.9    0.0  0.0  0.6    0.1    3.9   2  24 c4t8d0
  176.0    0.0 3545.9    0.0  0.0  0.5    0.1    3.0   2  26 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  147.4   34.4 3888.9   52.1  0.0  1.5    0.2    8.3   3  43 c4t1d0
  181.7    0.0 3515.1    0.0  0.0  0.6    0.1    3.1   2  24 c4t8d0
  193.5    0.0 3489.9    0.0  0.0  0.6    0.2    3.3   4  22 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  151.2   33.9 3792.7   42.7  0.0  1.5    0.1    7.9   1  36 c4t1d0
  197.5    0.0 3856.9    0.0  0.0  0.4    0.1    2.3   2  19 c4t8d0
  164.6    0.0 3928.1    0.0  0.0  0.7    0.1    4.2   1  24 c4t10d0
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
  171.0   90.0 4426.3  121.5  0.0  1.3    0.1    4.9   3  51 c4t1d0
  184.0    0.0 4426.8    0.0  0.0  0.7    0.1    4.0   2  30 c4t8d0
  195.0    0.0 4430.3    0.0  0.0  0.7    0.1    3.7   2  32 c4t10d0
^C

Anyone else with over 600 hours of resilver time? :-)

Thank you,

Giovanni Tirloni (gtirl...@sysdroid.com)
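P.S. If anyone wants to compare numbers, here is a rough way to
average the r/s column per device over a finite capture. An untested
sketch: it assumes the 'iostat -xn' column layout above, with r/s in
the first field and the device name in the last.

  # average r/s per disk over 30 one-second samples
  # (the first sample is the since-boot average; close enough here)
  iostat -xn 1 30 | awk '
      $NF ~ /^c4t(1|8|10)d0$/ { n[$NF]++; r[$NF] += $1 }
      END { for (d in n) printf "%-8s avg r/s: %6.1f\n", d, r[d]/n[d] }'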