On 29 April, 2010 - Roy Sigurd Karlsbakk sent me these 10K bytes:

> I got this hint from Richard Elling, but haven't had time to test it much.
> Perhaps someone else could help?
>
> roy
>
> > Interesting. If you'd like to experiment, you can change the limit on the
> > number of scrub I/Os queued to each vdev. The default is 10, but that
> > is too close to the normal limit. You can see the current scrub limit via:
> >
> > # echo zfs_scrub_limit/D | mdb -k
> > zfs_scrub_limit:
> > zfs_scrub_limit:10
> >
> > and you can change it with:
> >
> > # echo zfs_scrub_limit/W0t2 | mdb -kw
> > zfs_scrub_limit:0xa = 0x2
> >
> > # echo zfs_scrub_limit/D | mdb -k
> > zfs_scrub_limit:
> > zfs_scrub_limit:2
> >
> > In theory, this should help your scenario, but I do not believe this has
> > been exhaustively tested in the lab. Hopefully, it will help.
> >  -- richard
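If a lower limit does turn out to help, the usual way to make a change like
this survive a reboot would be an /etc/system entry rather than a live mdb
poke. The lines below are only a sketch, and the assumption that
zfs_scrub_limit is honored from /etc/system like the other zfs module
tunables is mine, not something I've verified:

* Untested assumption: zfs_scrub_limit can be set from /etc/system like
* other zfs:* tunables and is picked up when the vdevs are created at
* boot/import time.
set zfs:zfs_scrub_limit = 2

On a pool that is already imported, though, this knob alone doesn't seem to
do much: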
If I'm reading the code right, it's only used when "creating" a new vdev
(import, zpool create, maybe at boot), so I took an alternate route:
http://pastebin.com/hcYtQcJH (a sketch of that kind of mdb tweak is at the
end of this message, in case the paste expires).

(spa_scrub_maxinflight used to be 0x46 (70 decimal), because 7 devices *
zfs_scrub_limit(10) = 70.)

With these lower numbers, our pool is much more responsive over NFS:

  scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go

It might take a while, though. We've taken periodic snapshots and still have
snapshots from 2008, which has probably fragmented the pool beyond sanity.

> ----- "Bruno Sousa" <bso...@epinfante.com> wrote:
>
> > Indeed the scrub seems to take too many resources from a live system.
> > For instance, I have a server with 24 disks (SATA, 1 TB) serving as an
> > NFS store for a Linux machine holding user mailboxes. I have around 200
> > users, with maybe 30-40% of them active at the same time.
> > As soon as the scrub process kicks in, the Linux box starts to give
> > messages like "nfs server not available" and the users start to complain
> > that Outlook gives "connection timeout". As soon as the scrub process
> > stops, everything returns to normal.
> > So for me it's a real issue that the scrub takes so many of the system's
> > resources, making it pretty much unusable. In my case I did a workaround:
> > I zfs send/receive from this server to another server, and the scrub
> > process now runs on the second server. I don't know if this is such a
> > good idea, given that I don't know for sure whether a scrub on the
> > secondary machine will be useful in case of data corruption... but so
> > far so good, and it's probably better than nothing.
> > I still remember, from before ZFS, that any good RAID controller had a
> > background consistency-check task whose priority could be set to
> > something like "low, medium, high". Going back to ZFS: what are the
> > chances of getting this feature as well?
> >
> > Just out of curiosity: do the Sun OpenStorage appliances, or
> > Nexenta-based ones, have any scrub task enabled by default? I would like
> > some feedback from users who run ZFS appliances about the impact of
> > running a scrub on them.
> >
> > Bruno
> >
> > On 28-4-2010 22:39, David Dyer-Bennet wrote:
> > > On Wed, April 28, 2010 10:16, Eric D. Mudama wrote:
> > > > On Wed, Apr 28 at 1:34, Tonmaus wrote:
> > > > > > Zfs scrub needs to access all written data on all disks and is
> > > > > > usually disk-seek or disk I/O bound, so it is difficult to keep
> > > > > > it from hogging the disk resources. A pool based on mirror
> > > > > > devices will behave much more nicely while being scrubbed than
> > > > > > one based on RAIDz2.
> > > > >
> > > > > Experience seconded entirely. I'd like to repeat that I think we
> > > > > need more efficient load-balancing functions in order to keep the
> > > > > housekeeping payload manageable. Detrimental side effects of scrub
> > > > > should not be a decision point for choosing certain hardware or
> > > > > redundancy concepts, in my opinion.
> > > >
> > > > While there may be some possible optimizations, I'm sure everyone
> > > > would love the random performance of mirror vdevs, combined with the
> > > > redundancy of raidz3 and the space of a raidz1. However, as in all
> > > > systems, there are tradeoffs.
> > >
> > > The situations being mentioned are much worse than what seem like
> > > reasonable tradeoffs to me. Maybe that's because my intuition is
> > > misleading me about what's available.
> > >
> > > But if the normal workload of a system uses 25% of its sustained
> > > IOPS, and a scrub is run at "low priority", I'd like to think that
> > > during a scrub I'd see a little degradation in performance, and that
> > > the scrub would take 25% or so longer than it would on an idle system.
> > > There's presumably some inefficiency, so the two loads don't just add
> > > perfectly; maybe another 5% lost to that? That's the big uncertainty.
> > > I have a hard time believing in 20% lost to that.
> > >
> > > Do you think that's a reasonable outcome to hope for? Do you think ZFS
> > > is close to meeting it?
> > >
> > > People with systems that live at 75% all day are obviously going to
> > > have more problems than people who live at 25%!

/Tomas
--
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
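PS: in case the pastebin above ever expires, this is roughly the kind of
tweak I mean: a sketch with illustrative numbers only (0t14 is decimal 14,
i.e. about 2 in-flight scrub I/Os per leaf vdev across our 7 devices). The
variable name comes from the value quoted above and the mdb syntax mirrors
Richard's zfs_scrub_limit example; double-check both on your own build
before poking a production kernel.

#!/bin/sh
# Sketch: lower the global cap on in-flight scrub I/Os on a live system.
# Assumes spa_scrub_maxinflight exists as a kernel variable on this build;
# 0t14 (decimal 14) is only an example value, pick something sensible for
# your own vdev count.
echo spa_scrub_maxinflight/D | mdb -k        # show the current value (ours was 70)
echo spa_scrub_maxinflight/W0t14 | mdb -kw   # write the new, lower value
echo spa_scrub_maxinflight/D | mdb -k        # confirm the change took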