On 29 April, 2010 - Roy Sigurd Karlsbakk sent me these 10K bytes:

> I got this hint from Richard Elling, but haven't had time to test it much. 
> Perhaps someone else could help? 
> 
> roy 
> 
> > Interesting. If you'd like to experiment, you can change the limit of the 
> > number of scrub I/Os queued to each vdev. The default is 10, but that 
> > is too close to the normal limit. You can see the current scrub limit via: 
> > 
> > # echo zfs_scrub_limit/D | mdb -k 
> > zfs_scrub_limit: 
> > zfs_scrub_limit:10 
> > 
> > you can change it with: 
> > # echo zfs_scrub_limit/W0t2 | mdb -kw 
> > zfs_scrub_limit:0xa = 0x2 
> > 
> > # echo zfs_scrub_limit/D | mdb -k 
> > zfs_scrub_limit: 
> > zfs_scrub_limit:2 
> > 
> > In theory, this should help your scenario, but I do not believe this has 
> > been exhaustively tested in the lab. Hopefully, it will help. 
> > -- richard 
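
(Side note, untested here: if you want a lower zfs_scrub_limit to stick
across reboots, the usual route would presumably be an /etc/system entry
rather than a live mdb poke, assuming zfs_scrub_limit is an ordinary zfs
module tunable:

  set zfs:zfs_scrub_limit = 2

followed by a reboot, so the value is already in place when the vdevs get
set up.)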

If I'm reading the code right, it's only used when "creating" a new vdev
(import, zpool create, maybe at boot), so I took an alternate route:

http://pastebin.com/hcYtQcJH

(spa_scrub_maxinflight used to be 0x46 (70 decimal), since 7 devices *
zfs_scrub_limit(10) = 70.)
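
(In case the pastebin goes away: the change amounts to poking
spa_scrub_maxinflight directly with mdb on the running kernel, in the same
style as Richard's example above. The value below is only an illustration;
the actual number we used is in the paste:

  # echo spa_scrub_maxinflight/D | mdb -k
  spa_scrub_maxinflight:
  spa_scrub_maxinflight:  70
  # echo spa_scrub_maxinflight/W0t14 | mdb -kw
  spa_scrub_maxinflight:  0x46    =       0xe

i.e. 7 devices * 2 instead of 7 * 10.)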

With these lower numbers, our pool is much more responsive over NFS.

 scrub: scrub in progress for 0h40m, 0.10% done, 697h29m to go

It might take a while, though. We've been taking periodic snapshots and
still have snapshots from 2008, which has probably fragmented the pool
beyond sanity.

> ----- "Bruno Sousa" <bso...@epinfante.com> skrev: 
> 
> 
> Indeed, the scrub seems to take too many resources from a live system. 
> For instance, I have a server with 24 disks (SATA 1TB) serving as an NFS store 
> for a Linux machine holding user mailboxes. I have around 200 users, with maybe 
> 30-40% of them active at the same time. 
> As soon as the scrub process kicks in, the Linux box starts to give messages like 
> "nfs server not available" and the users start to complain that Outlook 
> gives "connection timeout". As soon as the scrub process stops, 
> everything returns to normal. 
> So for me it's a real issue that the scrub takes so many of the system's 
> resources, making it pretty much unusable. In my case I did a workaround: 
> basically I zfs send/receive from this server to another server, 
> and the scrub process now runs on the second server. 
> I don't know if this is such a good idea, given that I don't know for 
> sure whether the scrub on the secondary machine will be useful in case of 
> data corruption... but so far so good, and it's probably better than nothing. 
> I still remember, before ZFS, that any good RAID controller would have a 
> background consistency-check task, and that such a task could be assigned a 
> priority like "low, medium, high"... going back to ZFS, what's the 
> possibility of getting this feature as well? 
> 
> 
> Just out of curiosity, do the Sun OpenStorage appliances, or Nexenta-based 
> ones, have any scrub task enabled by default? I would like to get some 
> feedback from users who run ZFS appliances regarding the impact of running a 
> scrub on their appliances. 
> 
> 
> Bruno 
> 
> On 28-4-2010 22:39, David Dyer-Bennet wrote: 
> 
> > On Wed, April 28, 2010 10:16, Eric D. Mudama wrote: 
> > 
> > > On Wed, Apr 28 at  1:34, Tonmaus wrote: 
> > > 
> > > > > Zfs scrub needs to access all written data on all disks and is 
> > > > > usually disk-seek or disk I/O bound, so it is difficult to keep it 
> > > > > from hogging the disk resources.  A pool based on mirror devices 
> > > > > will behave much more nicely while being scrubbed than one based 
> > > > > on RAIDz2. 
> > > > 
> > > > Experience seconded entirely. I'd like to repeat that I think we 
> > > > need more efficient load-balancing functions in order to keep the 
> > > > housekeeping payload manageable. Detrimental side effects of scrub 
> > > > should not be a decision point for choosing certain hardware or 
> > > > redundancy concepts, in my opinion. 
> > > 
> > > While there may be some possible optimizations, I'm sure everyone 
> > > would love the random performance of mirror vdevs, combined with the 
> > > redundancy of raidz3 and the space of a raidz1.  However, as in all 
> > > systems, there are tradeoffs. 
> > 
> > The situations being mentioned are much worse than what seem reasonable 
> > tradeoffs to me.  Maybe that's because my intuition is misleading me about 
> > what's available.  But if the normal workload of a system uses 25% of its 
> > sustained IOPS, and a scrub is run at "low priority", I'd like to think 
> > that during a scrub I'd see a little degradation in performance, and that 
> > the scrub would take 25% or so longer than it would on an idle system. 
> > There's presumably some inefficiency, so the two loads don't just add 
> > perfectly; maybe another 5% lost to that?  That's the big uncertainty. 
> > I have a hard time believing in 20% lost to that. 
> > 
> > Do you think that's a reasonable outcome to hope for?  Do you think ZFS is 
> > close to meeting it? 
> > 
> > People with systems that live at 75% all day are obviously going to have 
> > more problems than people who live at 25%! 



/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
