On Wed, May 8, 2019 at 9:31 AM Karl Denninger <k...@denninger.net> wrote:
> I have a system here with about the same amount of net storage on it as
> you did. It runs scrubs regularly; none of them take more than 8 hours
> on *any* of the pools. The SSD-based pool is of course *much* faster
> but even the many-way RaidZ2 on spinning rust is an ~8 hour deal; it
> kicks off automatically at 2:00 AM when the time comes but is complete
> before noon. I run them on 14 day intervals.
>

Damn, I wish our scrubs took 8 hours. :)

Storage pool 1: 90 drives in 6-disk raidz2 vdevs (mix of 2 TB and 4 TB
SATA). 45 hours to scrub.

Storage pool 2: 90 drives in 6-disk raidz2 vdevs (mix of 2 TB and 4 TB
SATA). 33 hours to scrub.

Storage pool 3: 24 drives in 6-disk raidz2 vdevs (mix of 2 TB and 4 TB
SATA). 134 hours to scrub.

Storage pool 4: 24 drives in 6-disk raidz2 vdevs (mix of 1 TB, 2 TB,
4 TB SATA). Dedupe enabled. 256 hours to scrub.

Storage pool 5: 90 drives in 6-disk raidz2 vdevs (mix of 2 TB and 4 TB
SATA). Dedupe enabled. Takes about 6 weeks to resilver a drive, and
it's constantly resilvering drives these days as it's the oldest pool,
and all the drives are dying. :D

Pools 1, 3, and 4 are in DC1. Pools 2 and 5 are in DC2 across town.
Pool 1 sends snapshots to pool 2. Pools 3 and 4 send snapshots to
pool 5.

These pools are highly fragmented. :)

> If you have pool(s) that are taking *two weeks* to run a scrub IMHO
> either something is badly wrong or you need to rethink organization of
> the pool structure -- that is, IMHO you likely either have a severe
> performance problem with one or more members or an architectural problem
> you *really* need to determine and fix. If a scrub takes two weeks
> *then a resilver could conceivably take that long as well* and that's
> *extremely* bad as the window for getting screwed is at its worst when a
> resilver is being run.
>

Thankfully, ours are strictly storage for backups of other systems, so
as long as the nightly backups complete successfully before 6 am, we're
not worried about performance.
:)

And we do have plans to replace pools 2 and 5 to remove dedupe from the
equation. There's not a lot we can do about the fragmentation issue, as
these servers all run rsync backups from 200-odd other servers, and
remove the oldest snapshot every night.

So, while a 2-week scrub may be horrible, it all depends on the
use-case. If these were direct storage systems for in-production
servers, then I'd be worried. But as redundant backup systems (3 copies
of everything, in 3 separate locations around the city), I'm not too
worried. Yet. :D

-- 
Freddie Cash
fjwc...@gmail.com
_______________________________________________
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"