Hi,

I have a pool of 22 1 TB SATA disks in a RAIDZ3 configuration. It is filled with files with an average size of 2 MB. I filled it randomly to resemble the expected workload in production use. Problems arise when I try to scrub or resilver this pool: the operation takes the better part of a week (!). During this time the disk being resilvered is at 100% utilisation with >300 writes/s, but only 3 MB/s, which is only about 3% of its best-case performance. A one-week window with degraded redundancy is intolerable. It is quite likely that more disks will fail during this period, eventually leading to a total loss of the pool, not to mention the degraded performance while the resilver runs. In fact, in previous tests I lost a pool in a 6x11 RAIDZ2 configuration.
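To put the figures above in perspective, here is a rough back-of-the-envelope sketch. It assumes decimal units (1 TB = 10**12 bytes), a best-case sequential rate of about 100 MB/s for a single SATA disk (implied by the "3%" estimate, not measured), and that the whole disk has to be rewritten. All names are made up for illustration.

```python
# Back-of-the-envelope figures derived from the numbers quoted above.
# Assumptions (not measurements): decimal units, ~100 MB/s best-case
# sequential throughput, and a completely full 1 TB disk to rewrite.

DISK_BYTES = 10**12            # 1 TB disk
WRITES_PER_SEC = 300           # observed write rate during resilver
OBSERVED_BPS = 3 * 10**6       # observed throughput, 3 MB/s
ASSUMED_SEQ_BPS = 100 * 10**6  # assumed best-case sequential throughput


def avg_write_size(bytes_per_sec, writes_per_sec):
    """Average number of bytes carried by one write request."""
    return bytes_per_sec / writes_per_sec


def hours_to_write(total_bytes, bytes_per_sec):
    """Hours needed to write total_bytes at a constant rate."""
    return total_bytes / bytes_per_sec / 3600


avg_kb = avg_write_size(OBSERVED_BPS, WRITES_PER_SEC) / 1000
slow_hours = hours_to_write(DISK_BYTES, OBSERVED_BPS)
fast_hours = hours_to_write(DISK_BYTES, ASSUMED_SEQ_BPS)
fraction = OBSERVED_BPS / ASSUMED_SEQ_BPS

print(f"average write size : {avg_kb:.1f} kB")
print(f"full disk at 3 MB/s: {slow_hours:.1f} h ({slow_hours / 24:.1f} days)")
print(f"full disk at best  : {fast_hours:.1f} h")
print(f"fraction of best   : {fraction:.0%}")
```

Under these assumptions each write carries only about 10 kB, so the disk is bound by the number of I/O operations rather than by bandwidth, and a full disk needs days instead of hours.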
I skimmed through the resilver code and found that it simply enumerates all objects in the pool and checks them one by one, with up to maxinflight I/O requests in parallel. Because this does not take the on-disk order of the data into account, it leads to this pathological performance. I also found Bug 6678033, which states that a prefetch might fix this.

Now my questions:

1) Are there tunings that could speed up resilver, possibly with a negative effect on normal performance? I thought of raising recordsize to the expected file size of 2 MB. Could this help?
2) What is the state of the fix? When will it be ready?
3) Do you have any configuration hints for setting up a pool layout which might help resilver performance (aside from using hardware RAID instead of RAIDZ)?

Thanks for any hints.

sensille
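P.S. A toy illustration of the ordering problem described above, not the actual ZFS code: it compares the total head travel when blocks are visited in logical (object) order with the travel when the same blocks are visited sorted by disk offset. The block count, the disk size in sectors, and the uniformly random placement are all assumptions made for the sketch.

```python
import random


def head_travel(offsets):
    """Total distance the head moves when visiting offsets in the given order."""
    return sum(abs(b - a) for a, b in zip(offsets, offsets[1:]))


rng = random.Random(0)        # fixed seed so the run is repeatable
N_BLOCKS = 10_000             # hypothetical number of blocks to resilver
DISK_SECTORS = 2_000_000_000  # roughly 1 TB in 512-byte sectors

# The pool was filled randomly, so the order in which objects are enumerated
# is modelled here as unrelated to where their blocks sit on disk.
logical_order = [rng.randrange(DISK_SECTORS) for _ in range(N_BLOCKS)]

# The same blocks, visited in ascending disk-offset order.
disk_order = sorted(logical_order)

ratio = head_travel(logical_order) / head_travel(disk_order)
print(f"logical order moves the head about {ratio:.0f}x further than disk order")
```

In this model the sorted pass sweeps the disk once, while the logical-order pass jumps across a large part of the disk for almost every block, which matches a disk that is busy with seeks while moving very little data.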