Very interesting... Well, let's see if we can do the numbers for my setup.
From a previous post of mine:

[i]This is my exact breakdown (cheap disks on a cheap bus :P):

PCI-E 8x 4-port eSATA RAID controller, 4 x eSATA-to-5-port SATA port multipliers (each connected to an eSATA port on the controller), and 20 x Samsung 1TB HDDs (each connected to a port multiplier).

The PCIe 8x slot gives me 4GBps, which is 32Gbps. No problem there. Each eSATA port guarantees 3Gbps, so the controller tops out at 12Gbps. Each PM can deliver up to 3Gbps, shared amongst 5 drives. According to Samsung's site, the max read speed is 250MBps, which translates to 2Gbps. Multiply that by 5 drives and you get 10Gbps, which is 333% of what the PM can carry. So the drives aren't likely to sustain their max read speed for long stretches, especially during a rebuild; the bus is going to be quite a bottleneck.

Let's assume the drives are 80% full. That's 800GB to be read from each drive, which is (800 x 9) 7.2TB. Best case, we can read 7.2TB at 3Gbps = 57.6Tb at 3Gbps = 57600Gb at 3Gbps = 19200 seconds = 320 minutes = 5 hours 20 minutes. Even if it takes twice that, I'm happy.

Initially I had been thinking of 2 PMs for each vdev. Now I'm thinking of splitting each vdev as wide as I can ([2 data disks per PM] x 2, [2 data disks + 1 parity disk per PM] x 2). That gives the best possible speed, but still won't max out the HDDs. I've never actually sat down and done the math before. Hope it's decently accurate :)[/i]

My scenario, following Erik's post:

Scenario: I have 10 x 1TB disks in a raidz2, and I have 128k slab sizes. Thus I have 16k of data for each slab written to each disk (8 x 16k data + 32k parity for a 128k slab). So each IOPS gets to reconstruct 16k of data on the failed drive, and it takes about 1TB / 16k = 62.5e6 IOPS to reconstruct the full 1TB drive.

Let's assume the drives are at 95% capacity, which is a pretty bad scenario. That's 95% of the vdev's 8TB of data space, i.e. 7600GB to read, which is 60800Gb. There will be no other IO while the rebuild is running.

Best case: I read at 12Gbps and write at 3Gbps (4:1), but I read 128K for every 16K I write (8:1), so read bandwidth is the bottleneck. 60800Gb @ 12Gbps is 5066s, which is 84m27s (never gonna happen). A more realistic read rate of 1.5Gbps gives me 40533s, which is 675m33s, which is 11h15m33s. That's a much more believable time to read 7.6TB.
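For anyone who wants to redo the arithmetic for the quoted breakdown, here's the quick Python scratch-pad version. Nothing ZFS-specific in it; the 2Gbps-per-drive, 3Gbps-per-PM, 9-surviving-drive and 800GB-per-drive figures are just the assumptions from the post above.

[code]
# Quick sanity check of the quoted breakdown.
# Assumptions (all from the post above): 5 drives per port multiplier,
# ~2 Gbps (250 MB/s) best case per drive, 3 Gbps per PM / eSATA link,
# 9 surviving drives with 800 GB (80% of 1 TB) used on each.

drive_gbps = 2        # Samsung's quoted 250 MB/s, in gigabits/s
pm_gbps    = 3        # one port multiplier / eSATA link
print(f"5 drives want {5 * drive_gbps} Gbps vs {pm_gbps} Gbps per PM "
      f"({5 * drive_gbps / pm_gbps:.0%} of the link)")   # ~333%

drives_to_read = 9
used_gb        = 800                            # GB to read from each drive
total_gbit     = drives_to_read * used_gb * 8   # 7.2 TB -> 57600 Gb
seconds        = total_gbit / pm_gbps           # best case at 3 Gbps

h, rem = divmod(seconds, 3600)
print(f"{drives_to_read * used_gb / 1000} TB at {pm_gbps} Gbps = "
      f"{seconds:.0f} s (~{h:.0f} h {rem / 60:.0f} m)")  # ~5 h 20 m
[/code]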
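And the same kind of scratch-pad for the raidz2 scenario. Again, the 16k-per-disk slab share, the 7600GB of data to read, and the 12Gbps best-case / 1.5Gbps "realistic" read rates are just the assumptions above, not measurements.

[code]
# Same kind of check for the 10-disk raidz2 scenario above.
# Assumptions: 128k slabs -> 16k per disk, 7600 GB of data to read,
# reads are the bottleneck, 12 Gbps best case / 1.5 Gbps "realistic".

iops_needed = 1e12 / 16e3          # 1 TB / 16k reconstructed per IOPS
print(f"IOPS to rebuild the failed 1 TB drive: ~{iops_needed / 1e6:.1f} million")

data_gbit = 7600 * 8               # 7600 GB -> 60800 Gb to read

for label, gbps in [("best case", 12), ("realistic", 1.5)]:
    seconds = data_gbit / gbps
    h, rem  = divmod(seconds, 3600)
    m, s    = divmod(rem, 60)
    print(f"{label}: {data_gbit} Gb at {gbps} Gbps = {int(seconds)} s "
          f"(~{h:.0f}h {m:.0f}m {s:.0f}s)")
[/code]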