So, while we are working on resolving this issue with Sun, let me approach this from another perspective: what kind of controller/drive ratio would be the minimum recommended to support a functional OpenSolaris-based archival solution? Given the following:
- the vast majority of IO to the system is going to be "read" oriented, other than the initial "load" of the archive shares and possibly scrubs/resilvering in the case of failed drives

- we currently have one LSISAS3801E with two external ports; each port connects to one 23-disk JBOD

- each JBOD can take two external SAS connections if we enable the "split-backplane" option, which splits the disk IO path between the two connectors (12 disks on one connector, 11 on the other); we do not currently have this enabled

- our current server platform has only 1 x PCIe x8 slot available; we *could* look at changing this in the future, but I'd prefer to find a one-card solution if possible

Here is the math I did that shows the current IO situation (PLEASE correct this if I am mistaken, as I am somewhat "winging" it here and my head hurts):

Based on info from:

http://storageadvisors.adaptec.com/2006/07/26/sas-drive-performance/
http://en.wikipedia.org/wiki/PCI_Express
http://support.wdc.com/product/kb.asp?modelno=WD1002FBYS&x=9&y=8

WD1002FBYS 1TB SATA2 7200rpm drive specs:
  Avg seek time      = 8.9 ms
  Avg latency        = 4.2 ms
  Max transfer speed = 112 MB/s
  Avg transfer speed ~= 65 MB/s

"Random" IO scenario (theoretical numbers):
  8.9 ms avg seek time + 4.2 ms avg latency = 13.1 ms avg access time
  1/0.0131 = 76 IOPS/drive
  22 (23 - 1 spare) drives x 76 IOPS/drive = 1672 IOPS/shelf
  1672 IOPS/shelf x 2 = 3344 IOPS/controller
    -or-
  22 (23 - 1 spare) drives x 65 MB/s/drive = 1430 MB/s/shelf
  1430 MB/s/shelf x 2 = 2860 MB/s/controller

Pure "streamed read" IO scenario (theoretical numbers):
  0.0 ms avg seek time + 4.2 ms avg latency = 4.2 ms avg access time
  1/0.0042 = 238 IOPS/drive
  22 (23 - 1 spare) drives x 238 IOPS/drive = 5236 IOPS/shelf
  5236 IOPS/shelf x 2 = 10472 IOPS/controller
    -or-
  22 (23 - 1 spare) drives x 112 MB/s/drive = 2464 MB/s/shelf
  2464 MB/s/shelf x 2 = 4928 MB/s/controller

Max. bandwidth of a single SAS PHY = 270 MB/s (300 MB/s - overhead)
The LSISAS3801E has 2 x 4-lane (x4 wide) external SAS connectors.
Each shelf gets one 4-lane connection, so:
  Max controller bandwidth/shelf = 4 x 270 MB/s = 1080 MB/s
  Max controller bandwidth = 2 x 1080 MB/s = 2160 MB/s

Max. bandwidth of PCIe x8 interface = 2 GB/s
Typical sustained bandwidth of PCIe x8 interface (max - 5% overhead) = 1.9 GB/s

Summary: The current controller cannot handle the max IO load of even the random IO scenario (1430 MB/s per shelf needed; the controller can only handle 1080 MB/s per shelf). Also, the PCIe bus can't push more than 1.9 GB/s sustained over a single slot, so we are limited by the single card.

Solution: Connecting 2 x 4-lane SAS connectors to one shelf (i.e. enabling split mode) would get us 2160 MB/s per shelf. This would remove the controller as a bottleneck for all but the extreme cached-read scenario, but the PCIe bus would still throttle us to 1.9 GB/s per slot. So the controller could keep up with the shelves, and the PCIe bus would have to wait sometimes, which may (?) be a "healthier" situation than overwhelming the controller. To support two shelves per controller in split mode, we could use an LSISAS31601E (4 x 4-lane SAS connectors), but we would hit the PCIe bus limitation again. Moving to two (or more?) separate PCIe x8 cards would be best, but that would require us to alter our server platform.

I've also put the arithmetic into a couple of quick scripts below, in case anyone wants to sanity-check it or plug in different numbers.

Whew. Thoughts? Comments? Suggestions?
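First, a small Python sketch of the per-drive / per-shelf math above. All the figures are just the WD1002FBYS specs quoted earlier; the shelf totals come out a few IOPS higher than mine only because I rounded the per-drive number before multiplying:

# Back-of-envelope drive/shelf math from the post, as a re-runnable script.
# Drive figures are the WD1002FBYS specs quoted above; swap in your own.

SEEK_MS = 8.9          # avg seek time
LATENCY_MS = 4.2       # avg rotational latency
MAX_XFER_MBS = 112.0   # max (outer-track) transfer speed
AVG_XFER_MBS = 65.0    # avg transfer speed

DATA_DRIVES = 23 - 1   # 23-disk shelf, 1 hot spare
SHELVES = 2            # shelves per controller

def iops(seek_ms, latency_ms):
    """IOPS for one drive, given access time in milliseconds."""
    return 1000.0 / (seek_ms + latency_ms)

# "Random" scenario: full seek + rotational latency on every IO.
r = iops(SEEK_MS, LATENCY_MS)   # ~76 IOPS/drive
print("random: %3.0f IOPS/drive, %4.0f IOPS/shelf, %5.0f IOPS/controller"
      % (r, r * DATA_DRIVES, r * DATA_DRIVES * SHELVES))
print("random: %4.0f MB/s/shelf, %4.0f MB/s/controller"
      % (AVG_XFER_MBS * DATA_DRIVES, AVG_XFER_MBS * DATA_DRIVES * SHELVES))

# Pure streamed read: no seek, rotational latency only.
s = iops(0.0, LATENCY_MS)       # ~238 IOPS/drive
print("stream: %3.0f IOPS/drive, %4.0f IOPS/shelf, %5.0f IOPS/controller"
      % (s, s * DATA_DRIVES, s * DATA_DRIVES * SHELVES))
print("stream: %4.0f MB/s/shelf, %4.0f MB/s/controller"
      % (MAX_XFER_MBS * DATA_DRIVES, MAX_XFER_MBS * DATA_DRIVES * SHELVES))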
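And a second sketch for the link-level ceilings, reusing the same assumptions as above (270 MB/s usable per 3 Gb/s SAS PHY, x4 wide ports, ~1.9 GB/s sustained on one PCIe x8 slot). The 2-connectors-per-shelf case across two shelves is the 4-connector LSISAS31601E scenario:

# SAS-link and PCIe ceilings vs. the shelf demand, same numbers as above.

PHY_MBS = 270.0            # usable per 3Gb/s SAS PHY (300 MB/s - overhead)
LANES = 4                  # each external connector is an x4 wide port
PCIE_X8_MBS = 1900.0       # sustained PCIe x8 bandwidth per slot
SHELF_RANDOM_MBS = 1430.0  # 22 drives x 65 MB/s (from above)
SHELF_STREAM_MBS = 2464.0  # 22 drives x 112 MB/s (from above)

# 1 connector/shelf = current LSISAS3801E wiring;
# 2 connectors/shelf x 2 shelves needs 4 connectors, i.e. an LSISAS31601E.
for connectors in (1, 2):
    sas_per_shelf = connectors * LANES * PHY_MBS
    card_ceiling = min(sas_per_shelf * 2, PCIE_X8_MBS)
    print("%d conn/shelf: SAS %4.0f MB/s/shelf "
          "(random needs %.0f, stream %.0f); card ceiling %4.0f MB/s"
          % (connectors, sas_per_shelf,
             SHELF_RANDOM_MBS, SHELF_STREAM_MBS, card_ceiling))

With one connector per shelf the SAS link (1080 MB/s) is the bottleneck even for the random case; with two, the per-shelf link keeps up and the min() shifts the choke point to the 1.9 GB/s PCIe slot, which matches the summary above.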