Hi Christian,

On 18 April 2014 12:28, Christian Balzer <ch...@gol.com> wrote:
>
> On Fri, 18 Apr 2014 11:34:15 +1000 Blair Bethwaite wrote:
> > So the PERC 710p, whilst not having the native JBOD mode of the
> > underlying LSI 2208 chipset, does allow per-virtual-disk cache and
> > read-ahead mode settings. It also does support "Cut-Through IO" (CTIO),
> > apparently enabled when the virtual-disk is set to no read-ahead and
> > write-through caching. So my draft plan is that for our hardware we'll
> > have 12x single-RAID0 virtual-disks, the 3 ssds will be set for CTIO.
> >
> Ah, I've seen similar stuff with LSI 2108s, but not the CTIO bit.
> What tends to be annoying about these single drive RAID0 virtual disks is
> that the real drive is shielded from the OS. And with a cluster of your
> size SMART data can and will be immensely helpful.
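For what it's worth, the knob-twiddling I have in mind for the SSD virtual
disks looks roughly like the below. Only a sketch -- the MegaCli syntax is
from memory and the VD target IDs are made up, so treat the exact flags as
an assumption and check them against the docs (Dell's OpenManage equivalent
should do the same job):

=====
# Assuming the three SSDs ended up as single-drive RAID0 VDs with
# target IDs 9, 10 and 11 on adapter 0 -- substitute the real IDs.
# No read-ahead + write-through is what is supposed to trigger CTIO.
MegaCli64 -LDSetProp NORA -L9,10,11 -a0    # disable read-ahead
MegaCli64 -LDSetProp WT -L9,10,11 -a0      # write-through caching
MegaCli64 -LDSetProp Direct -L9,10,11 -a0  # direct IO (likely the default)

# Sanity-check the resulting cache policies:
MegaCli64 -LDInfo -LAll -a0
=====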
On the SMART shielding front: yes, I think earlier PERCs were bad for that,
but at least the 700 and 800 lines advertise SMART support and it works.
The smartctl man page says:

=====
megaraid,N - [Linux only] the device consists of one or more SCSI/SAS
disks connected to a MegaRAID controller. The non-negative integer N
(in the range of 0 to 127 inclusive) denotes which disk on the
controller is monitored. Use syntax such as:
  smartctl -a -d megaraid,2 /dev/sda
  smartctl -a -d megaraid,0 /dev/sdb
This interface will also work for Dell PERC controllers. The following
/dev/XXX entry must exist:
  For PERC2/3/4 controllers: /dev/megadev0
  For PERC5/6 controllers: /dev/megaraid_sas_ioctl_node
=====

On our current set of R720XDs (no SSDs in these) I find that
"smartctl -d megaraid,0 -a /dev/sda" through
"smartctl -d megaraid,11 -a /dev/sda" give me details of the 12x front-bay
data drives (NB: the actual device node doesn't seem to matter so long as
it exists), whilst "smartctl -d megaraid,12 -a /dev/sda" and
"smartctl -d megaraid,13 -a /dev/sda" are the internal drives (which we
have in a RAID1 for the OS etc). A quick wrapper loop over the whole lot
is sketched at the end of this mail.

> > Current use case is RBD volumes for working data and we're looking at
> > integrating a cold-storage option for long-term durability of those, so
> > our replication is mainly about availability. I assume 3x replication is
> > more relevant for radosgw? There was an interesting discussion a while
> > back about calculating data-loss probabilities under certain conditions
> > but it didn't seem to have a definitive end...
>
> You're probably thinking about the thread called
> "Failure probability with largish deployments" that I started last year.
>
> You might want to revisit that thread, the reliability modeling software
> by Inktank was coming up with decent enough numbers for both RAID6 and a
> replication factor of 3.
> And as Kyle said in the last post to the thread, it could do with some
> improvements in that modeling, as it doesn't consider the number of disks
> and assumes full-speed recovery with Ceph.
>
> Either way, a replication of 2 is more akin to RAID5, and once your cluster
> becomes half full, 2TB would have to be replicated in case of a disk
> failure before it is safe again. And my experience tells me that another
> disk failure in that recovery window is just a question of time.

Murphy's Law. I guess this is why the Gluster/RHS folks suggest host-RAID
as well. EC (erasure coding) really seems to be the way to go to avoid all
this, but then the question becomes "how big does my cache-tier need to be?".

> Heck, the CERN folks went for 4x replication for really valuable data.
>
> For cold or lukewarm storage consider RAID6-backed OSDs, no SSD
> journals, 2x replication.
> Slow to write to (IOPS-wise), but much denser and cheaper than a 3x
> replicated OSD. And if you have a few of those, still impressive reads. ^o^

We are considering something that dumps RBDs (tagged for durability) out
to tape. Would likely do this asynchronously to start with, perhaps
triggered on "detach". The icing on the cake would be a way to then
transparently zero the RBD but recall it from tape if needed.

--
Cheers,
~Blairo
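P.S. The wrapper loop mentioned above -- a minimal sketch that just assumes
the megaraid,0-13 layout and the /dev/sda device node from our R720XDs, so
adjust the range and node to suit:

=====
#!/bin/bash
# Dump SMART data for every drive hiding behind the PERC/MegaRAID
# passthrough. On our R720XDs IDs 0-11 are the front-bay data drives
# and 12-13 the internal OS drives; change the range to match yours.
DEV=/dev/sda    # any existing block device node will do for the ioctl

for id in $(seq 0 13); do
    echo "### megaraid,${id} ###"
    smartctl -a -d megaraid,${id} ${DEV}
done
=====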