We're looking to deploy Ceph on about 8 Dell servers to start, each of
which typically contains 6 to 8 hard disks behind a PERC RAID controller
that supports write-back cache (usually ~512 MB). Most machines have
between 32 and 128 GB of RAM. Our questions are as follows. Please feel
free to comment on even just one of the questions below if that's your
area of expertise/interest.


   1. Various "best practice" guides suggest putting the OS on a
   separate disk. But we thought that would not be good, because we'd
   sacrifice a whole disk on each machine (~3 TB), or even two whole
   disks (~6 TB) if we did a hardware RAID 1 for it. So, do people
   normally just sacrifice one whole disk? Specifically, we came up with
   this idea (see the mdadm sketch below the list):
      1. We set up all hard disks as "pass-through" in the RAID
      controller, so that the controller's write-back cache is still in
      effect but the OS sees just a bunch of disks (6 to 8 in our case).
      2. We then create a software RAID 1 (using CentOS 6.4) for the OS
      across all 6 to 8 disks.
      3. We then create a software RAID 0 (using CentOS 6.4) for the
      swap space.
      4. Does anyone see any flaws in the idea above? We think that
      RAID 1 is not computationally expensive for the machines to
      compute, and most of the time the OS should be in RAM. Similarly,
      we think RAID 0 should be easy for the CPU to compute, and
      hopefully we won't hit much swap if we have enough RAM. And this
      way, we don't sacrifice 1 or 2 whole disks for just the OS.
   2. Based on Mark Nelson's performance benchmark blog post
   (http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/),
   has anything substantially changed since then? Specifically, it
   suggests that SSD journals may not really be necessary if one has
   RAID controllers with write-back cache. Is this still true even
   though the article was written against a version of Ceph that is now
   over a year old? (Mark suggests that things may change with newer
   versions of Ceph.)
   3. Based on our understanding, it would seem that Ceph can deliver
   very high throughput (especially for reads) if dozens and dozens of
   hard disks are being accessed simultaneously across multiple
   machines. So we could get several GB/s of aggregate throughput,
   right? (Ceph never advertises the read-throughput advantage of its
   distributed architecture, so I'm wondering if I'm missing something.)
   If so, is it reasonable to assume that one common bottleneck is the
   Ethernet? So if we only use one NIC at 1 Gb/s, that will be a major
   bottleneck? If so, we're thinking of trying to "bond" multiple
   1 Gb/s Ethernet cards into a "bonded" connection of 4 Gb/s
   (4 x 1 Gb/s). But we didn't see anyone discuss this strategy. Are
   there any holes in it? Or does Ceph "automatically" take advantage of
   multiple NICs without us having to deal with the complexity (and the
   expense of buying a new switch that supports bonding)? That is, is it
   possible, and a good idea, to have Ceph OSDs set up to use specific
   NICs, so that we spread the load? (We read through the recommendation
   of having different NICs for front-end traffic vs. back-end traffic,
   but we're not worried about network attacks -- so we're thinking that
   just creating one big, fat Ethernet pipe gives us the most
   flexibility. See the rough numbers and bonding sketch below the
   list.)
   4. I'm a little confused: does Ceph support incremental snapshots of
   either VMs or CephFS? I saw this statement in the release notes for
   the "dumpling" release
   (http://ceph.com/docs/master/release-notes/#v0-67-dumpling): "The MDS
   now disallows snapshots by default as they are not considered stable.
   The command ‘ceph mds set allow_snaps’ will enable them." So, should
   I assume that we can't do incremental file-system snapshots in a
   stable fashion until further notice?
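
For question 1, here is roughly the layout we have in mind. This is
only a sketch: the device names (/dev/sda through /dev/sdh), partition
sizes, and md device numbers are placeholder assumptions.

# Assume 8 pass-through disks /dev/sda..sdh, each already partitioned:
#   partition 1 (~30 GB)  -> OS mirror
#   partition 2 (~8 GB)   -> swap stripe
#   partition 3 (rest)    -> one Ceph OSD per disk

# OS: software RAID 1 across the small first partitions
mdadm --create /dev/md0 --level=1 --raid-devices=8 /dev/sd[a-h]1

# Swap: software RAID 0 across the second partitions
mdadm --create /dev/md1 --level=0 --raid-devices=8 /dev/sd[a-h]2
mkswap /dev/md1
swapon /dev/md1

# The remaining partitions (/dev/sd[a-h]3) would each be formatted and
# handed to a ceph-osd daemon as usual.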
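
For question 3, here is the back-of-the-envelope math and the kind of
bonding setup we're considering. The disk and NIC numbers are rough
estimates, and the interface names, bond mode, addresses, and subnets
are placeholder assumptions.

# ~7 data disks per node at ~100 MB/s each is ~700 MB/s of raw disk
# bandwidth per node, while one 1 Gb/s NIC tops out around 125 MB/s,
# so a single GbE link looks like the bottleneck. Across 8 nodes the
# disks alone could in theory supply several GB/s.

# CentOS 6 bonding sketch (802.3ad/LACP requires switch support):
cat > /etc/sysconfig/network-scripts/ifcfg-bond0 <<'EOF'
DEVICE=bond0
IPADDR=192.168.10.11
NETMASK=255.255.255.0
BOOTPROTO=none
ONBOOT=yes
BONDING_OPTS="mode=802.3ad miimon=100 xmit_hash_policy=layer3+4"
EOF
# (Each slave ifcfg-ethX would also need MASTER=bond0 and SLAVE=yes.)

# From what we've read, Ceph doesn't stripe a daemon's traffic across
# NICs by itself; what it does offer is splitting client ("public")
# traffic from replication ("cluster") traffic in ceph.conf, e.g.:
#   [global]
#   public network  = 192.168.10.0/24
#   cluster network = 192.168.20.0/24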

-Sidharta
-- 
*Gautam Saxena *
President & CEO
Integrated Analysis Inc.

Making Sense of Data.™
Biomarker Discovery Software | Bioinformatics Services | Data Warehouse
Consulting | Data Migration Consulting
www.i-a-inc.com  <http://www.i-a-inc.com/>
gsax...@i-a-inc.com
(301) 760-3077  office
(240) 479-4272  direct
(301) 560-3463  fax