Your MON nodes are separate hardware from the OSD nodes, right?  If so,
with replication=2, you should be able to shut down one of the two OSD
nodes, and everything will continue working.  Since it's for
experimentation, I wouldn't deal with the extra hassle of replication=4 and
custom CRUSH rules to make it work.  If you have your heart set on that, it
should be possible.  I'm no CRUSH expert though, so I can't say for certain
until I've actually done it.
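
For what it's worth, the property you'd want from a size=4 rule (pick both hosts, then two OSDs on each) can be sanity-checked with a toy model.  This is only a sketch: it is not Ceph's CRUSH algorithm, and the host names, OSD ids and the `place`/`survivors` helpers are made up for illustration.  It just checks that losing either host still leaves min_size=2 copies.

```python
# Toy model only: NOT Ceph's CRUSH algorithm, just the placement invariant
# that a "both hosts, two OSDs on each" rule would be aiming for.
import random

# Hypothetical layout matching the cluster described: 2 hosts, 4 OSDs each.
HOSTS = {"node-a": [0, 1, 2, 3], "node-b": [4, 5, 6, 7]}

SIZE, MIN_SIZE = 4, 2


def place(pg_id, copies_per_host=2):
    """Pick copies_per_host distinct OSDs on every host for one PG."""
    rng = random.Random(pg_id)  # deterministic per PG, standing in for a hash
    return {host: rng.sample(osds, copies_per_host)
            for host, osds in HOSTS.items()}


def survivors(placement, down_host):
    """OSDs still holding a replica after one whole host goes down."""
    return [osd
            for host, osds in placement.items() if host != down_host
            for osd in osds]


for pg in range(128):
    p = place(pg)
    # Every PG gets SIZE replicas in total ...
    assert sum(len(osds) for osds in p.values()) == SIZE
    # ... and losing either host still leaves at least MIN_SIZE of them.
    for down in HOSTS:
        assert len(survivors(p, down)) >= MIN_SIZE
```

If the real CRUSH rule gives you that same two-per-host split, one node going down should leave the pool at min_size and still serving r/w.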

I'm a bit confused about why your performance is horrible, though.  I'm assuming
your HDDs are 7200 RPM.  With the SSD journals and replication=3, you won't
have a ton of IO, but you shouldn't have any problem doing > 100 MB/s with
4 MB blocks.  Unless your SSDs are very low quality, the HDDs should be
your bottleneck.
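
To put rough numbers on that, here's a back-of-the-envelope sketch.  The 100 MB/s per-HDD figure is my assumption for sequential writes on a 7200 RPM drive, not something measured on your cluster, and the function names are just for illustration.  It ignores network, CPU and controller overhead, so treat the result as a ceiling.

```python
def client_write_mb_s(num_osds, hdd_mb_s, replication):
    """Ceiling on aggregate client write throughput when the HDDs are the limit.

    Every client byte is written `replication` times across the OSDs.
    """
    return num_osds * hdd_mb_s / replication


def ssd_journal_load_mb_s(osds_per_ssd, hdd_mb_s):
    """Journal traffic one SSD must absorb if every OSD behind it runs flat out."""
    return osds_per_ssd * hdd_mb_s


# Assumption: ~100 MB/s sequential per HDD (a guess, not a measurement).
HDD_MB_S = 100

# 8 OSDs at replication=3, one SSD journal per two OSDs (from the original post).
print(client_write_mb_s(num_osds=8, hdd_mb_s=HDD_MB_S, replication=3))
print(ssd_journal_load_mb_s(osds_per_ssd=2, hdd_mb_s=HDD_MB_S))
```

Under those assumptions the HDDs alone leave plenty of room above 100 MB/s, and each SSD only has to keep up with about two HDDs' worth of writes.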

On Fri, Aug 8, 2014 at 10:24 PM, John Morris <> wrote:

> Our experimental Ceph cluster is performing terribly (with the operator to
> blame!), and while it's down to address some issues, I'm curious to hear
> advice about the following ideas.
>
> The cluster:
> - two disk nodes (6 * CPU, 16GB RAM each)
> - 8 OSDs (4 each)
> - 3 monitors
> - 10Gb front + back networks
> - 2TB Enterprise SATA drives
> - HP RAID controller w/battery-backed cache
> - one SSD journal drive for each two OSDs
>
> First, I'd like to play with taking one machine down, but with the other
> node continuing to serve the cluster.  To maintain redundancy in this
> scenario, I'm thinking of setting the pool size to 4 and the min_size to 2,
> with the idea that a proper CRUSH map should always keep two copies on each
> disk node.  Again, *this is for experimentation* and probably raises red
> flags for production, but I'm just asking if it's *possible*:  Could one
> node go down and the other node continue to serve r/w data?  Any anecdotes
> of performance differences between size=4 and size=3 in other clusters?
>
> Second, does it make any sense to divide the CRUSH map into an extra level
> for the SSD disks, which each hold journals for two OSDs?  This might
> increase redundancy in case of a journal disk failure, but ISTR something
> about too few OSDs in a bucket causing problems with the CRUSH algorithm.
>
> Thanks-
>         John
> _______________________________________________
> ceph-users mailing list