Hi  

On 08.03.2015 04:32, Anthony D'Atri wrote: 

> 1) That's an awful lot of mons. Are they VM's or something? My sense
> is that mons >5 have diminishing returns at best.

We have an application cluster with Ceph as the storage solution. The
cluster consists of six servers, so we installed a monitor on every one
of them to keep the Ceph cluster sane (quorum) if one or two servers go
down. As far as I know there is no limit or recommended number; could
you please point out some issue with a higher number of mons? We are
going to deploy more mons on the new storage nodes, or is it not
necessary/recommended to have a mon on a node with OSDs?
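
If I have the quorum arithmetic right, here is a small illustrative
sketch (plain Python, nothing Ceph-specific) of why we assumed six mons
would be fine, and why, as you say, going past five buys little:

    # Monitor quorum needs a strict majority of the mons.
    for n in (3, 4, 5, 6, 7):
        quorum = n // 2 + 1        # smallest strict majority
        tolerated = n - quorum     # mons that may fail while keeping quorum
        print("%d mons: quorum=%d, can lose %d" % (n, quorum, tolerated))
    # 5 mons and 6 mons both tolerate 2 failures, so the sixth mon adds
    # no extra failure tolerance.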

> 2) Only two OSD nodes? Assume you aren't running 3 copies of data or
> racks.

For now, only 2 copies. We are not running VMs, so we could lose a few
objects without any issue (objects are accessed directly with
librados). Maybe later, in the final state of 6 storage nodes, we will
switch the setting to 3 copies.
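
For context, our access pattern is roughly the following (a minimal
python-rados sketch; the pool and object names are placeholders, not
our real ones):

    import rados

    # Direct object access via librados (python-rados bindings).
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    try:
        ioctx = cluster.open_ioctx('app-data')       # placeholder pool
        ioctx.write_full('obj-0001', b'payload')     # write whole object
        data = ioctx.read('obj-0001')                # read it back
        ioctx.remove_object('obj-0001')              # deleted within days
        ioctx.close()
    finally:
        cluster.shutdown()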

> 3) The new n[...]tion.

Yes, fewer, but the OSDs are weighted according to disk size (we are
using
https://github.com/ceph/ceph/blob/master/src/ceph-osd-prestart.[...]ation),
and the node weight should be the sum of the OSD weights, so there
should be no problem. But regarding the number of OSDs per node: we
have 36, and all the big clusters I could find on the internet have
around 24 OSDs per node. I was not able to find any best practice
regarding the OSD count per node, and we are going to reduce it.
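
If I read that prestart logic correctly, the weight it assigns is
simply the disk size expressed in TiB; here is my reading of that
convention as a sketch (the 4 TB disk size is an assumption for the
example, not our exact hardware):

    def crush_weight_from_bytes(size_bytes):
        # CRUSH weight convention as I understand it: device size in TiB.
        return round(size_bytes / 2**40, 2)

    # Node weight should come out as the sum of its OSD weights.
    disks = [4 * 10**12] * 36                 # assumed 36 x 4 TB disks
    osd_weights = [crush_weight_from_bytes(b) for b in disks]
    print(osd_weights[0], sum(osd_weights))   # ~3.64 per OSD, ~131 per node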

There are two options:

1) Move a few disks to a new node, so that we have 3 nodes with 30 OSDs
each; this will lower the data density per node.

or

2) Migrate the disks to software RAID0, 2 disks per array; from 36 it
would become 18 OSDs with better IO performance per OSD and the same
data density. (RAID under Ceph is not recommended at all, but that
advice is about using it as a redundancy solution, so why not for a
performance use case?) A back-of-the-envelope comparison of the two
layouts is sketched below.
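
To put rough numbers on it (the 4 TB disk size below is an assumption,
purely to make the comparison concrete):

    disk_tb = 4

    # Option 1: keep one disk per OSD, spread over 3 nodes of 30 OSDs each.
    opt1_tb_per_osd  = 1 * disk_tb
    opt1_tb_per_node = 30 * opt1_tb_per_osd          # 120 TB per node

    # Option 2: 2-disk RAID0 per OSD, so 36 disks become 18 OSDs.
    opt2_tb_per_osd  = 2 * disk_tb
    opt2_total_tb    = 18 * opt2_tb_per_osd          # 144 TB
    assert opt2_total_tb == 36 * disk_tb             # same as 36 x 1-disk OSDs

    print(opt1_tb_per_osd, opt1_tb_per_node)         # 4 120
    print(opt2_tb_per_osd, opt2_total_tb)            # 8 144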

> 4) I've had experience tripling the size of a cluster in one shot,
> and in backfilling a whole rack of 100+ OSD's in one shot. Cf. BÖC's
> 'Veteran of the Psychic Wars'. I [...] scrubs / deep-scrubs,
> throttling the usual backfill / recovery values, including setting
> recovery op priority as low as 1 for the duration.

Deep-scrub is disabled (we have a short TTL or sliding window for our
data; most of the objects are going to be deleted within 3 days anyway,
so you can imagine our Ceph use case as a two-day FIFO queue), and the
recovery values are already lowered, because Ceph sometimes throws
discs away (an[...]ing them back), so basically it is recovering most
of the time. This issue with unresponsive OSDs is the reason why we
want to reduce their count. We suspect that we have too many OSDs per
node.
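
For reference, this is roughly how the throttling is applied (wrapped
in Python only for illustration; the option names are the usual ones
from the Firefly/Hammer era, so treat the exact names and values as an
approximation of our config rather than a copy of it):

    import subprocess

    def ceph(*args):
        # Tiny helper around the ceph CLI, just for illustration.
        subprocess.check_call(["ceph"] + list(args))

    # Disable deep scrubbing cluster-wide (data expires in ~3 days anyway).
    ceph("osd", "set", "nodeep-scrub")

    # Throttle backfill/recovery on all OSDs.
    ceph("tell", "osd.*", "injectargs",
         "--osd-max-backfills 1 "
         "--osd-recovery-max-active 1 "
         "--osd-recovery-op-priority 1")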

> Deploy one OSD at a time. Yes this will cause data to move more than
> once. But it will also minimize your exposure to as-of-yet
> undiscovered problems with the new hardware, and the magnitude of
> peering storms. And thus client impact. One OSD on each new system,
> sequentially. Check the weights in the [...]

Thanks for that, so we will start with one new OSD per night.
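
Something like the following per night, then (again only a sketch with
the plain CLI wrapped in Python; the osd.72 id and the 3.64 weight are
placeholder values, not our real ones):

    import subprocess

    def ceph(*args):
        return subprocess.check_output(["ceph"] + list(args)).decode()

    # After deploying the night's single OSD, check that its CRUSH weight
    # and the per-host totals look sane before adding the next one.
    print(ceph("osd", "tree"))   # weights per OSD and per host
    print(ceph("-s"))            # overall health / recovery progress

    # If the auto-assigned weight looks wrong it can be adjusted by hand,
    # e.g.:
    # ceph("osd", "crush", "reweight", "osd.72", "3.64")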