On 05/05/17 21:32, Alejandro Comisario wrote:
> Thanks David!
> Anyone? More thoughts?
>
> On Wed, May 3, 2017 at 3:38 PM, David Turner <drakonst...@gmail.com> wrote:
>
> Those are both things that people have done, and both work. Neither is
> optimal, but both options work fine. The best option is definitely to just
> get a third node now, as you aren't going to be getting any additional
> usable space from it later: your usable space with a 2-node size 2 cluster
> and a 3-node size 3 cluster is identical.
>
> If getting a third node is not possible, I would recommend a size 2,
> min_size 2 configuration. You will block writes if either of your nodes or
> any copy of your data is down, but you will not get into the inconsistent
> state that can happen with a min_size of 1 (and you can always set the
> min_size of a pool to 1 on the fly to perform maintenance). If you go with
> the option of using a failure domain of OSD instead of host and have size
> 3, then a single node going down will block writes to your cluster. The
> only thing you gain from this is having 3 physical copies of the data
> until you get a third node, at the cost of a lot of backfilling when you
> change the crush rule.
>
> A more complex option, which I think would be a better solution than
> either of your two options, would be to create 2 hosts in your crush map
> for each physical host and split each host's OSDs evenly between them.
> That way a given node can hold 2 copies of the data, but never all 3. You
> get your 3 copies of the data and a guarantee that not all 3 are on the
> same host. Assuming a min_size of 2, you will still block writes if you
> restart either node.

Smart idea. Or, if you have the space, size 4 min_size 2, and then you can
still lose a node. You might think that's more space, but in a way it
isn't, if you count the free space reserved for recovery. If one of the
size 3 pool's nodes dies, the other has to recover to 2 copies, and then
it'll use the same space as the size 4 pool. If the size 4 pool loses a
node, it won't be able to recover further; it'll stay at 2 copies, which is
what your size 3 pool would have been after recovery. So it's like it's
pre-recovered. But you probably get a bit more write latency in this setup.
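For anyone wanting to try the split-host layout described above, the
commands involved look roughly like this; the bucket names, OSD ids and the
1.0 weights are only placeholders, so substitute your own names and each
OSD's real crush weight:

    # create two logical "host" buckets for each physical host and attach
    # them under the default root
    ceph osd crush add-bucket node1-a host
    ceph osd crush add-bucket node1-b host
    ceph osd crush move node1-a root=default
    ceph osd crush move node1-b root=default
    # spread the physical host's OSDs evenly across the two logical buckets
    ceph osd crush set osd.0 1.0 root=default host=node1-a
    ceph osd crush set osd.1 1.0 root=default host=node1-b
    # ...repeat for the remaining OSDs and for the second physical host,
    # then keep the pool at size 3 (or 4) with min_size 2

Expect a lot of backfill while the OSDs move into their new buckets, and
note that the OSDs may move back to their real hostname bucket on restart
unless you also set an explicit crush location for them (or disable the
automatic crush update on start).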
> If modifying the hosts in your crush map doesn't sound daunting, then I
> would recommend going that route... For most people that is more complex
> than they'd like to go, and I would say size 2, min_size 2 would be the
> way to go until you get a third node. #my2cents
>
> On Wed, May 3, 2017 at 12:41 PM Maximiliano Venesio <mass...@nubeliu.com> wrote:
>
> Hi guys.
>
> I have a Jewel cluster composed of two storage servers, which are
> configured in the crush map as different buckets to store data.
>
> I have to configure two new pools on this cluster, with the certainty
> that I'll have to add more servers in the short term.
>
> Taking into account that the recommended replication size for every pool
> is 3, I'm thinking of two possible scenarios:
>
> 1) Set the replica size to 2 now, and in the future change the replica
> size to 3 on a running pool.
> Is that possible? Could I have serious issues with the rebalance of the
> PGs when changing the pool size on the fly?
>
> 2) Set the replica size to 3 and change the ruleset to replicate by OSD
> instead of by host now, and in the future change this rule in the ruleset
> back to replicating by host on a running pool.
> Is that possible? Could I have serious issues with the rebalance of the
> PGs when changing the ruleset of a running pool?
>
> Which do you think is the best option?
>
> Thanks in advance.
>
> Maximiliano Venesio
> Chief Cloud Architect | NUBELIU
> E-mail: massimo@nubeliu.com - Cell: +54 9 11 3770 1853
> www.nubeliu.com
>
> --
> Alejandro Comisario
> CTO | NUBELIU
> E-mail: alejan...@nubeliu.com - Cell: +54 9 11 3770 1857
> www.nubeliu.com

--
--------------------------------------------
Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300
Fax: +49 4152 889 333
E-mail: peter.malo...@brockmann-consult.de
Internet: http://www.brockmann-consult.de
--------------------------------------------
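For reference, the two operations Maximiliano asks about are both ordinary
online commands, though each one triggers a rebalance. A rough sketch, with
the pool name and rule id as placeholders, and noting that Jewel calls the
pool setting crush_ruleset while later releases renamed it to crush_rule:

    # raise the replica count of an existing pool (kicks off backfill)
    ceph osd pool set mypool size 3
    ceph osd pool set mypool min_size 2
    # point an existing pool at a different crush rule (also moves data)
    ceph osd pool set mypool crush_ruleset 1

An OSD-level failure domain is just a rule whose chooseleaf step uses
"type osd" instead of "type host" in the decompiled crush map, along these
lines:

    rule replicated_by_osd {
            ruleset 1
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type osd
            step emit
    }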