On Thu, May 12, 2016 at 2:36 PM, Stephen Mercier <
stephen.merc...@attainia.com> wrote:

> I'm trying to setup a crush rule, and I was hoping you guys could clarify
> something for me.
>
> I have 4 storage nodes across 2 cabinets. (2x2)
>
> I have the crush hierarchy setup to reflect this layout (as follows):
>
> rack cabinet2 {
>         id -3           # do not change unnecessarily
>         # weight xxxx
>         alg straw
>         hash 0  # rjenkins1
>         item cephstore04 weight xxxx
>         item cephstore02 weight xxxx
> }
> rack cabinet1 {
>         id -3           # do not change unnecessarily
>         # weight xxxx
>         alg straw
>         hash 0  # rjenkins1
>         item cephstore03 weight xxxx
>         item cephstore01 weight xxxx
> }
> root default {
>         id -1           # do not change unnecessarily
>         # weight xxxx
>         alg straw
>         hash 0  # rjenkins1
>         item cabinet2 weight xxxx
>         item cabinet1 weight xxxx
> }
> The default ruleset is as follows: (Big surprise!!)
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 0 type osd
>         step emit
> }
>
> If I want this to ensure that there is at least 1 copy of the data in each
> cabinet, would I just change it to:
>
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 0 type rack
>         step emit
> }
>
> Or should it be:
>
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step chooseleaf firstn 0 type rack
>         step emit
> }
>


If you only want two copies, the chooseleaf variant is correct.

Assuming you want three copies, neither of these is quite right. The use of
"firstn 0" means "take the requested number of replicas in this selection
step", so both of these would be asking for 3 racks, which obviously won't
work when you only have 2.
(The "choose" variant doesn't work either, because you're telling it to
select N racks and then emit those as the object locations! You'd need to
add in a chooseleaf or set of choose calls underneath it.)
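For illustration, a minimal sketch of that "choose racks, then chooseleaf
underneath" structure might look like this (the rule name, ruleset id, and
counts are only placeholders, not taken from your map):

rule per_rack_example {
        ruleset 1                           # hypothetical, unused ruleset id
        type replicated
        min_size 1
        max_size 10
        step take default                   # start at the root bucket
        step choose firstn 0 type rack      # firstn 0 = one rack per requested replica
        step chooseleaf firstn 1 type osd   # then one OSD inside each chosen rack
        step emit
}

With only two racks, though, that still caps you at two placements per object,
which is why the overselection trick below comes into play for 3x pools.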



>
> Or is there something more complicated I should be doing? I took a look at
> https://www.mail-archive.com/ceph-users@lists.ceph.com/msg19140.html and
> it sounds like this is what I want, but I've also seen examples like the
> following:
>
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 2 type rack
>         step chooseleaf firstn 0 type osd
>         step emit
> }
>

So this rule is saying "select 2 racks, and within each selected rack,
choose N leaf nodes". That's also not quite what you'd want.

If all of your components are new enough, you can do the overselection hack:

rule replicated_ruleset {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step choose firstn 2 type rack
        step chooseleaf firstn 2 type osd
        step emit
}

That selects two racks (i.e., both) and then chooses 2 OSDs within each rack.
If you're only asking for three copies, it'll truncate the last OSD off
the list; and because it selects racks in a different order each time,
you'll get a good distribution across racks.
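If you want to sanity-check an edited map before injecting it, the workflow is
roughly the following (file and pool names are only placeholders):

ceph osd getcrushmap -o crushmap.bin          # grab the compiled map
crushtool -d crushmap.bin -o crushmap.txt     # decompile, then edit the rule
crushtool -c crushmap.txt -o crushmap.new     # recompile
crushtool -i crushmap.new --test --rule 0 --num-rep 3 --show-mappings
ceph osd setcrushmap -i crushmap.new          # inject the new map
ceph osd pool set <poolname> crush_ruleset 0  # point each pool at the rule

The crushtool --test line is a dry run that prints where 3 replicas would land,
so you can confirm both racks show up before touching the live cluster.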
-Greg


> As you might have noticed, I'm a little confused, so any assistance is
> greatly appreciated. And just to clarify once more, I want to make sure
> that it stores at least one copy in each rack. Advice on getting more
> granular is welcome as well, however, as there are pools with both 2x and
> 3x replication set up.
>
> Cheers,
> -
> Stephen Mercier | Sr. Systems Architect
> Attainia Capital Planning Solutions (ACPS)
> O: (650)241-0567, 727 | TF: (866)288-2464, 727
> stephen.merc...@attainia.com | www.attainia.com
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com