On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
> We have a similar issue, but we wanted three copies across two racks. It 
> turns out that we increased size to 4 and left min_size at 2. We didn't want 
> to risk having fewer than two copies, and if we only had three copies, losing 
> a rack would block I/O. Once we expand to a third rack, we will adjust our rule 
> and go to size 3. Searching the mailing list and docs proved difficult, so 
> I'll include my rule so that you can use it as a basis. You should be able to 
> just change rack to host and host to osd. If you want to keep only three 
> copies, the "extra" OSD chosen just won't be used as Gregory mentions. 
> Technically this rule should have "max_size 4", but I won't set a pool over 4 
> copies so I didn't change it here.
> 
> If anyone has a better way of writing this rule (or one that would work for 
> both a two rack and 3+ rack configuration as mentioned above), I'd be open to 
> it. This is the first rule that I've really written on my own.
> 
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 2 type rack
>         step chooseleaf firstn 2 type host
>         step emit
> }
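
(For anyone who finds this thread later: I believe the size/min_size settings 
Robert describes are just the usual pool-level settings, something along the 
lines of the following, with "rbd" only standing in for whatever pool you 
actually use. Please double-check against the docs for your release.)

        ceph osd pool set rbd size 4
        ceph osd pool set rbd min_size 2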

Thank you Robert. Your example was very helpful. I didn't realize you could 
nest the choose and chooseleaf steps together. I thought chooseleaf effectively 
handled that for you already. This makes a bit more sense now.

My rule looks like this now:
rule host_rule {
        ruleset 2
        type replicated
        min_size 2
        max_size 3
        step take default
        step choose firstn 2 type host
        step chooseleaf firstn 2 type osd
        step emit
}
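
For anyone following along at home, a rule like this can be injected with the 
usual decompile/edit/recompile cycle, something like the following (the file 
names here are just placeholders):

        ceph osd getcrushmap -o crushmap.bin
        crushtool -d crushmap.bin -o crushmap.txt
        # add the rule above to crushmap.txt, then recompile and inject it
        crushtool -c crushmap.txt -o crushmap.new
        ceph osd setcrushmap -i crushmap.new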

And the cluster is finally reporting the pool as clean. If I understand 
correctly, CRUSH will now hand back up to 4 OSDs for each object (2 on each 
host), and the pool's size setting decides how many of those actually hold a 
replica.
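
If I have it right, crushtool can confirm what the rule hands back (the rule 
number and replica counts below are taken from the rule above; the map file name 
is just whatever you compiled):

        crushtool -i crushmap.new --test --rule 2 --num-rep 3 --show-mappings
        crushtool -i crushmap.new --test --rule 2 --num-rep 4 --show-mappings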

> On Mon, Apr 20, 2015 at 11:50 AM, Gregory Farnum <g...@gregs42.com> wrote:

>     It's actually pretty hacky: you configure your CRUSH rule to return
>     two OSDs from each host, but set your size to 3. You'll want to test
>     this carefully with your installed version to make sure that works,
>     though — older CRUSH implementations would crash if you did that. :(
> 
>     In slightly more detail, you'll need to change it so that instead of
>     using "chooseleaf" you "choose" 2 hosts, and then choose or chooseleaf
>     2 OSDs from each of those hosts. If you search the list archives for
>     CRUSH threads you'll find some other discussions about doing precisely
>     this, and I think the CRUSH documentation should cover the more
>     general bits of how the language works.
>     -Greg

Thank you Greg, I had trouble searching for discussions related to this. The 
Google was not being friendly, or I wasn't issuing a good query. Getting a 
better handle on choose vs. chooseleaf, and on using multiple choose/chooseleaf 
steps in a rule, will send me back to the docs for the remainder of my day.
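
If I have followed Greg correctly, wiring this up should just be a matter of 
pointing the pool at the ruleset and keeping size at 3, i.e. something like the 
following (the pool name is a placeholder, and the setting may be named 
differently on other releases):

        ceph osd pool set <pool> crush_ruleset 2
        ceph osd pool set <pool> size 3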

Thanks,

Colin



