On 04/20/2015 11:02 AM, Robert LeBlanc wrote:
> We have a similar issue, but we wanted three copies across two racks.
> It turns out that we increased size to 4 and left min_size at 2. We
> didn't want to risk having fewer than two copies, and if we only had
> three copies, losing a rack would block I/O. Once we expand to a third
> rack, we will adjust our rule and go to size 3. Searching the mailing
> list and docs proved difficult, so I'll include my rule so that you
> can use it as a basis. You should be able to just change rack to host
> and host to osd. If you want to keep only three copies, the "extra"
> OSD chosen just won't be used, as Gregory mentions. Technically this
> rule should have "max_size 4", but I won't set a pool over 4 copies,
> so I didn't change it here.
>
> If anyone has a better way of writing this rule (or one that would
> work for both a two-rack and a 3+ rack configuration as mentioned
> above), I'd be open to it. This is the first rule that I've really
> written on my own.
>
> rule replicated_ruleset {
>         ruleset 0
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 2 type rack
>         step chooseleaf firstn 2 type host
>         step emit
> }
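For what it's worth, a rule like this can be dry-run offline with
crushtool before it ever touches a live cluster. A rough sketch of that
workflow, with the file names as placeholders and the rule/replica
numbers matching Robert's example:

    ceph osd getcrushmap -o crushmap.bin        # grab the compiled map
    crushtool -d crushmap.bin -o crushmap.txt   # decompile to editable text
    # ... add or edit the rule in crushmap.txt ...
    crushtool -c crushmap.txt -o crushmap.new   # recompile
    crushtool -i crushmap.new --test --rule 0 --num-rep 4 --show-mappings

The last command prints the OSDs the rule would return for 4 replicas,
so the rack/host spread can be checked before loading the map with
ceph osd setcrushmap.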
Thank you Robert. Your example was very helpful. I didn't realize you
could nest the choose and chooseleaf steps together; I thought
chooseleaf effectively handled that for you already. This makes a bit
more sense now. My rule looks like this:

    rule host_rule {
            ruleset 2
            type replicated
            min_size 2
            max_size 3
            step take default
            step choose firstn 2 type host
            step chooseleaf firstn 2 type osd
            step emit
    }

And the cluster is finally reporting the pool as clean. If I understand
correctly, the rule will now select as many as 4 OSDs per object (2 on
each host), and with the pool size at 3 the extra OSD simply goes
unused, as Robert and Greg describe.

On Mon, Apr 20, 2015 at 11:50 AM, Gregory Farnum <g...@gregs42.com> wrote:
> It's actually pretty hacky: you configure your CRUSH rule to return
> two OSDs from each host, but set your size to 3. You'll want to test
> this carefully with your installed version to make sure that works,
> though — older CRUSH implementations would crash if you did that. :(
>
> In slightly more detail, you'll need to change it so that instead of
> using "chooseleaf" you "choose" 2 hosts, and then choose or chooseleaf
> 2 OSDs from each of those hosts. If you search the list archives for
> CRUSH threads you'll find some other discussions about doing precisely
> this, and I think the CRUSH documentation should cover the more
> general bits of how the language works.
> -Greg

Thank you Greg. I had trouble searching for discussions related to
this; Google was not being friendly, or I wasn't issuing a good query.
Getting my head around choose vs. chooseleaf and using multiple
choose/chooseleaf steps in a rule will send me back to the docs for the
remainder of my day.

Thanks,
Colin
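P.S. In case it saves the next reader a search: once the edited map has
been loaded with ceph osd setcrushmap, pointing an existing pool at the
rule and setting the replica counts is just a few commands. A sketch,
with the pool name as a placeholder and the ruleset number matching my
rule above (size 3 per Greg's suggestion, min_size 2 as Robert
described):

    ceph osd pool set <poolname> crush_ruleset 2
    ceph osd pool set <poolname> size 3
    ceph osd pool set <poolname> min_size 2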