So, is it fair to say that, compared to the 'firstn' mode, the 'indep' mode may have the least impact on a cluster in the event of an OSD failure? Could I use 'indep' for a replicated pool as well?
Thank you!

Regards,
Cody

On Wed, Aug 22, 2018 at 7:12 PM Gregory Farnum <gfar...@redhat.com> wrote:
>
> On Wed, Aug 22, 2018 at 12:56 AM Konstantin Shalygin <k0...@k0ste.ru> wrote:
>>
>> > Hi everyone,
>> >
>> > I read an earlier thread [1] that made a good explanation of the 'step
>> > choose|chooseleaf' option. Could someone further help me to understand
>> > the 'firstn|indep' part? Also, what is the relationship between 'step
>> > take' and 'step choose|chooseleaf' when it comes to defining a failure
>> > domain?
>> >
>> > Thank you very much.
>>
>> This is documented in CRUSH Map Rules [1]
>>
>> [1] http://docs.ceph.com/docs/master/rados/operations/crush-map-edits/#crush-map-rules
>
> But that doesn't seem to really discuss it, and I don't see it elsewhere in
> our docs either. So:
>
> "indep" and "firstn" are two different strategies for selecting items
> (mostly, OSDs) in a CRUSH hierarchy. If you're storing EC data you want to
> use indep; if you're storing replicated data you want to use firstn.
>
> The reason has to do with how they behave when a previously-selected device
> fails. Let's say you have a PG stored on OSDs 1, 2, 3, 4, 5. Then 3 goes down.
> With the "firstn" mode, CRUSH simply adjusts its calculation so that it
> selects 1 and 2, then selects 3 but discovers it's down, so it retries and
> selects 4 and 5, and then goes on to select a new OSD 6. So the final CRUSH
> mapping change is:
>
> 1, 2, 3, 4, 5 -> 1, 2, 4, 5, 6
>
> But if you're storing an EC pool, that means you just changed the data mapped
> to OSDs 4, 5, and 6! That's terrible! So the "indep" mode attempts not to do
> that. (It still *might* conflict, but the odds are much lower.)
You can
> instead expect it, when it selects the failed 3, to try again and pick out 6,
> for a final transformation of:
>
> 1, 2, 3, 4, 5 -> 1, 2, 6, 4, 5
>
> -Greg
>
>>
>> k
>>
>> _______________________________________________
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
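For intuition, Greg's two mappings can be sketched as a toy model. This is NOT how CRUSH actually works (real placement uses pseudo-random hashing over the hierarchy, and these function names and the fixed candidate order are purely illustrative); it only reproduces the positional difference between the two modes:

```python
# Toy sketch of "firstn" vs "indep" re-selection after a failure.
# Assumption: a fixed, ordered candidate list stands in for CRUSH's
# pseudo-random draws, so we can see how each mode fills the gap.

def firstn_select(candidates, n, down):
    """firstn: take the first n live OSDs in order.
    A failure makes every later survivor shift left one slot."""
    return [osd for osd in candidates if osd not in down][:n]

def indep_select(previous, candidates, down):
    """indep: keep each surviving OSD in its original slot and
    replace only the failed ones with fresh candidates."""
    unused = [osd for osd in candidates
              if osd not in previous and osd not in down]
    result = []
    for osd in previous:
        if osd in down:
            result.append(unused.pop(0))  # fill the hole in place
        else:
            result.append(osd)            # position unchanged
    return result

candidates = [1, 2, 3, 4, 5, 6]
acting = firstn_select(candidates, 5, down=set())       # [1, 2, 3, 4, 5]

print(firstn_select(candidates, 5, down={3}))   # [1, 2, 4, 5, 6]: slots 3-5 all change
print(indep_select(acting, candidates, {3}))    # [1, 2, 6, 4, 5]: only slot 3 changes
```

For a replicated pool the shards are interchangeable, so the firstn shift is harmless; for an EC pool each slot holds a distinct shard, which is why indep's in-place replacement avoids needlessly moving data on OSDs 4 and 5.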