On Fri, Nov 23, 2018 at 11:01 AM ST Wong (ITSC) <s...@itsc.cuhk.edu.hk> wrote:
> Hi all,
>
> We have 8 osd hosts, 4 in room 1 and 4 in room 2.
>
> A pool with size = 3 is created using the following crush rule, to cater
> for room failure:
>
> rule multiroom {
>         id 0
>         type replicated
>         min_size 2
>         max_size 4
>         step take default
>         step choose firstn 2 type room
>         step chooseleaf firstn 2 type host
>         step emit
> }
>
> We're expecting:
>
> 1. For each object, there are always 2 replicas in one room and 1 replica
> in the other room, making size=3. But we can't control which room has 1
> or 2 replicas.

Right.

> 2. In case an osd host fails, ceph will assign remaining osds to the same
> PG to hold the replicas that were on the failed host. Selection is based
> on the crush rule of the pool, thus maintaining the same failure domain -
> it won't put all replicas in the same room.

Yes, if a host fails the copies it held will be replaced by new copies in
the same room.

> 3. In case the entire room holding 1 replica fails, the pool will remain
> degraded but won't do any replica relocation.

Right.

> 4. In case the entire room holding 2 replicas fails, ceph will make use of
> osds in the surviving room to recreate the 2 replicas. The pool will not
> be writeable before all objects have 2 copies again (unless we make the
> pool size=4?). Then when recovery is complete, the pool will remain in a
> degraded state until the failed room recovers.

Hmm, I'm actually not sure if this will work out. Because CRUSH is
hierarchical, it will keep trying to select hosts from the dead room and
will fill out the first two slots of the location vector with -1. It could
be that Ceph will skip all those "nonexistent" entries and just pick the
two copies from slots 3 and 4, but it might not. You should test this
carefully and report back!
-Greg

> Is our understanding correct? Thanks a lot.
> Will do some simulation later to verify.
>
> Regards,
> /stwong
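One way to check this offline before the live simulation is to run the rule
through crushtool and see what it maps once one room's OSDs are down. A
rough sketch, assuming the compiled map is saved as crushmap.bin and that
osd.0 through osd.3 stand in for the OSDs in the room you want to "fail"
(substitute your real IDs):

    # Export the cluster's compiled CRUSH map (filename is arbitrary)
    ceph osd getcrushmap -o crushmap.bin

    # Show which OSDs rule 0 would map PGs to with 3 replicas
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings

    # Simulate a dead room by zero-weighting its OSDs in the test run,
    # then look for mappings that come back with fewer than 3 OSDs
    crushtool -i crushmap.bin --test --rule 0 --num-rep 3 \
        --weight 0 0 --weight 1 0 --weight 2 0 --weight 3 0 \
        --show-mappings --show-bad-mappings

--show-bad-mappings will flag any input that CRUSH could not map to the
full 3 OSDs, which is exactly the question in point 4.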