On Nov 18, 2014 4:48 PM, "Gregory Farnum" <g...@gregs42.com> wrote:
>
> On Tue, Nov 18, 2014 at 3:38 PM, Robert LeBlanc <rob...@leblancnet.us> wrote:
> > I was going to submit this as a bug, but thought I would put it here for
> > discussion first. I have a feeling that it could be behavior by design.
> >
> > ceph version 0.87 (c51c8f9d80fa4e0168aa52685b8de40e42758578)
> >
> > I'm using a cache pool and was playing around with the size and min_size
> > on the pool to see the effects of replication. I set size/min_size to 1,
> > then I ran "ceph osd pool set ssd size 3; ceph osd pool set ssd min_size
> > 2". Client I/O immediately blocked because there were not yet 2 copies
> > (as expected). However, after the degraded objects were cleaned up,
> > several PGs remained in the remapped+incomplete state and client I/O
> > continued to be blocked even though all OSDs were up and healthy (even
> > when left overnight). If I set min_size back down to 1, the cluster
> > recovers and client I/O continues.
> >
> > I expected that as long as there is one copy of the data, the cluster
> > could replicate that data until min_size is met and cluster operations
> > would resume.
> >
> > Where I think it could be by design: if min_size was already set to 2
> > and you lose enough OSDs fast enough to dip below that level, there is
> > a chance that the surviving OSD has bad data (though we wouldn't know
> > that at the moment anyway). The bad data could then be replicated and
> > the ability to recover any good data would be lost.
> >
> > However, if Ceph immediately replicated the sole OSD's copy to get back
> > to min_size, then when the other OSD(s) came back online it could
> > backfill and just discard the extras.
> >
> > Immediately replicating to keep the cluster operational seems like a
> > good thing overall. Am I missing something?
>
> This is sort of by design, but mostly an accident of many other
> architecture choices.
> Sam is actually working now to enable PG recovery when you have fewer
> than min_size copies available; I very much doubt it will be backported
> to any existing LTS releases, but it ought to be in Hammer.
> -Greg
Greg, thanks for the update. I'll refrain from submitting a bug report
since it is already being worked on. For now we will make sure that we
don't increase min_size until size has been increased and the objects
have been completely replicated.

Robert LeBlanc

Sent from a mobile device; please excuse any typos.
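The ordering described above can be sketched as a small shell dry-run (the pool name "ssd" comes from the thread; the `run` wrapper only prints each command rather than executing it, so this is a sketch, not something to paste into a live cluster as-is):

```shell
# Safe ordering when raising replication on a pool: bump size first,
# wait for recovery to finish, and only then raise min_size, so client
# I/O never blocks waiting for copies that do not exist yet.
POOL=ssd            # pool name from the thread

# Dry-run helper: print each command instead of running it, so this
# sketch can be executed without a live Ceph cluster.
run() { echo "+ $*"; }

run ceph osd pool set "$POOL" size 3
# (on a real cluster, wait here until "ceph -s" reports all PGs
#  active+clean before proceeding)
run ceph osd pool set "$POOL" min_size 2
```

On a real cluster you would drop the `run` wrapper and poll `ceph -s` (or `ceph health`) between the two steps until recovery completes.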
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com