On Mon, Oct 29, 2018 at 7:43 PM David Turner <drakonst...@gmail.com> wrote:
> min_size should be at least k+1 for EC. There are times to use k for
> emergencies like you had. I would suggest setting it back to 3 once you're
> back to healthy.
>
> As far as why you needed to reduce min_size, my guess would be that
> recovery would have happened as long as k copies were up. Were the PGs
> refusing to backfill or just hadn't backfilled yet?

Recovery on EC pools requires min_size rather than k shards at this time.
There were reasons; they weren't great. We're trying to get a fix tested
and merged at https://github.com/ceph/ceph/pull/17619
-Greg

> On Mon, Oct 29, 2018, 9:24 PM Chad W Seys <cws...@physics.wisc.edu> wrote:
>
>> Hi all,
>>     Recently our cluster lost a drive and a node (3 drives) at the same
>> time. Our erasure coded pools are all k2m2, so if all is working
>> correctly no data is lost.
>>     However, there were 4 PGs that stayed "incomplete" until I finally
>> took the suggestion in 'ceph health detail' to reduce min_size. (Thanks
>> for the hint!) I'm not sure what it was (likely 3), but setting it to 2
>> caused all PGs to become active (though degraded) and the cluster is on
>> a path to recovering fully.
>>
>>     In replicated pools, wouldn't Ceph create replicas without the need
>> to reduce min_size? It seems odd not to recover automatically if
>> possible. Could someone explain what was going on there?
>>
>>     Also, how does one decide what min_size should be?
>>
>> Thanks!
>> Chad.
_______________________________________________ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
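[Editor's note: for anyone hitting the same situation, below is a rough
sketch of the commands involved in checking and temporarily lowering
min_size on an EC pool. The pool name "ecpool" is only a placeholder;
check your own pool and profile names, and remember to restore min_size
to k+1 (here 3) once the cluster is healthy again.]

    # Inspect the pool's current min_size and its erasure-code profile
    ceph osd pool get ecpool min_size
    ceph osd pool get ecpool erasure_code_profile
    ceph osd erasure-code-profile get <profile-name>    # shows k and m

    # Emergency only: allow PGs to go active with just k shards
    ceph osd pool set ecpool min_size 2

    # Watch recovery progress
    ceph health detail

    # Once recovery/backfill has completed, set it back to k+1
    ceph osd pool set ecpool min_size 3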