Hi
The "ec unable to recover when below min size" thing has very recently
been fixed for octopus.
See https://tracker.ceph.com/issues/18749 and
https://github.com/ceph/ceph/pull/17619
The docs have been updated with a section on this issue:
http://docs.ceph.com/docs/master/rados/operations/erasure-code/#erasure-coded-pool-recovery
[2]
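On releases that don't have that fix yet, the workaround described there
(and what Paul suggests below) is to lower the EC pool's min_size to k
temporarily so the affected PGs can recover, then raise it again. Roughly,
with "ecpool" only as a placeholder name and k=2 as in your profile:

    ceph osd pool get ecpool min_size        # note the current value
    ceph osd pool set ecpool min_size 2      # = k, only while recovery runs
    # wait until the PGs have recovered, then restore the original value
    ceph osd pool set ecpool min_size <original value>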
/Torben
On 05.07.2019 11:50, Paul Emmerich wrote:
> * There are virtually no use cases for EC pools with m=1; it's a bad
> configuration because you can't have both availability and durability
>
> * Due to weird internal restrictions, EC pools below their min_size can't
> recover; you'll probably have to reduce min_size temporarily to let them recover
>
> * Depending on your version, it might be necessary to restart some of the OSDs
> due to a (by now fixed) bug that caused some objects to be marked as degraded
> if you remove or restart an OSD while you have remapped objects
>
> * run "ceph osd safe-to-destroy X" to check if it's safe to destroy a given
> OSD
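To make the m=1 and safe-to-destroy points concrete (the profile and pool
names below are made up, purely for illustration): the safe-to-destroy check
before pulling a disk, and an EC profile with m=2 so the pool keeps
redundancy while one host is down. Note that with a "host" failure domain a
k=2,m=2 pool needs at least four hosts:

    ceph osd safe-to-destroy osd.6           # only pull the disk once this reports safe
    ceph osd erasure-code-profile set ec22 k=2 m=2 crush-failure-domain=host
    ceph osd pool create ecpool22 64 64 erasure ec22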
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io [1]
> Tel: +49 89 1896585 90
>
> On Fri, Jul 5, 2019 at 1:17 AM Kyle <arad...@tma-0.net> wrote:
>
>> Hello,
>>
>> I'm working with a small Ceph cluster (about 10 TB, 7-9 OSDs, all BlueStore
>> on LVM) and recently ran into a problem with 17 pgs marked as incomplete
>> after adding/removing OSDs.
>>
>> Here's the sequence of events:
>> 1. 7 osds in the cluster, health is OK, all pgs are active+clean
>> 2. 3 new osds on a new host are added, lots of backfilling in progress
>> 3. osd 6 needs to be removed, so we do "ceph osd crush reweight osd.6 0"
>> 4. after a few hours we see "min osd.6 with 0 pgs" from "ceph osd
>> utilization"
>> 5. ceph osd out 6
>> 6. systemctl stop ceph-osd@6
>> 7. the drive backing osd 6 is pulled and wiped
>> 8. backfilling has now finished; all pgs are active+clean except for 17
>> incomplete pgs
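For comparison, a more conservative variant of steps 3-7 on a reasonably
recent release would look roughly like this (osd.6 as in your steps;
safe-to-destroy is the check Paul mentioned, ok-to-stop is a similar
availability check):

    ceph osd crush reweight osd.6 0
    # wait for backfill to finish, then:
    ceph osd safe-to-destroy osd.6       # repeat until it reports the OSD is safe to destroy
    ceph osd ok-to-stop osd.6            # stopping it should not make any PG unavailable
    ceph osd out 6
    systemctl stop ceph-osd@6
    ceph osd purge 6 --yes-i-really-mean-it
    # only wipe the physical drive once the cluster is back to HEALTH_OK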
>>
>> From reading the docs, it sounds like there has been unrecoverable data loss
>> in those 17 pgs. That raises some questions for me:
>>
>> Was "ceph osd utilization" only showing a goal of 0 pgs allocated instead of
>> the current actual allocation?
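I'm honestly not sure off-hand whether "ceph osd utilization" counts the up
or the acting set, but you can cross-check what is actually sitting on each
OSD with:

    ceph osd df tree        # per-OSD PGS and DATA columns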
>>
>> Why is there data loss from a single osd being removed? Shouldn't that be
>> recoverable?
>> All pools in the cluster are either replicated with size 3 or erasure-coded
>> k=2,m=1 with the default "host" failure domain. They shouldn't suffer data
>> loss from a single osd being removed, even if there had been no reweighting
>> beforehand. Does backfilling temporarily reduce data durability in some way?
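For what it's worth, the arithmetic behind Paul's first point: a k=2,m=1
pool writes k+m = 3 shards per object and needs any k = 2 of them to
reconstruct it, so it tolerates exactly m = 1 missing shard, the same margin
as 2x replication. Pulling a drive while backfill is still moving shards
around can therefore leave the affected PGs with no margin at all, and
anything else going wrong in that window (such as the degraded-marking bug
Paul mentioned) can leave them incomplete.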
>>
>> Is there a way to see which pgs actually have data on a given osd?
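There is, as far as I know (osd.6 and <pgid> only as examples):

    ceph pg ls-by-osd osd.6      # PGs currently mapped to that OSD
    ceph pg ls incomplete        # list the incomplete PGs
    ceph pg <pgid> query         # recovery_state shows what peering is blocked on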
>>
>> I attached an example of one of the incomplete pgs.
>>
>> Thanks for any help,
>>
>> Kyle
>
Links:
------
[1] http://www.croit.io
[2] http://docs.ceph.com/docs/master/rados/operations/erasure-code/#erasure-coded-pool-recovery
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com