On Wed, Jun 7, 2017 at 12:59 PM Jonas Jaszkowic <jonasjaszko...@googlemail.com> wrote:
> If you set min_size 2 before taking the OSDs down, that does seem odd.
>
>
> I think I don't get the exact concept of min_size in the CRUSH ruleset.
> The documentation
> (http://docs.ceph.com/docs/master/rados/operations/crush-map/) states:
>
> min_size
> Description: If a pool makes fewer replicas than this number, CRUSH
> will NOT select this rule.
> Type: Integer
> Purpose: A component of the rule mask.
> Required: Yes
> Default: 1
>
> Assuming that I want my scenario to work (5 OSDs, 2+3 EC pool, 3 OSDs
> down, still reading my data), how exactly do I have to configure my pool
> for this to work? Or is this simply not possible at this point?
>

The CRUSH rule min_size is a completely different thing from the pool
min_size. If you set the pool min_size to 2, I *think* it will do what you
expect. But in general, running with min_size == k is not a wise way to run
the cluster, as you don't have any redundancy in the case of losses. :)

> I just want to be sure that I have no errors in my configuration.
>
> Yeah, we just don't have a way of serving reads without serving writes
> at the moment. It's a limit of the architecture.
>
>
> Thank you, this is good to know, particularly because I didn't find
> anything about it in the documentation.
>
> - Jonas
>
>
> On 07.06.2017 at 21:40, Gregory Farnum <gfar...@redhat.com> wrote:
>
> On Wed, Jun 7, 2017 at 12:30 PM Jonas Jaszkowic
> <jonasjaszko...@googlemail.com> wrote:
>
>> On 07.06.2017 at 20:29, Gregory Farnum <gfar...@redhat.com> wrote:
>>
>> We prevent PGs from going active (and serving writes or reads) when they
>> have less than "min_size" OSDs participating. This is generally set so
>> that we have enough redundancy to recover from at least one OSD failing.
>>
>>
>> Do you mean the min_size value from the CRUSH rule? I set min_size = 2,
>> so a 2+3 EC pool with 3 killed OSDs still has the minimum amount of
>> 2 OSDs and should be able to fully recover data, right?
>>
>
> If you set min_size 2 before taking the OSDs down, that does seem odd.
>
>
>> In your case, you have 2 OSDs and the failure of either one of them
>> results in the loss of all written data. So we don't let you go active
>> as it's not safe.
>>
>>
>> I get that it makes no sense to serve *writes* at this point because we
>> cannot provide the desired redundancy, but how is preventing me from
>> going active more safe than just serving *reads*? I think what bugs me
>> is that, by the definition of the used erasure code, we should be able
>> to lose 3 OSDs and still get our data back - which is not the case in
>> this scenario because our cluster refuses to go active.
>>
>
> Yeah, we just don't have a way of serving reads without serving writes
> at the moment. It's a limit of the architecture.
>
> -Greg
> PS: please keep this on the list. It spreads the information and archives
> it for future reference by others. :)
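For anyone following along, here is a minimal sketch of the setup being
discussed, assuming a 2+3 erasure-coded pool whose PGs should stay active
with only the two data shards left. The profile and pool names (ec-2-3,
ecpool) and the PG count are illustrative, not taken from the thread:

    # Illustrative names and PG counts; min_size here is the pool-level
    # setting Greg refers to, not the CRUSH rule's min_size.
    ceph osd erasure-code-profile set ec-2-3 k=2 m=3
    ceph osd pool create ecpool 32 32 erasure ec-2-3
    ceph osd pool set ecpool min_size 2
    ceph osd pool get ecpool min_size   # verify the change took effect

As Greg notes above, min_size == k means any further shard loss while in
that state costs data, so it is better treated as a degraded-operation
setting than as the normal way to run the pool.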