> If you set min_size 2 before taking the OSDs down, that does seem odd.

I think I don’t get the exact concept of min_size in the CRUSH ruleset.
The documentation (http://docs.ceph.com/docs/master/rados/operations/crush-map/)
states:

min_size
Description:  If a pool makes fewer replicas than this number, CRUSH will
              NOT select this rule.
Type:         Integer
Purpose:      A component of the rule mask.
Required:     Yes
Default:      1
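
As far as I can tell, that min_size lives in the rule mask of the decompiled
CRUSH map, i.e. something like this (a made-up example rule, not my actual
map):

    rule ecpool {
            ruleset 1
            type erasure
            min_size 2
            max_size 20
            step set_chooseleaf_tries 5
            step set_choose_tries 100
            step take default
            step chooseleaf indep 0 type osd
            step emit
    }

If I read the docs right, this min_size/max_size pair only decides whether
CRUSH applies the rule to a pool of a given size; it is not the same knob as
the pool-level min_size (ceph osd pool set <pool> min_size ...), and I may be
confusing the two.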

Assuming that I want my scenario to work (5 OSDs, a 2+3 EC pool, 3 OSDs down,
still able to read my data), how exactly do I have to configure my pool? Or is
this simply not possible at this point?
I just want to be sure that there are no errors in my configuration.
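
For what it’s worth, this is roughly what I have been doing (a sketch from
memory, the pool/profile names are made up; depending on the release the
profile key may be ruleset-failure-domain instead of crush-failure-domain):

    ceph osd erasure-code-profile set ec23 k=2 m=3 crush-failure-domain=osd
    ceph osd pool create ecpool 64 64 erasure ec23
    # lower the pool-level min_size from the default (k+1 here, I believe)
    # to k, so PGs can in theory stay active with only the 2 data shards
    ceph osd pool set ecpool min_size 2
    ceph osd pool get ecpool min_size

If lowering the pool-level min_size to k is the intended way to keep the PGs
active, I understand it also means accepting writes with no redundancy left,
which may be why it is not the default.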

> Yeah, we just don't have a way of serving reads without serving writes at the 
> moment. It's a limit of the architecture.

Thank you, this is good to know, particularly because I didn’t find anything
about it in the documentation.

- Jonas

> On 07.06.2017 at 21:40, Gregory Farnum <gfar...@redhat.com> wrote:
> 
> 
> 
> On Wed, Jun 7, 2017 at 12:30 PM Jonas Jaszkowic
> <jonasjaszko...@googlemail.com> wrote:
> 
>> On 07.06.2017 at 20:29, Gregory Farnum <gfar...@redhat.com> wrote:
>> 
>> We prevent PGs from going active (and serving writes or reads) when they 
>> have less than "min_size" OSDs participating. This is generally set so that 
>> we have enough redundancy to recover from at least one OSD failing.
> 
> Do you mean the min_size value from the CRUSH rule? I set min_size = 2, so a
> 2+3 EC pool with 3 killed OSDs still has the minimum of 2 OSDs and should be
> able to fully recover the data, right?
> 
> If you set min_size 2 before taking the OSDs down, that does seem odd.
>  
> 
>> In your case, you have 2 OSDs and the failure of either one of them results 
>> in the loss of all written data. So we don't let you go active as it's not 
>> safe.
> 
> 
> I get that it makes no sense to serve writes at this point because we cannot
> provide the desired redundancy, but how is preventing me from going active
> safer than just serving reads? I think what bugs me is that, by definition of
> the erasure code used, we should be able to lose 3 OSDs and still get our
> data back - which is not the case in this scenario because our cluster
> refuses to go active.
> 
> Yeah, we just don't have a way of serving reads without serving writes at the 
> moment. It's a limit of the architecture.
>  
> -Greg
> PS: please keep this on the list. It spreads the information and archives it 
> for future reference by others. :)
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
