> The CRUSH rule min_size is a completely different thing from the pool 
> min_size. If you set the pool min_size to 2 I *think* it will do what you 
> expect.
>> If you set min_size 2 before taking the OSDs down, that does seem odd.

Good to know, I got confused because both settings share the same name. I will try 
to set the correct (pool-level) min_size and see what happens. What did you mean 
when you said that it seems odd that I set the (correct) min_size = 2 before taking 
the OSDs down? Isn't that the way I should do it now?
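
If I understood you correctly, the knob in question is the pool-level min_size and 
not the field in the CRUSH rule, so I would try something along these lines (the 
pool name "ecpool" is just the placeholder from my test cluster):

    ceph osd pool get ecpool min_size     # show the current pool min_size
    ceph osd pool set ecpool min_size 2   # set it to 2 (= k) for this test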

> But in general running with min_size == k is not a wise way to run the 
> cluster as you don't have any redundancy in the case of losses. :)

I totally agree. I am trying to understand erasure coding in Ceph in depth, and it 
was kind of strange to have 2 OSDs left but still not get my data back. The way it 
looks now, it was only a configuration issue.
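
In case it helps to narrow down that configuration issue, this is roughly how I am 
checking the relevant settings (again, "ecprofile" and "ecpool" are just the names 
from my test setup):

    ceph osd erasure-code-profile get ecprofile   # should report k=2 m=3
    ceph osd pool get ecpool min_size             # the pool min_size, not the CRUSH rule field
    ceph osd dump | grep ecpool                   # shows size, min_size and the crush rule of the pool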

- Jonas

> On 07.06.2017 at 22:02, Gregory Farnum <gfar...@redhat.com> wrote:
> 
> 
> 
> On Wed, Jun 7, 2017 at 12:59 PM Jonas Jaszkowic 
> <jonasjaszko...@googlemail.com> wrote:
>> If you set min_size 2 before taking the OSDs down, that does seem odd.
> 
> I think I don't get the exact concept of min_size in the CRUSH ruleset. 
> The documentation 
> (http://docs.ceph.com/docs/master/rados/operations/crush-map/) states:
> 
> min_size
> Description:  If a pool makes fewer replicas than this number, CRUSH will NOT select this rule.
> Type:         Integer
> Purpose:      A component of the rule mask.
> Required:     Yes
> Default:      1
> 
> Assuming that I want my scenario to work (5 OSDs, a 2+3 EC pool, 3 OSDs down, 
> still being able to read my data), how exactly do I have to configure my pool? 
> Or is this simply not possible at this point? 
> 
> The CRUSH rule min_size is a completely different thing from the pool 
> min_size. If you set the pool min_size to 2 I *think* it will do what you 
> expect.
> 
> But in general running with min_size == k is not a wise way to run the 
> cluster as you don't have any redundancy in the case of losses. :)
>  
> I just want to be sure that I have no errors in my configuration.
> 
>> Yeah, we just don't have a way of serving reads without serving writes at 
>> the moment. It's a limit of the architecture.
> 
> Thank you, this is good to know, particularly because I didn't find anything 
> about it in the documentation.
> 
> - Jonas
> 
> 
>> On 07.06.2017 at 21:40, Gregory Farnum <gfar...@redhat.com> wrote:
>> 
>> 
>> 
>> On Wed, Jun 7, 2017 at 12:30 PM Jonas Jaszkowic 
>> <jonasjaszko...@googlemail.com> wrote:
>> 
>>> On 07.06.2017 at 20:29, Gregory Farnum <gfar...@redhat.com> wrote:
>>> 
>>> We prevent PGs from going active (and serving writes or reads) when they 
>>> have less than "min_size" OSDs participating. This is generally set so that 
>>> we have enough redundancy to recover from at least one OSD failing.
>> 
>> Do you mean the min_size value from the CRUSH rule? I set min_size = 2, so a 
>> 2+3 EC pool with 3 killed OSDs still has the required minimum of 2 OSDs and 
>> should be able to fully recover the data, right?
>> 
>> If you set min_size 2 before taking the OSDs down, that does seem odd.
>>  
>> 
>>> In your case, you have 2 OSDs and the failure of either one of them results 
>>> in the loss of all written data. So we don't let you go active as it's not 
>>> safe.
>> 
>> 
>> I get that it makes no sense to serve writes at this point because we cannot 
>> provide the desired redundancy, but how is preventing me from going active 
>> safer than just serving reads? I think what bugs me is that, by definition of 
>> the erasure code used, we should be able to lose 3 OSDs and still get our data 
>> back - which is not the case in this scenario because our cluster refuses to 
>> go active.
>> 
>> Yeah, we just don't have a way of serving reads without serving writes at 
>> the moment. It's a limit of the architecture.
>>  
>> -Greg
>> PS: please keep this on the list. It spreads the information and archives it 
>> for future reference by others. :)
>> 
> 

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
