On 14/01/2014 07:49, ZHOU Yuan wrote:
> Hi Loic, thanks for the education!
> 
> I’m also trying to understand the new ‘indep’ mode. Is this new mode designed 
> for Ceph-EC only? It seems that all copies of the data in a 3-copy system are 
> equivalent, so this new algorithm should also work?
> 

In the best case, using indep instead of firstn on replicated pools 
makes no difference. However, if the crush mapper cannot find the 
required number of items, firstn will return (for instance) [1,2,4] instead of 
[1,2,3,4], and the replicated pool code handles this gracefully. With 
indep the result will be [1,2,CRUSH_ITEM_NONE,4], which will probably trigger 
an assert somewhere.
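
To make the failure case concrete, here is a toy Python sketch. It is not the CRUSH algorithm: the function names firstn_result and indep_result are made up for illustration, and the sketch only models how each mode reports a slot it could not fill.

```python
CRUSH_ITEM_NONE = 0x7fffffff  # 2147483647, placeholder meaning "no item found"

def firstn_result(wanted, unavailable):
    """Toy model of firstn: unmapped items are dropped, so the list gets shorter."""
    return [osd for osd in wanted if osd not in unavailable]

def indep_result(wanted, unavailable):
    """Toy model of indep: positions are kept, a failed slot holds the placeholder."""
    return [CRUSH_ITEM_NONE if osd in unavailable else osd for osd in wanted]

wanted = [1, 2, 3, 4]
print(firstn_result(wanted, {3}))  # [1, 2, 4]
print(indep_result(wanted, {3}))   # [1, 2, 2147483647, 4]
```

Code that only checks the length of the list copes with the first result, while the second has the expected length but contains an entry that is not a real OSD id.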

Here is an example from the test suite that runs on make check:
https://github.com/ceph/ceph/blob/master/src/test/cli/crushtool/bad-mappings.t
where 2147483647 == CRUSH_ITEM_NONE
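
When reading such output, a mapping could be checked for holes along these lines. This is only a sketch: missing_positions is a hypothetical helper, not part of Ceph or crushtool.

```python
CRUSH_ITEM_NONE = 2147483647  # same value as 0x7fffffff

def missing_positions(mapping, required):
    """Return the indexes with no usable item.

    Covers both failure shapes: a placeholder inside the list (indep)
    and slots missing from the end of a short list (firstn).
    """
    holes = [i for i, item in enumerate(mapping) if item == CRUSH_ITEM_NONE]
    holes.extend(range(len(mapping), required))
    return holes

print(missing_positions([1, 2, 2147483647, 4], 4))  # [2]
print(missing_positions([1, 2, 4], 4))              # [3]
```

Note that for the short list the helper can only say that one slot is unfilled, not which original position was lost, whereas the placeholder pins down the exact position.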

I don't know of any other reason preventing the use of indep for replicated 
pools.

Cheers

> 
> Sincerely, Yuan
> 
> 
> On Mon, Jan 13, 2014 at 7:37 AM, Loic Dachary <l...@dachary.org> wrote:
> 
> 
> 
>     On 12/01/2014 15:55, Dietmar Maurer wrote:
>     > From the docs:
>     >
>     >
>     >
>     > step [choose|chooseleaf] [firstn|indep] <N> <bucket-type>
>     >
>     >
>     >
>     > What exactly is the difference between ‘firstn’ and ‘indep’?
>     >
>     Hi,
> 
>     For Ceph releases up to Emperor[1], firstn is used and I'm not aware of a
>     use case requiring indep. As part of the effort to implement erasure coded
>     pools, firstn[2] and indep[3] were separated into two functions. The firstn
>     method is best suited for replicated pools. The indep method tries to
>     minimize position changes when an OSD becomes unavailable. For
>     instance, if indep finds
> 
>       [1,2,3,4]
> 
>     and after a while 3 becomes unavailable, it is very likely to replace it
>     with
> 
>       [1,2,5,4]
> 
>     It matters to erasure coded pools because
> 
>       [4,5,2,1]
> 
>     (i.e. the same OSDs but in different positions) implies more I/O.
>     Another difference is that in the case of a mapping failure (i.e. being
>     unable to find the required number of OSDs), firstn will return a short
>     list (for instance [1,2,3] when 4 are required) and indep will return a
>     list with a placeholder at the missing position (for instance
>     [1,2,CRUSH_ITEM_NONE,4]).
> 
>     Cheers
> 
>     [1] implementation in releases up to Emperor https://github.com/ceph/ceph/blob/v0.72/src/crush/mapper.c#L295
>     [2] firstn https://github.com/ceph/ceph/blob/v0.74/src/crush/mapper.c#L295
>     [3] indep https://github.com/ceph/ceph/blob/v0.74/src/crush/mapper.c#L459
> 
>     --
>     Loïc Dachary, Artisan Logiciel Libre
> 
> 
>     _______________________________________________
>     ceph-users mailing list
>     ceph-users@lists.ceph.com
>     http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 

-- 
Loïc Dachary, Artisan Logiciel Libre
