I am testing erasure-coded pools and running a rados bench write to try
out fault tolerance.
I have 3 nodes with 1 OSD each, and an EC profile with K=2, M=1.
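For reference, the setup can be sketched roughly like this (the profile and pool names are placeholders, not the exact ones I used, and crush-failure-domain=host is an assumption that matches the one-OSD-per-node layout):

```shell
# Sketch of the EC pool setup; names "ecprofile"/"ecpool" are hypothetical.
ceph osd erasure-code-profile set ecprofile k=2 m=1 crush-failure-domain=host
ceph osd pool create ecpool 32 32 erasure ecprofile
```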

While the write is running (rados bench -p replicate 100 write), I stop
one of the OSD daemons (for example osd.0) to simulate a node failure,
and then the whole write stalls and I can't write any data anymore.
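The test sequence can be reproduced roughly as follows (the sleep interval is arbitrary, and the systemd unit name assumes a systemd-managed cluster):

```shell
# Start the bench write in the background, then kill one OSD mid-run.
rados bench -p replicate 100 write &
sleep 10
systemctl stop ceph-osd@0   # simulate the node/OSD failure
wait
```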

  sec Cur ops   started  finished  avg MB/s  cur MB/s  last lat   avg lat
    1      16        28        12   46.8121        48   1.01548  0.616034
    2      16        40        24   47.3907        48   1.04219  0.923728
    3      16        52        36   47.5889        48  0.593145    1.0038
    4      16        68        52   51.6633        64   1.39638   1.08098
    5      16        74        58    46.158        24   1.02699   1.10172
    6      16        83        67   44.4711        36   3.01542   1.18012
    7      16        95        79   44.9722        48  0.776493   1.24003
    8      16        95        79   39.3681         0         -   1.24003
    9      16        95        79   35.0061         0         -   1.24003
   10      16        95        79   31.5144         0         -   1.24003
   11      16        95        79   28.6561         0         -   1.24003
   12      16        95        79   26.2732         0         -   1.24003

It's pretty clear from the output where the OSD failed.
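While the OSD is down, the cluster state can be inspected with a few read-only checks (pool name taken from the bench command above; writes block when fewer than min_size shards are available):

```shell
# What the cluster reports while the OSD is down.
ceph -s                                # overall health, degraded PG summary
ceph health detail                     # which PGs are degraded/undersized
ceph osd pool get replicate min_size   # the write-blocking threshold
```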

On the other hand, with a replicated pool the client (rados bench)
doesn't even notice the OSD failure, which is awesome.

Is this normal behaviour for EC pools?
------------------------------------------------------------------------
*Jorge Pinilla López*
jorp...@unizar.es
Computer engineering student
Intern in the systems area (SICUZ)
Universidad de Zaragoza
PGP-KeyID: A34331932EBC715A
<http://pgp.rediris.es:11371/pks/lookup?op=get&search=0xA34331932EBC715A>
------------------------------------------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
