Hi, Robert,

Here is my CRUSH map:

# begin crush map
tunable choose_local_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable chooseleaf_vary_r 1

# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 osd.5
device 6 osd.6
device 7 osd.7
device 8 osd.8
device 9 osd.9

# types
type 0 osd
type 1 host
type 2 zone
type 3 storage_group
type 4 root

# buckets
host  controller_performance_zone_one {
        id -1           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 1.000
}
host  controller_capacity_zone_one {
        id -2           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 1.000
        item osd.1 weight 1.000
}
host  compute2_performance_zone_one {
        id -3           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.5 weight 1.000
}
host  compute2_capacity_zone_one {
        id -4           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.3 weight 1.000
        item osd.4 weight 1.000
}
host  compute3_performance_zone_one {
        id -5           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
}
host  compute3_capacity_zone_one {
        id -6           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.6 weight 1.000
        item osd.7 weight 1.000
}
zone zone_one_performance {
        id -7           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item  controller_performance_zone_one weight 1.000
        item  compute2_performance_zone_one weight 1.000
        item  compute3_performance_zone_one weight 0.100
}
host  compute4_capacity_zone_one {
        id -12          # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item osd.8 weight 1.000
        item osd.9 weight 1.000
}
zone zone_one_capacity {
        id -8           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item  controller_capacity_zone_one weight 2.000
        item  compute2_capacity_zone_one weight 2.000
        item  compute3_capacity_zone_one weight 2.000
        item  compute4_capacity_zone_one weight 2.000
}
storage_group performance {
        id -9           # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item zone_one_performance weight 2.100
}
storage_group capacity {
        id -10          # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item zone_one_capacity weight 8.000
}
root vsm {
        id -11          # do not change unnecessarily
        alg straw
        hash 0  # rjenkins1
        item performance weight 2.100
        item capacity weight 8.000
}

# rules
rule capacity {
        ruleset 0
        type replicated
        min_size 0
        max_size 10
        step take capacity
        step chooseleaf firstn 0 type host
        step emit
}
rule performance {
        ruleset 1
        type replicated
        min_size 0
        max_size 10
        step take performance
        step chooseleaf firstn 0 type host
        step emit
}

# end crush map
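
For reference, this is roughly how I dump, edit and re-inject the map (the
standard getcrushmap/crushtool workflow; the file names here are just
examples):

# dump the current CRUSH map and decompile it to text
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# after editing, compile the text map and inject it back into the cluster
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new

And here is the corresponding ceph osd tree output: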

ID  WEIGHT   TYPE NAME                                 UP/DOWN REWEIGHT PRIMARY-AFFINITY
-11 10.09999 root vsm
 -9  2.09999     storage_group performance
 -7  2.09999         zone zone_one_performance
 -1  1.00000             host controller_performance_zone_one
  2  1.00000                 osd.2                          up  1.00000          1.00000
 -3  1.00000             host compute2_performance_zone_one
  5  1.00000                 osd.5                          up  1.00000          1.00000
 -5  0.09999             host compute3_performance_zone_one
-10  8.00000     storage_group capacity
 -8  8.00000         zone zone_one_capacity
 -2  2.00000             host controller_capacity_zone_one
  0  1.00000                 osd.0                          up  1.00000          1.00000
  1  1.00000                 osd.1                          up  1.00000          1.00000
 -4  2.00000             host compute2_capacity_zone_one
  3  1.00000                 osd.3                          up  1.00000          1.00000
  4  1.00000                 osd.4                          up  1.00000          1.00000
 -6  2.00000             host compute3_capacity_zone_one
  6  1.00000                 osd.6                          up  1.00000          1.00000
  7  1.00000                 osd.7                          up  1.00000          1.00000
-12  2.00000             host compute4_capacity_zone_one
  8  1.00000                 osd.8                          up  1.00000          1.00000
  9  1.00000                 osd.9                          up  1.00000          1.00000
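
The rules can also be checked offline with crushtool, e.g. something like
this (just a sketch, assuming the compiled map is in crushmap.bin):

# show which OSDs rule 0 (capacity) would select for 3 replicas
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings
# list any inputs that do not get a full set of 3 OSDs
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-bad-mappings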

And here are the settings of the pool I was using for tests:

root@iclcompute4:# ceph osd pool get Gold crush_ruleset
crush_ruleset: 0
root@iclcompute4:# ceph osd pool get Gold size
size: 3
root@iclcompute4:# ceph osd pool get Gold min_size
min_size: 1
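
For completeness, those values can be set with the usual pool commands (the
values below just mirror the output above):

ceph osd pool set Gold size 3
ceph osd pool set Gold min_size 1
ceph osd pool set Gold crush_ruleset 0

As I understand it, with min_size 1 a PG should keep serving IO as long as
at least one replica is up and peering completes.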

The IO freeze happens both when I add and when I remove a host with 2 OSDs.
This time I did it by hand, following the standard manual Ceph procedure at
http://docs.ceph.com/docs/master/rados/operations/add-or-rm-osds/, to be
sure that scripting mistakes are not involved. The freeze lasts until the
cluster reports HEALTH_OK, then IO resumes.
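
Roughly, the manual add steps from that page for the new host look like
this (just a sketch; osd.10 and the host bucket name are placeholders, not
the exact IDs I used):

# allocate a new OSD id and prepare its data directory and key
# (osd.10 and the host bucket below are placeholders)
ceph osd create
mkdir /var/lib/ceph/osd/ceph-10
ceph-osd -i 10 --mkfs --mkkey
ceph auth add osd.10 osd 'allow *' mon 'allow profile osd' \
        -i /var/lib/ceph/osd/ceph-10/keyring
# place the OSD in the CRUSH hierarchy and start the daemon
ceph osd crush add osd.10 1.0 host=compute5_capacity_zone_one
service ceph start osd.10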

Regards, Vasily.



On Thu, May 14, 2015 at 6:44 PM, Robert LeBlanc <rob...@leblancnet.us>
wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> Can you provide the output of the CRUSH map and a copy of the script that you 
> are using to add the OSDs? Can you also provide the pool size and pool 
> min_size?
> -----BEGIN PGP SIGNATURE-----
> Version: Mailvelope v0.13.1
> Comment: https://www.mailvelope.com
>
> wsFcBAEBCAAQBQJVVMLvCRDmVDuy+mK58QAAHVIQALIZ8aOWE5P8DkRe+8pz
> XS+rMdA17nPUd2mX6PIqhjBxetrUhIjQUho8HSIswT9JVkjVSIj+QHs5CI1C
> 6ArWIPt/U8L78d1hI8NuH/vWwWydYfV32n2L2LExIgUpFAbJA81AnjjDFLvo
> T63KLitQ1wz8lyhAWXp4ze15CgAv1u9VbJhazeeWunxZxd8eSGuUS8RTdhLD
> sD0pSQnVT4W04TSKYfvbUlpqm68wGY+MApnuQXdpC0jBLcDz0OSu1P+OQC03
> 0vBCERY1er/rSskJ6TRrQGLzXAc/vc3HbPMvegIhp2voeXgONdO5P/qLfSfD
> ZwVUoi6EfFe+na3S4rEjOeBU+v2P00komVEcvjOJDQb3IVcE23iVJOezk3p+
> AgJqOz9VLdGvdmZTZnR08PKPZEja80QzrSklRW5f8JyjKlbE8tB5lBoM5mKo
> oRcBSDbGSKvXInqygQ3XLdxULHaXbNqNPj+JvPbmfkTU6Iq6pXqcBdUSqG0o
> /5Rx16+2Rouz4f8uu5irmDjz0ivKL6QCIzBwZbBTdLIwqhf9vCl1ACDWq4U3
> DMorcafZbMArdOqlkVhQJiMioZEQ8U/ThY2bInkNdhii/2A35CToyOfMKyfq
> FLAK5lCiM6gRfCkEBPTwkDR6GNAfgY7khz34adsBRlZPB6a3MeucAGtTjyWt
> AJIV
> =bcYd
> -----END PGP SIGNATURE-----
>
>
> ----------------
> Robert LeBlanc
> GPG Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
>
> On Thu, May 14, 2015 at 6:33 AM, Vasiliy Angapov <anga...@gmail.com>
> wrote:
>
>> Thanks, Robert, for sharing so much of your experience! I feel like I
>> don't deserve it :)
>>
>> I have another, very similar situation which I don't understand.
>> Last time I tried to hard-kill the OSD daemons.
>> This time I added a new node with 2 OSDs to my cluster while monitoring
>> the IO. I wrote a script which adds a node with OSDs fully automatically,
>> and it seems that when I start the script, IO is also blocked until the
>> cluster shows HEALTH_OK, which takes quite a long time. After the Ceph
>> status is OK, copying resumes.
>>
>> What should I tune this time to avoid the long IO interruption?
>>
>> Thanks in advance again :)
>>
>
>
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
