Hi,
I'm wondering why slow requests are being reported mainly once a request
has been put into the queue for processing by its PG (queued_for_pg, see
http://docs.ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/#debugging-slow-request
).
Could it be due to a too-low pg_num/pgp_num?
It looks like the slow requests are mainly addressed to
default.rgw.buckets.data (pool id 20), volumes (pool id 3) and
default.rgw.buckets.index (pool id 14):
2018-01-31 12:06:55.899557 osd.59 osd.59 10.212.32.22:6806/4413 38 : cluster [WRN] slow request 30.125793 seconds old, received at 2018-01-31 12:06:25.773675: osd_op(client.857003.0:126171692 3.a4fec1ad 3.a4fec1ad (undecoded) ack+ondisk+write+known_if_redirected e5722) currently queued_for_pg
Btw, how can I get more human-friendly client information from a log entry
like the one above?
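For what it's worth, here is a minimal sketch of decoding the raw "3.a4fec1ad" target in a log line like the one above: the part before the dot is the pool id, the part after is the object's hash, and the PG follows from Ceph's ceph_stable_mod() logic (pg_num 1024 is taken from pool 3 'volumes' in the listing below). For the client side, running `ceph daemon osd.59 dump_historic_ops` on the OSD host should show the same op together with the client address.

```python
# Sketch: turn the raw "3.a4fec1ad" op target into a concrete PG id.
# Mirrors Ceph's ceph_stable_mod(); pg_num 1024 is pool 3 ('volumes').

def pg_bitmask(pg_num: int) -> int:
    """Smallest (2^n - 1) mask covering pg_num."""
    return (1 << (pg_num - 1).bit_length()) - 1

def stable_mod(x: int, b: int, bmask: int) -> int:
    """ceph_stable_mod(): map an object hash onto b PGs."""
    return x & bmask if (x & bmask) < b else x & (bmask >> 1)

pool_id = 3             # "3.a4fec1ad" -> pool 3 ('volumes')
obj_hash = 0xa4fec1ad   # object hash from the same field
pg_num = 1024           # pool 3's pg_num

pg = stable_mod(obj_hash, pg_num, pg_bitmask(pg_num))
print(f"PG {pool_id}.{pg:x}")   # -> PG 3.1ad
```

With the PG id in hand, `ceph pg 3.1ad query` shows which OSDs serve it.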
Current pg_num/pgp_num:

pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
pool 14 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 4224 application rgw
Usage:
GLOBAL:
    SIZE    AVAIL   RAW USED    %RAW USED    OBJECTS
    385T    144T    241T        62.54        31023k
POOLS:
    NAME                          ID  QUOTA OBJECTS  QUOTA BYTES  USED    %USED  MAX AVAIL  OBJECTS   DIRTY   READ    WRITE   RAW USED
    volumes                       3   N/A            N/A          40351G  70.91  16557G     10352314  10109k  2130M   2520M   118T
    default.rgw.buckets.index     14  N/A            N/A          0       0      16557G     205       205     160M    27945k  0
    default.rgw.buckets.data      20  N/A            N/A          79190G  70.51  33115G     20865953  20376k  122M    113M    116T
# ceph osd pool ls detail
pool 0 'rbd' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 4502 flags hashpspool stripe_width 0 application rbd
pool 1 'vms' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd
pool 2 'images' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 512 pgp_num 512 last_change 5175 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~7,14~2]
pool 3 'volumes' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 6 'default.rgw.data.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 7 'default.rgw.gc' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 8 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 9 'default.rgw.users.uid' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 10 'default.rgw.usage' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 11 'default.rgw.users.email' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 owner 18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 12 'default.rgw.users.keys' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 owner 18446744073709551615 flags hashpspool stripe_width 0 application rgw
pool 13 'default.rgw.users.swift' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 14 'default.rgw.buckets.index' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 15 'default.rgw.buckets.data.old' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 16 'default.rgw.buckets.non-ec' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 17 'default.rgw.buckets.extra' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 18 '.rgw.buckets.extra' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 4502 flags hashpspool stripe_width 0 application rgw
pool 20 'default.rgw.buckets.data' erasure size 9 min_size 6 crush_rule 1 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4502 flags hashpspool stripe_width 4224 application rgw
pool 21 'benchmark_replicated' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 4550 flags hashpspool stripe_width 0 application rbd
        removed_snaps [1~3]
pool 22 'benchmark_erasure_coded' erasure size 9 min_size 7 crush_rule 1 object_hash rjenkins pg_num 32 pgp_num 32 last_change 4552 flags hashpspool stripe_width 24576 application rbd
        removed_snaps [1~3]
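On the pg_num question, here is a quick back-of-the-envelope sketch (not from the cluster itself) of the aggregate PG load the pool listing above implies; the usual rule of thumb is on the order of ~100 PGs per OSD. The OSD count below is a placeholder assumption, so substitute the real value from `ceph osd stat`:

```python
# Back-of-the-envelope PG load per OSD, from `ceph osd pool ls detail`.
# (name, pg_num, size) per pool; for the EC pools, size = k + m as reported.
pools = [
    ("rbd", 64, 3), ("vms", 1024, 3), ("images", 512, 3),
    ("volumes", 1024, 3), (".rgw.root", 8, 3),
    ("default.rgw.control", 8, 3), ("default.rgw.data.root", 8, 3),
    ("default.rgw.gc", 8, 3), ("default.rgw.log", 8, 3),
    ("default.rgw.users.uid", 8, 3), ("default.rgw.usage", 8, 3),
    ("default.rgw.users.email", 8, 3), ("default.rgw.users.keys", 8, 3),
    ("default.rgw.users.swift", 8, 3), ("default.rgw.buckets.index", 8, 3),
    ("default.rgw.buckets.data.old", 64, 3),
    ("default.rgw.buckets.non-ec", 8, 3),
    ("default.rgw.buckets.extra", 8, 3), (".rgw.buckets.extra", 8, 3),
    ("default.rgw.buckets.data", 1024, 9),   # erasure, size = 9
    ("benchmark_replicated", 1024, 3),
    ("benchmark_erasure_coded", 32, 9),      # erasure, size = 9
]

total_pg_replicas = sum(pg * size for _, pg, size in pools)

# Placeholder: the post doesn't state the OSD count -- take the real
# number from `ceph osd stat`.
num_osds = 100

print(total_pg_replicas)              # 20976 PG replicas cluster-wide
print(total_pg_replicas / num_osds)   # ~210 PGs per OSD at 100 OSDs
```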
# ceph df detail
GLOBAL:
    SIZE    AVAIL   RAW USED    %RAW USED    OBJECTS
    385T    144T    241T        62.54        31023k
POOLS:
    NAME                          ID  QUOTA OBJECTS  QUOTA BYTES  USED    %USED  MAX AVAIL  OBJECTS   DIRTY   READ    WRITE   RAW USED
    rbd                           0   N/A            N/A          0       0      16557G     0         0       1       134k    0
    vms                           1   N/A            N/A          0       0      16557G     0         0       0       0       0
    images                        2   N/A            N/A          7659M   0.05   16557G     1022      1022    51247   5668    22977M
    volumes                       3   N/A            N/A          40351G  70.91  16557G     10352314  10109k  2130M   2520M   118T
    .rgw.root                     4   N/A            N/A          1588    0      16557G     4         4       90      4       4764
    default.rgw.control           5   N/A            N/A          0       0      16557G     8         8       0       0       0
    default.rgw.data.root         6   N/A            N/A          93943   0      16557G     336       336     239k    6393    275k
    default.rgw.gc                7   N/A            N/A          0       0      16557G     32        32      1773M   5281k   0
    default.rgw.log               8   N/A            N/A          0       0      16557G     185       185     22404k  14936k  0
    default.rgw.users.uid         9   N/A            N/A          3815    0      16557G     15        15      187k    53303   11445
    default.rgw.usage             10  N/A            N/A          0       0      16557G     7         7       278k    556k    0
    default.rgw.users.email       11  N/A            N/A          58      0      16557G     3         3       0       3       174
    default.rgw.users.keys        12  N/A            N/A          177     0      16557G     10        10      262     22      531
    default.rgw.users.swift       13  N/A            N/A          40      0      16557G     3         3       0       3       120
    default.rgw.buckets.index     14  N/A            N/A          0       0      16557G     205       205     160M    27945k  0
    default.rgw.buckets.data.old  15  N/A            N/A          668G    3.88   16557G     180867    176k    707k    2318k   2004G
    default.rgw.buckets.non-ec    16  N/A            N/A          0       0      16557G     114       114     17960   12024   0
    default.rgw.buckets.extra     17  N/A            N/A          0       0      16557G     0         0       0       0       0
    .rgw.buckets.extra            18  N/A            N/A          0       0      16557G     0         0       0       0       0
    default.rgw.buckets.data      20  N/A            N/A          79190G  70.51  33115G     20865953  20376k  122M    113M    116T
    benchmark_replicated          21  N/A            N/A          1415G   7.88   16557G     363800    355k    1338k   1251k   4247G
    benchmark_erasure_coded       22  N/A            N/A          11057M  0.03   33115G     2761      2761    398     5520    16586M
Thanks
Jakub
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com