Hi, we have a problem with a drastic performance slowdown on a cluster. We use radosgw with the S3 protocol. Our configuration:
153 OSDs on 1.2 TB SAS disks with journals on SSDs (ratio 4:1). There are no networking problems, no hardware issues, etc.

Output from "ceph df":

GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    166T     129T      38347G       22.44
POOLS:
    NAME                   ID     USED       %USED     MAX AVAIL     OBJECTS
    .rgw                   9      70330k     0         39879G        393178
    .rgw.root              10     848        0         39879G        3
    .rgw.control           11     0          0         39879G        8
    .rgw.gc                12     0          0         39879G        32
    .rgw.buckets           13     10007G     5.86      39879G        331079052
    .rgw.buckets.index     14     0          0         39879G        2994652
    .rgw.buckets.extra     15     0          0         39879G        2
    .log                   16     475M       0         39879G        408
    .intent-log            17     0          0         39879G        0
    .users                 19     729        0         39879G        49
    .users.email           20     414        0         39879G        26
    .users.swift           21     0          0         39879G        0
    .users.uid             22     17170      0         39879G        89

The problems began last Saturday. Throughput was 400k requests per hour, mostly PUTs and HEADs, ~100kb. The Ceph version is Hammer. We have two clusters with a similar configuration, and both ran into the same problems at the same time. Any hints?

Latest output from "ceph -w":

2016-07-14 14:43:16.197131 osd.26 [WRN] 17 slow requests, 16 included below; oldest blocked for > 34.766976 secs
2016-07-14 14:43:16.197138 osd.26 [WRN] slow request 32.555599 seconds old, received at 2016-07-14 14:42:43.641440: osd_op(client.75866283.0:20130084 .dir.default.75866283.65796.3 [delete] 14.122252f4 ondisk+write+known_if_redirected e18788) currently commit_sent
2016-07-14 14:43:16.197145 osd.26 [WRN] slow request 32.536551 seconds old, received at 2016-07-14 14:42:43.660487: osd_op(client.75866283.0:20130121 .dir.default.75866283.65799.6 [delete] 14.d2dc1672 ondisk+write+known_if_redirected e18788) currently commit_sent
2016-07-14 14:43:16.197153 osd.26 [WRN] slow request 30.971549 seconds old, received at 2016-07-14 14:42:45.225490: osd_op(client.75866283.0:20132345 gc.12 [call rgw.gc_set_entry] 12.a45046b8 ack+ondisk+write+known_if_redirected e18788) currently waiting for rw locks
2016-07-14 14:43:16.197158 osd.26 [WRN] slow request 30.967568 seconds old, received at 2016-07-14 14:42:45.229471: osd_op(client.76495939.0:20147494 gc.12 [call rgw.gc_set_entry] 12.a45046b8 ack+ondisk+write+known_if_redirected e18788) currently waiting for rw locks
2016-07-14 14:43:16.197162 osd.26 [WRN] slow request 32.253169 seconds old, received at 2016-07-14 14:42:43.943870: osd_op(client.75866283.0:20130663 .dir.default.75866283.65805.7 [delete] 14.2b5a1672 ondisk+write+known_if_redirected e18788) currently commit_sent
2016-07-14 14:43:17.197429 osd.26 [WRN] 3 slow requests, 2 included below; oldest blocked for > 31.967882 secs
2016-07-14 14:43:17.197434 osd.26 [WRN] slow request 31.579897 seconds old, received at 2016-07-14 14:42:45.617456: osd_op(client.76495939.0:20147877 gc.12 [call rgw.gc_set_entry] 12.a45046b8 ack+ondisk+write+known_if_redirected e18788) currently waiting for rw locks
2016-07-14 14:43:17.197439 osd.26 [WRN] slow request 30.897873 seconds old, received at 2016-07-14 14:42:46.299480: osd_op(client.76495939.0:20148668 gc.12 [call rgw.gc_set_entry] 12.a45046b8 ack+ondisk+write+known_if_redirected e18788) currently waiting for rw locks

Regards
--
Jarosław Owsiewski