Hello,
Can you provide some further details? What is the size of your
objects, and how many objects do you have in your buckets? Are you using
bucket index sharding, or are you sharding your objects across multiple
buckets? Is the cluster doing any scrubbing during these periods?
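For what it's worth, something like the following should answer most of
those questions (just a sketch; substitute your own bucket names, and
radosgw-admin output varies a bit between versions):

    # list buckets, then check per-bucket object counts and sizes
    radosgw-admin bucket list
    radosgw-admin bucket stats --bucket=<your-bucket>   # look at usage / num_objects

    # check whether any PGs are scrubbing or deep-scrubbing right now
    ceph pg dump pgs_brief | grep -i scrub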
It sounds like you may be having trouble with your rgw bucket index. In our
cluster (much smaller than yours, mind you) it was necessary to put the
rgw bucket index onto its own set of OSDs to isolate it from the rest
of the cluster I/O. We are still using single-object bucket indexes but
plan to move to sharded bucket indexes eventually.
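If you end up going the same route, the rough shape of it looks like the
following (a sketch only: I'm assuming a CRUSH root named 'indexroot'
that contains the dedicated OSDs, the shard count is only an example,
and note that rgw_override_bucket_index_max_shards only applies to
newly created buckets):

    # create a rule targeting the dedicated hosts and point the index pool at it
    ceph osd crush rule create-simple index-rule indexroot host
    ceph osd crush rule dump index-rule              # note the ruleset number
    ceph osd pool set .rgw.buckets.index crush_ruleset <ruleset>

    # in ceph.conf on the RGW nodes, to shard the index of new buckets
    [client.radosgw.<instance>]
    rgw override bucket index max shards = 16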
You should determine which OSDs your bucket indexes are located on and
see if a pattern emerges with the OSDs that have slow requests during
these periods. You can use the command 'ceph pg ls-by-pool
.rgw.buckets.index' to show which PGs/OSDs the bucket index resides on;
see the example below.
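For example (the log path is just where the cluster log usually lives
on the mons; adjust as needed):

    # list the PGs in the index pool and the OSDs in each acting set
    ceph pg ls-by-pool .rgw.buckets.index

    # tally slow-request warnings per OSD from the cluster log and compare
    grep 'slow request' /var/log/ceph/ceph.log | grep -oE 'osd\.[0-9]+' | sort | uniq -c | sort -rn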
- Daniel
On 11/27/2015 10:24 PM, Brian Felton wrote:
Greetings Ceph Community,
We are running a Hammer cluster (0.94.3-1) in production that recently
experienced a sudden and severe performance degradation. We've been migrating
data from an older non-Ceph cluster at a fairly steady pace for the
past eight weeks (about 5TB a week). Overnight, the ingress rate
dropped by 95%. Upon investigation, we found we were receiving
hundreds of thousands of 'slow request' warnings.
The cluster is being used as an S3-compliant object storage solution.
What has been extremely problematic is that all cluster writes are
being blocked simultaneously. When something goes wrong, we've
observed our transfer jobs (6-8 jobs, running across 4 servers) all
block on writes at the same time for 10-60 seconds, then release and
continue in unison. The blocks occur very frequently (at least
once a minute after the previous block has cleared).
Our setup is as follows:
- 5 monitor nodes (VMs: 2 vCPU, 4GB RAM, Ubuntu 14.04.3, kernel
3.13.0-48)
- 2 RGW nodes (VMs: 2 vCPU, 4GB RAM, Ubuntu 14.04.3, kernel 3.13.0-48)
- 9 Storage nodes (Supermicro server: 32 CPU, 256GB RAM, Ubuntu
14.04.3, kernel 3.13.0-46)
Each storage server contains 72 6TB SATA drives for Ceph (648 OSDs,
~3.5PB in total). Each disk is set up as its own ZFS zpool. Each OSD
has a 10GB journal, located within the disk's zpool.
Other information that might be pertinent:
- All servers (and VMs) use NTP to sync clocks.
- The cluster uses k=7, m=2 erasure coding.
- Each storage server has 6 10Gbps ports, with 2 bonded for front-end
traffic and 4 bonded for back-end traffic.
- Ingress and egress traffic is typically a few MB/sec at most, and
we've stress-tested the cluster at levels at least 100x what we normally see.
- We pushed a few hundred TB into the cluster during burn-in
without issue.
Given the global nature of the failure, we initially suspected
networking issues. After a solid day of investigation, we were unable
to find any reason to suspect the network (no dropped packets on FE or
BE networks, no ping loss, no switch issues, reasonable iperf tests,
etc.). We next examined the storage nodes but found no failures
of any kind (nothing in the system/kernel logs, no ZFS errors, and
normal iostat/atop output).
We've also attempted the following, with no success:
- Rolling restart of the storage nodes
- Rolling restart of the mon nodes
- Complete shutdown/restart of all mon nodes
- Expansion of RGW capacity from 2 servers to 5
- Uncontrollable sobbing
Nothing about the cluster has changed recently -- no OS patches, no
Ceph patches, no software updates of any kind. For the months we've
had the cluster operational, we've had no performance-related issues.
In the days leading up to the major performance issue we're now
experiencing, the logs recorded 100 or so 'slow request' events of
>30 seconds on consecutive days. After that, the slow requests became
constant, and now our logs are spammed with entries like the following:
2015-11-28 02:30:07.328347 osd.116 192.168.10.10:6832/1689576 1115 : cluster [WRN] 2 slow requests, 1 included below; oldest blocked for > 60.024165 secs
2015-11-28 02:30:07.328358 osd.116 192.168.10.10:6832/1689576 1116 : cluster [WRN] slow request 60.024165 seconds old, received at 2015-11-28 02:29:07.304113: osd_op(client.214858.0:6990585 default.184914.126_2d29cad4962d3ac08bb7c3153188d23f [create 0~0 [excl],setxattr user.rgw.idtag (22),writefull 0~523488,setxattr user.rgw.manifest (444),setxattr user.rgw.acl (371),setxattr user.rgw.content_type (1),setxattr user.rgw.etag (33)] 48.158d9795 ondisk+write+known_if_redirected e15933) currently commit_sent
We've analyzed the logs on the monitor nodes (ceph.log and
ceph-mon.<id>.log), and there doesn't appear to be a smoking gun. The
'slow request' events are spread fairly evenly across all 648 OSDs.
A 'ceph health detail' typically shows something like the following:
HEALTH_WARN 41 requests are blocked > 32 sec; 14 osds have slow requests
3 ops are blocked > 65.536 sec
38 ops are blocked > 32.768 sec
1 ops are blocked > 65.536 sec on osd.83
1 ops are blocked > 65.536 sec on osd.92
4 ops are blocked > 32.768 sec on osd.117
1 ops are blocked > 32.768 sec on osd.159
2 ops are blocked > 32.768 sec on osd.186
1 ops are blocked > 32.768 sec on osd.205
10 ops are blocked > 32.768 sec on osd.245
1 ops are blocked > 65.536 sec on osd.265
1 ops are blocked > 32.768 sec on osd.393
2 ops are blocked > 32.768 sec on osd.415
10 ops are blocked > 32.768 sec on osd.436
1 ops are blocked > 32.768 sec on osd.467
5 ops are blocked > 32.768 sec on osd.505
1 ops are blocked > 32.768 sec on osd.619
14 osds have slow requests
We have rarely seen requests eclipse the 120s warning threshold. The
vast majority show > 30 seconds, with a few running longer than 60
seconds. The cluster will return to a HEALTH_OK status periodically,
especially when under light/no load.
At this point, we've pushed about 42TB into the cluster, so we're
still under 1.5% utilization. The performance degradation was
immediate and severe, and it has been ongoing for several days now. I
am looking for any guidance on how to further diagnose or
resolve the issue. I have reviewed several similar threads on this
list, but the proposed solutions were either not applicable to our
situation or did not work.
Please let me know what other information I can provide or what I can
do to gather additional information.
Many thanks,
Brian
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com