Re: [ceph-users] blocked ops

2016-08-12 Thread Roeland Mertens
% of that data kind regards, Roeland On 12/08/16 08:10, Brad Hubbard wrote: On Fri, Aug 12, 2016 at 07:47:54AM +0100, roeland mertens wrote: Hi Brad, thank you for that. Unfortunately our immediate concern is the blocked ops rather than the broken pg (we know why it's broken). OK, if you

Re: [ceph-users] blocked ops

2016-08-11 Thread roeland mertens
ere hosting the broken pg. On 12 August 2016 at 04:12, Brad Hubbard wrote: > On Thu, Aug 11, 2016 at 11:33:29PM +0100, Roeland Mertens wrote: > > Hi, > > > > I was hoping someone on this list may be able to help? > > > > We're running a 35 node 10.2.1 cluster

[ceph-users] blocked ops

2016-08-11 Thread Roeland Mertens
Hi, I was hoping someone on this list may be able to help? We're running a 35 node 10.2.1 cluster with 595 OSDs. For the last 12 hours we've been plagued with blocked requests, which completely kill the performance of the cluster # ceph health detail HEALTH_ERR 1 pgs are stuck inactive for m

[ceph-users] OSD crashes on EC recovery

2016-08-10 Thread Roeland Mertens
Hi, we run a Ceph 10.2.1 cluster across 35 nodes with a total of 595 OSDs. We have a mixture of normally replicated volumes and EC volumes using the following erasure-code-profile: # ceph osd erasure-code-profile get rsk8m5 jerasure-per-chunk-alignment=false k=8 m=5 plugin=jerasure ruleset-fa