[ceph-users] Re: Ceph Health error right after starting balancer

2019-11-01 Thread Paul Emmerich
Looks like you didn't tell the whole story; please post the *full* output of ceph -s and ceph osd df tree. Wild guess: you need to increase "mon max pg per osd". Paul -- Paul Emmerich Looking for help with your Ceph cluster? Contact us at https://croit.io croit GmbH Freseniusstr. 31h 81247 Mün
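
A minimal sketch of the change Paul hints at, assuming a Luminous-or-newer cluster where the option is spelled mon_max_pg_per_osd; the value 400 is only an illustrative number, not a recommendation from the thread:

    # check the per-OSD PG limit currently enforced by the monitors
    ceph config get mon mon_max_pg_per_osd
    # raise it (400 is an arbitrary example; choose a value that fits your PG budget)
    ceph config set mon mon_max_pg_per_osd 400
    # re-check cluster health and PG distribution afterwards
    ceph -s
    ceph osd df tree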

[ceph-users] Re: V/v Multiple pool for data in Ceph object

2019-11-01 Thread tuan dung
Ok, thanks. Best regards, -- Dương Tuấn Dũng Email: dungdt.aicgr...@gmail.com Tel: 0986153686 On Wed, Oct 30, 2019 at 1:51 PM Konstantin Shalygin wrote: > On 10/29/19 3:45 PM, tuan dung wrote: > > I have a cluster running Ceph object storage, version 14.2.1. I want

[ceph-users] Re: subtrees have overcommitted (target_size_bytes / target_size_ratio)

2019-11-01 Thread Lars Täuber
Is there anybody who can explain the overcommitment calculation? Thanks Mon, 28 Oct 2019 11:24:54 +0100 Lars Täuber ==> ceph-users : > Is there a way to get rid of these warnings with the autoscaler activated, besides > adding new osds? > > Yet I couldn't get a satisfactory answer to the question w
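
For context, a sketch of how the autoscaler targets behind this warning can be inspected and adjusted; the pool name "cephfs_data" and the values are placeholders, not taken from the thread:

    # show the autoscaler's view: rate, raw capacity, target size/ratio per pool
    ceph osd pool autoscale-status
    # the warning fires when the pools' targets (times replication) add up to more
    # than the capacity of the subtree they map to; shrink the targets or add OSDs
    ceph osd pool set cephfs_data target_size_ratio 0.2
    # setting target_size_bytes to 0 removes a byte-based target again
    ceph osd pool set cephfs_data target_size_bytes 0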

[ceph-users] Re: subtrees have overcommitted (target_size_bytes / target_size_ratio)

2019-11-01 Thread Sage Weil
This was fixed a few weeks back. It should be resolved in 14.2.5. https://tracker.ceph.com/issues/41567 https://github.com/ceph/ceph/pull/31100 sage On Fri, 1 Nov 2019, Lars Täuber wrote: > Is there anybody who can explain the overcommitment calculation? > > Thanks > > > Mon, 28 Oct 2019 11
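
A quick check, once 14.2.5 is out, of whether the running daemons already carry the fix (a sketch; at the time of this thread 14.2.5 was not yet released):

    # show which release each daemon type is running; the spurious warning
    # should go away once the mons/mgrs report 14.2.5 or later
    ceph versions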

[ceph-users] Re: subtrees have overcommitted (target_size_bytes / target_size_ratio)

2019-11-01 Thread Lars Täuber
Thanks a lot! Lars Fri, 1 Nov 2019 13:03:25 + (UTC) Sage Weil ==> Lars Täuber : > This was fixed a few weeks back. It should be resolved in 14.2.5. > > https://tracker.ceph.com/issues/41567 > https://github.com/ceph/ceph/pull/31100 > > sage > > > On Fri, 1 Nov 2019, Lars Täuber wrote:

[ceph-users] Re: Ceph Health error right after starting balancer

2019-11-01 Thread Thomas
Hi Paul, the situation has changed in the meantime. However, I can reproduce a similar behaviour. This means: - I disable the balancer (ceph balancer off) - and then start reweighting a specific OSD (ceph osd reweight 134 1.0). The cluster immediately reports slow requests. root@ld3955:~# ceph
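
The reproduction Thomas describes, written out as a sketch; OSD id 134 and the weight come from his mail, the two health commands are generic follow-ups and not part of it:

    # disable the balancer so it cannot interfere
    ceph balancer off
    # reweight a single OSD back to full weight; this starts backfill/recovery
    ceph osd reweight 134 1.0
    # watch for slow requests and the PGs behind them
    ceph -s
    ceph health detail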

[ceph-users] mgr daemons becoming unresponsive

2019-11-01 Thread Oliver Freyermuth
Dear Cephers, this is a 14.2.4 cluster with device health metrics enabled - for about a day now, all mgr daemons have been going "silent" on me after a few hours, i.e. "ceph -s" shows: cluster: id: 269cf2b2-7e7c-4ceb-bd1b-a33d915ceee9 health: HEALTH_WARN no active mgr 1/3
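
A few generic checks for a "no active mgr" situation (a sketch, not taken from Oliver's mail; the hostname expansion is a placeholder for the mgr's id):

    # which mgr is (or was) active, and which standbys the mons know about
    ceph mgr dump
    # restart the silent daemon on its host so it re-registers with the mons
    systemctl restart ceph-mgr@$(hostname -s)
    # confirm an active mgr shows up again
    ceph -s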

[ceph-users] Weird blocked OP issue.

2019-11-01 Thread Robert LeBlanc
We had an OSD host with 13 OSDs fail today and we have a weird blocked OP message that I can't understand. There are no OSDs with blocked ops, just `mon` (multiple times), and some of the rgw instances. cluster: id: 570bcdbb-9fdf-406f-9079-b0181025f8d0 health: HEALTH_WARN 1
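
A sketch of how slow/blocked ops can be located when they are attributed to mons rather than OSDs (the mon id via the short hostname is a placeholder):

    # health detail names the daemons currently reporting slow ops
    ceph health detail
    # on the monitor host, list the ops that mon is still tracking
    ceph daemon mon.$(hostname -s) ops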

[ceph-users] Re: mgr daemons becoming unresponsive

2019-11-01 Thread Oliver Freyermuth
Dear Cephers, interestingly, after "ceph device monitoring off" the mgrs seem to be stable now - the active one still went silent a few minutes later, but the standby took over and was stable, and after restarting the broken one it has now been stable for an hour, too, so probably a restart of the mgr is
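
The workaround described above, as a sketch; "ceph device monitoring off" is the command from the mail, the restart line assumes systemd-managed daemons and a mgr id equal to the short hostname:

    # stop the devicehealth polling that appears to wedge the mgr
    ceph device monitoring off
    # restart any mgr that has already gone silent
    systemctl restart ceph-mgr@$(hostname -s)
    # verify an active mgr is reported again
    ceph -s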

[ceph-users] Re: mgr daemons becoming unresponsive

2019-11-01 Thread Sage Weil
On Sat, 2 Nov 2019, Oliver Freyermuth wrote: > Dear Cephers, > > interestingly, after: > ceph device monitoring off > the mgrs seem to be stable now - the active one still went silent a few > minutes later, > but the standby took over and was stable, and restarting the broken one, it's > now st

[ceph-users] Re: Weird blocked OP issue.

2019-11-01 Thread Robert LeBlanc
On Fri, Nov 1, 2019 at 6:10 PM Robert LeBlanc wrote: > > We had an OSD host with 13 OSDs fail today and we have a weird blocked > OP message that I can't understand. There are no OSDs with blocked > ops, just `mon` (multiple times), and some of the rgw instances. > > cluster: >id: 570bcd