[ceph-users] Cluster does not report which objects are unfound for stuck PG

2017-09-10 Thread Nikos Kormpakis
Hello people, after a series of events and some operational mistakes, 1 PG in our cluster is in active+recovering+degraded+remapped state, reporting 1 unfound object. We're running Hammer (v0.94.9) on top of Debian Jessie, on 27 nodes and 162 OSDs with the default crushmap and nodeep-scrub fla

Re: [ceph-users] radosgw fails with "ERROR: failed to initialize watch: (34) Numerical result out of range"

2018-01-16 Thread Nikos Kormpakis
>>>> (cf0baba3b47f9427c6c97e2144b094b7e5ba) luminous (stable), process >>>> (unknown), pid 13928 >>>> 2018-01-14 21:30:57.556672 7f44ddd18e00 -1 ERROR: failed to initialize >>>> watch: (34) Numerical result out of range >>>> 2018-01-14 2

[ceph-users] Open-sourcing GRNET's Ceph-related tooling

2018-05-09 Thread Nikos Kormpakis
Hello, I'm happy to announce that GRNET [1] is open-sourcing its Ceph-related tooling on GitHub [2]. This repo includes multiple monitoring health checks compatible with Luminous and tooling to quickly deploy our new Ceph clusters based on Luminous, ceph-volume lvm and BlueStore. We hope t

[ceph-users] Inconsistent PG automatically got "repaired" automatically?

2018-05-09 Thread Nikos Kormpakis
me object can be stored on different blocks if recreated? Unfortunately, I'm not familiar with its internals. 4) Is there any reason why slow requests appeared? Can we correlate these requests somehow with our problem? This behavior looks very confusing at first sight an

Re: [ceph-users] Inconsistent PG automatically got "repaired"?

2018-05-11 Thread Nikos Kormpakis
On 2018-05-10 00:39, Gregory Farnum wrote: On Wed, May 9, 2018 at 8:21 AM Nikos Kormpakis wrote: 1) After how much time does RADOS try to read from a secondary replica? Is this timeout configurable? 2) If a primary shard is missing, does Ceph try to recreate it somehow automatically? 3) If