[ceph-users] Podman pull error 'access denied'

2021-06-17 Thread Samy Ascha
Hi! I have a problem after starting an upgrade from 15.2.13 to 16.2.4. I started the upgrade and it successfully redeployed 2 out of 3 mgr daemon containers. The third failed to upgrade and cephadm started retrying the upgrade forever. The only way I could stop this was to disable the cephad
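
For anyone who ends up in the same retry loop, the orchestrator can usually be told to abandon an in-flight upgrade from the CLI (a rough sketch; the version shown is just the target mentioned in this thread):

  ceph orch upgrade status                      # what the orchestrator is currently trying to do
  ceph orch upgrade stop                        # abandon the stuck/retrying upgrade
  ceph orch upgrade start --ceph-version 16.2.4 # resume later against an explicit target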

[ceph-users] Ceph Managers dieing?

2021-06-17 Thread Peter Childs
Let's try to stop this message turning into a mass moaning session about Ceph and try to get this newbie able to use it. I've got a Ceph Octopus cluster; it's relatively new and deployed using cephadm. It was working fine, but now the managers start up, run for about 30 seconds and then die, until
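
A quick way to watch what the mgr daemons are doing on a cephadm cluster (a sketch only; <fsid> and the daemon name are placeholders for your own cluster's values):

  ceph mgr stat                                  # which mgr is currently active
  ceph orch ps --daemon-type mgr                 # state of the deployed mgr containers
  journalctl -fu ceph-<fsid>@mgr.<daemon-name>   # follow one mgr's systemd unit on its host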

[ceph-users] Pulling Ceph Data Into Grafana

2021-06-17 Thread Alcatraz
Hello All, I recently installed Ceph (v16.2.4, Pacific stable). I know Ceph creates and exposes two Prometheus instances (from what I've witnessed). To that end, I installed Grafana in a Docker container and am attempting to pull metrics from Ceph (Cluster Health, OSD information, etc.), but
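
One of the metric sources Ceph exposes is the mgr "prometheus" module; assuming the default exporter port 9283, a minimal sanity check before wiring up Grafana could look like this (the host name is a placeholder):

  ceph mgr module enable prometheus                       # enable the exporter if it is not already on
  ceph mgr services                                       # shows the exporter URL on the active mgr
  curl -s http://<active-mgr-host>:9283/metrics | head    # raw metrics Prometheus would scrape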

[ceph-users] Re: Ceph Managers dieing?

2021-06-17 Thread Eugen Block
Hi, don't give up on Ceph. ;-) Did you try any of the steps from the troubleshooting section [1] to gather some events and logs? Could you share them, and maybe also some more details about that cluster? Did you enable any non-default mgr modules? There have been a couple reports related t
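
As a rough sketch of the sort of information being asked for here (the crash id and daemon name are placeholders):

  ceph mgr module ls                      # which non-default mgr modules are enabled
  ceph crash ls-new                       # any unacknowledged crash reports
  ceph crash info <crash-id>              # backtrace for a specific crash
  ceph log last 50                        # recent cluster log entries
  cephadm logs --name mgr.<daemon-name>   # container log on the host running that mgr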

[ceph-users] Re: Ceph Managers dieing?

2021-06-17 Thread Peter Childs
Found the issue in the end. I'd managed to kill the autoscaling features by playing with pgp_num and pg_num, and it was getting confusing. I fixed it by reducing pg_num on some of my test pools and the manager woke up and started working again. It was not clear as to what I'd done to kil
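
For reference, the autoscaler's view and the per-pool values can be inspected and adjusted like this (a sketch; the pool name and pg counts are placeholders):

  ceph osd pool autoscale-status                  # what the autoscaler wants to do per pool
  ceph osd pool get <pool> pg_num
  ceph osd pool get <pool> pgp_num
  ceph osd pool set <pool> pg_num 64              # step back down one power of two at a time
  ceph osd pool set <pool> pg_autoscale_mode on   # re-enable autoscaling on that pool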

[ceph-users] Re: radosgw - Etags suffixed with #x0e

2021-06-17 Thread André Cruz
Hello Ingo. Did the problem actually go away after you upgraded everything to Nautilus? I’m seeing the same issue in a Luminous cluster where a Nautilus node was introduced (with the intent of upgrading the whole cluster to Nautilus). When the problem happened we had: Mons, Mgr - Nautilus OS
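
When chasing behaviour that only shows up on mixed-version clusters, it helps to confirm exactly which daemons are on which release and to look at the ETag a client actually receives (a sketch; the URL assumes a publicly readable object or a presigned link):

  ceph versions                                                      # per-daemon-type version breakdown
  curl -sI http://<rgw-endpoint>/<bucket>/<object> | grep -i etag    # raw ETag header returned by radosgw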

[ceph-users] Re: Ceph Managers dieing?

2021-06-17 Thread Andrew Walker-Brown
Changing pg_num and pgp_num manually can be a useful tool. Just remember that they need to be a power of 2, and don't increase or decrease by more than a couple of steps, e.g. 64 to 128 or 256, but not straight to 1024. I had a situation where a couple of OSDs got quite full. I added more capacity but the r
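
A sketch of that kind of manual stepping, with placeholders for the pool name and target counts:

  ceph osd df tree                       # how full each OSD is before/after the change
  ceph osd pool set <pool> pg_num 128    # one power-of-two step at a time
  ceph osd pool set <pool> pgp_num 128
  ceph -s                                # watch the resulting backfill settle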

[ceph-users] Re: Ceph Managers dieing?

2021-06-17 Thread David Orman
Hi Peter, We fixed this bug: https://tracker.ceph.com/issues/47738 recently here: https://github.com/ceph/ceph/commit/b4316d257e928b3789b818054927c2e98bb3c0d6 which should hopefully be in the next release(s). David