[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Follow-up on the tell hanging: iterating over all OSDs and trying to raise max-backfills leaves hanging ceph tell processes like this: root 1007846 15.3 1.2 918388 50972 pts/5 Sl 00:03 0:48 /usr/bin/python3 /usr/bin/ceph tell osd.4 injectargs --osd-max-backfill root 1007890

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
So the same problem happens with pgs that are in the "unknown" state: [19:31:08] black2.place6:~# ceph pg 2.5b2 query | tee query_2.5b2 hangs until the pg actually becomes active again. I assume that this should not be the case, should it? Nico Schottelius writes: > Update to the update: curre

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Stefan Kooman
Hi, > However as soon as we issue either of the above tell commands, it just > hangs. Furthermore when ceph tell hangs, pg are also becoming stuck in > "Activating" and "Peering" states. > > It seems to be related, as soon as we stop ceph tell (ctrl-c it), a few > minutes later the pgs are peered

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Update to the update: currently debugging why pgs are stuck in the peering state: [18:57:49] black2.place6:~# ceph pg dump all | grep 2.7d1 dumped all 2.7d1 1 00 0 0 69698617344 0 0 3002 3002

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
I have now started to iterate over all OSDs in the tree, and some of them are completely unresponsive: [18:27:18] black1.place6:~# for osd in $(ceph osd tree | grep osd. | awk '{ print $4 }'); do echo $osd; ceph tell $osd injectargs '--osd-max-backfills 1'; done osd.20 osd.56 osd.62 osd.63 ^CT
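
Because the loop above blocks on the first unresponsive OSD until it is interrupted, one possible variant wraps each call in coreutils `timeout` so the iteration can move on and report which OSDs did not answer. This is only a sketch and not something from the thread: the helper name `tell_osd` and the variables `CEPH_BIN` and `TELL_TIMEOUT` are hypothetical, and the 10-second limit is an arbitrary assumption.

```shell
#!/bin/sh
# Sketch only: CEPH_BIN and TELL_TIMEOUT are illustrative names, not Ceph settings.
CEPH_BIN="${CEPH_BIN:-ceph}"          # path to the ceph CLI
TELL_TIMEOUT="${TELL_TIMEOUT:-10}"    # seconds to wait per OSD (assumed value)

# tell_osd OSD ARGS...
# Runs "ceph tell OSD ARGS..." and gives up after TELL_TIMEOUT seconds.
# Returns 0 if the command succeeded, 1 if it failed or timed out.
tell_osd() {
    osd="$1"
    shift
    if timeout "$TELL_TIMEOUT" "$CEPH_BIN" tell "$osd" "$@"; then
        return 0
    fi
    echo "$osd: no reply within ${TELL_TIMEOUT}s, or the command failed" >&2
    return 1
}

# Example loop (commented out so that sourcing this file has no side effects):
# for osd in $(ceph osd tree | grep osd. | awk '{ print $4 }'); do
#     tell_osd "$osd" injectargs '--osd-max-backfills 1'
# done
```

The OSDs reported on stderr are the ones worth inspecting first. Note that a timeout only stops the client from waiting; it does not explain or fix why the OSD is unresponsive.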

[ceph-users] Re: [nautilus] ceph tell hanging

2020-09-22 Thread Nico Schottelius
Hello Stefan, Stefan Kooman writes: > Hi, > >> However as soon as we issue either of the above tell commands, it just >> hangs. Furthermore when ceph tell hangs, pg are also becoming stuck in >> "Activating" and "Peering" states. >> >> It seems to be related, as soon as we stop ceph tell (ctrl