[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-27 Thread Anthony D'Atri
With older releases, Michael Kidd’s log parser scripts were invaluable, notably map_reporters_to_buckets.sh https://github.com/linuxkidd/ceph-log-parsers With newer releases, at least, one can send `dump_blocked_ops` to the OSD admin socket. I collect these via Prometheus / node_exporter, it’
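As a rough illustration of collecting `dump_blocked_ops` output for Prometheus / node_exporter, the sketch below turns one OSD's JSON dump into a single text-format sample. It is only a sketch under stated assumptions: the metric name `ceph_osd_blocked_ops` and the helper `blocked_ops_metric` are hypothetical, and the `num_blocked_ops` and `ops` keys are assumed to be present in the dump; this is not necessarily how the poster collects the data.

```python
import json


def blocked_ops_metric(osd_id: int, dump_json: str) -> str:
    """Render one Prometheus text-format sample from a dump_blocked_ops JSON dump."""
    data = json.loads(dump_json)
    # Assumption: the dump has a "num_blocked_ops" count and/or an "ops" list.
    # Prefer the explicit count; fall back to the length of the list.
    count = data.get("num_blocked_ops", len(data.get("ops", [])))
    # Hypothetical metric name; doubled braces emit literal { } in an f-string.
    return f'ceph_osd_blocked_ops{{osd="{osd_id}"}} {count}'
```

The input could come from `ceph daemon osd.<id> dump_blocked_ops` run on the OSD host, with the resulting line written to a file that node_exporter's textfile collector reads.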

[ceph-users] Re: 15.2.8 mgr keep crashing every few days

2021-02-27 Thread levin ng
Hi Sebastian, thanks for your suggestion. I tried turning off prometheus as the post suggested, but the mgr still crashes every few days. I then looked into the mgr container and found lots of defunct ssh processes appearing every few minutes; it seems related to remote exec issues. Tried to turn on mgr debug and look

[ceph-users] Re: Slow cluster / misplaced objects - Ceph 15.2.9

2021-02-27 Thread David Orman
Excellent, that's a great start. We do use prometheus/grafana already, and we are collecting the data - so we'll make sure we add alertmanager coverage. I was looking at dump_historic_ops but it wasn't really showing _what_ the cause was; we'd see operations that took a longer period of time (such

[ceph-users] Getting started with cephadm

2021-02-27 Thread Peter Childs
I'm new to ceph, and I've been trying to set up a new cluster with 16 computers with 30 disks each and 6 SSDs (plus boot disks), 256G of memory, IB networking. (OK, it's currently 15, but never mind.) When I take them over about 10 OSDs each they start having problems starting the OSD up and I can nor

[ceph-users] Re: Getting started with cephadm

2021-02-27 Thread David Orman
Podman is fine (preferably 3.0+). What were those variables set to before? With most recent distributions and kernels we've not noticed a problem with the defaults. Did you notice errors that led to you changing them? We have many clusters of 21 nodes, 24 HDDs each, multiple NVMEs serving as WAL/D

[ceph-users] Re: 15.2.8 mgr keep crashing every few days

2021-02-27 Thread David Orman
This is fixed with 15.2.9 and the patch which was merged into that release to fix the threading issue, coupled with an updated cheroot release now in the docker image. We're running 15.2.9 with no issue, now! On Thu, Feb 11, 2021 at 4:00 AM Sebastian Luna Valero wrote: > > Hi, > > The following th

[ceph-users] Re: [Suspicious newsletter] RGW: Multiple Site does not sync olds data

2021-02-27 Thread 特木勒
Hi Istvan: Thanks for your reply. Does directional sync solve the problem? I tried to run `radosgw-admin sync init`, but it still did not work. :( Thanks On Fri, Feb 26, 2021 at 7:47 AM, Szabo, Istvan (Agoda) wrote: > Same for me, 15.2.8 also. > I’m trying directional sync now, looks like symmetrical has iss