[ceph-users] Is there a way to throttle faster osds due to slow ops?

2024-09-30 Thread Szabo, Istvan (Agoda)
Hi, we have extended our clusters with some new nodes, and currently it is impossible to remove the NVMe drive holding the index pool from any old node without generating slow ops and cluster performance degradation. Currently how I want to remove it, in Quincy non-cephadm, c…
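A minimal sketch of one gentle way to do such a removal (not necessarily the poster's plan), assuming a Quincy cluster with the default mClock scheduler and a placeholder OSD id osd.42:

    # Throttle backfill so client I/O keeps priority; under Quincy's
    # mClock scheduler this override must be enabled before
    # osd_max_backfills takes effect
    ceph config set osd osd_mclock_override_recovery_settings true
    ceph config set osd osd_max_backfills 1

    # Drain the OSD in steps instead of marking it out all at once
    # (osd.42 is a placeholder; lower the weight gradually)
    ceph osd crush reweight osd.42 0.5
    # ...wait for backfill to settle, then:
    ceph osd crush reweight osd.42 0.0
    # then mark out and purge as usual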

[ceph-users] Re: Using XFS and LVM backends together on the same cluster and hosts

2024-09-30 Thread Anthony D'Atri
BlueStore vs Filestore doesn’t matter beyond each OSD. Filestore is very deprecated, so you’ll want to redeploy any Filestore OSDs when you can; `ceph osd metadata` can survey them. I’ve had multiple issues over time with the MG spinners, FWIW. For what SAS spinners cost, with some effort you ca…
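A minimal sketch of the survey alluded to, assuming jq is installed; each OSD's metadata carries an osd_objectstore field reporting "bluestore" or "filestore":

    # List each OSD id with its object store backend
    ceph osd metadata | jq -r '.[] | "osd.\(.id) \(.osd_objectstore)"' | sort -V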

[ceph-users] Using XFS and LVM backends together on the same cluster and hosts

2024-09-30 Thread Özkan Göksu
Hello folks! I hope you are doing well :) I have a general question about XFS- and LVM-backend OSD performance and the possible effects if they are used together in the same pool. I built a cluster 5 years ago with Nautilus and used the XFS backend for the OSDs. After 5 years they reached back out to me with…

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Anthony D'Atri
My point is that you may have more 10-30s delays that aren’t surfaced.

> On Sep 30, 2024, at 10:17 AM, Tim Sauerbein wrote:
>
> Thanks for the replies everyone!
>
>> On 30 Sep 2024, at 13:10, Anthony D'Atri wrote:
>>
>> Remember that slow ops are a tip-of-the-iceberg thing, you only see on…

[ceph-users] Membership additions/removals from the Ceph Steering Committee

2024-09-30 Thread Patrick Donnelly
Greetings, The Ceph Steering Committee [1] (CSC) -- formerly the "CLT" or Ceph Leadership Team -- was formed 3 years ago with the adoption of our new governance model [2]. The CSC is responsible [4] for electing the Ceph Executive Council [3] (CEC), amending the governance model, and deciding on t…

[ceph-users] Ceph Steering Committee (a.k.a. CLT) Meeting Minutes 2024-09-30

2024-09-30 Thread Patrick Donnelly
Hello, Today we only discussed the upcoming election [5].
- A test election was conducted using Helios: [3]
- A date for the election was discussed. The eventual consensus was that we should hold the election offline and meet during our normal call for any points of discussion.
- The executive co…

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Tim Sauerbein
Thanks for the replies everyone!

> On 30 Sep 2024, at 13:10, Anthony D'Atri wrote:
>
> Remember that slow ops are a tip-of-the-iceberg thing, you only see ones that crest above 30s

So far, metrics of the hosted VMs show no other I/O slowdown except when these hiccups occur.

> On 30 Sep 2024…

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Alexander Schreiber
On Mon, Sep 30, 2024 at 11:04:30AM +0100, Tim Sauerbein wrote:
>> On 30 Sep 2024, at 06:23, Joachim Kraftmayer wrote:
>>
>> do you see the behaviour across all devices or does it only affect one type/manufacturer?
>
> All devices are affected equally, every time one or two random…

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Igor Fedotov
Hi Tim, there is no log attached to your post; you'd better share it via some other means. BTW, which log did you mean, the monitor or the OSD one? It would be nice to have logs for a couple of OSDs suffering from slow ops, preferably relevant to two different cases. Thanks, Igor. On 9/29/2024 3…
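A minimal sketch of one way to capture the OSD-side evidence being asked for; osd.12 is a placeholder id, and the `ceph daemon` call must run on the host where that OSD lives:

    # Raise OSD logging while the slow ops are reproduced
    ceph tell osd.12 config set debug_osd 10
    # Dump recent slow ops recorded by that OSD (run on its host)
    ceph daemon osd.12 dump_historic_slow_ops
    # Restore the default verbosity afterwards
    ceph tell osd.12 config set debug_osd 1/5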

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Anthony D'Atri
Remember that slow ops are a tip-of-the-iceberg thing, you only see ones that crest above 30s.

> On Sep 30, 2024, at 6:06 AM, Tim Sauerbein wrote:
>
>> On 30 Sep 2024, at 06:23, Joachim Kraftmayer wrote:
>>
>> do you see the behaviour across all devices or does it only affect one t…
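For context, the 30s threshold referred to is the default of osd_op_complaint_time; a hedged example of temporarily lowering it to surface more of the iceberg, at the cost of noisier health warnings:

    # Complain about ops slower than 10s instead of the default 30s
    ceph config set osd osd_op_complaint_time 10
    # Revert once the investigation is done
    ceph config set osd osd_op_complaint_time 30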

[ceph-users] Re: SLOW_OPS problems

2024-09-30 Thread Tim Sauerbein
> On 30 Sep 2024, at 06:23, Joachim Kraftmayer wrote:
>
> do you see the behaviour across all devices or does it only affect one type/manufacturer?

All devices are affected equally; every time, one or two random OSDs report slow ops. So I don't think the SSDs are to blame. Thanks, Tim

[ceph-users] Dashboard: frequent queries for balancer status

2024-09-30 Thread Eugen Block
Hi, I just noticed across different Ceph versions that when browsing the dashboard, the MGR is logging lots of prometheus queries for the balancer status:

Sep 30 11:15:55 host2 ceph-mgr[3993215]: log_channel(cluster) log [DBG] : pgmap v25341: 381 pgs: 381 active+clean; 3.9 GiB data, 69 Gi…
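A minimal sketch for reference: the dashboard is effectively polling the balancer module, and the same status can be fetched by hand; whether the cluster log file level helps with the DBG chatter is an assumption, not something confirmed in the thread:

    # What the dashboard keeps asking for, run manually:
    ceph balancer status
    # Assumption: raising the cluster log file threshold from debug
    # to info keeps the pgmap DBG lines out of ceph.log
    ceph config set mon mon_cluster_log_file_level info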