[ceph-users] Re: Call for interest: VMWare Photon OS support in Cephadm

2024-03-15 Thread Alvaro Soto
Hi Ernesto, Do you have any kind of support matrix you want to achieve? Or what are the tests you want to run? Cheers! --- Alvaro Soto.

[ceph-users] Call for interest: VMWare Photon OS support in Cephadm

2024-03-15 Thread Ernesto Puerta
Hi Cephers, This is to sound out potential interest in supporting VMWare's Photon OS distribution with Ceph/Cephadm. Photon OS is a lightweight distribution that is optimized for deploying containers within virtual machines. It is based on systemd, includes Docker, and uses the tdnf (Tiny DNF) package manager ...
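A rough, untested sketch of what host preparation might look like on Photon OS, assuming the usual cephadm prerequisites and that the packages below exist under these names in the Photon repositories:

  tdnf install -y python3 lvm2 chrony   # cephadm host prerequisites (package names assumed)
  systemctl enable --now docker         # Photon ships Docker as its container runtime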

[ceph-users] Re: MDS_CLIENT_LATE_RELEASE, MDS_SLOW_METADATA_IO, and MDS_SLOW_REQUEST errors and slow osd_ops despite hardware being fine

2024-03-15 Thread Nathan Fish
What's the CPU and RAM usage look like for the OSDs? CPU has often been our bottleneck, with the main thread hitting 100%. On Fri, Mar 15, 2024 at 9:15 AM Ivan Clayson wrote: > > Hello everyone, > > We've been experiencing on our quincy CephFS clusters (one 17.2.6 and > another 17.2.7) repeated s
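For anyone wanting to check, per-OSD CPU and memory can be read straight from the process table on each OSD host; a minimal sketch (process names may differ slightly in containerized deployments):

  ps -o pid,pcpu,pmem,rss,cmd -C ceph-osd   # CPU% and resident memory per OSD daemon
  top -b -n 1 | grep ceph-osd               # the same view from a single top snapshot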

[ceph-users] Re: MDS_CLIENT_LATE_RELEASE, MDS_SLOW_METADATA_IO, and MDS_SLOW_REQUEST errors and slow osd_ops despite hardware being fine

2024-03-15 Thread Gregory Farnum
On Fri, Mar 15, 2024 at 6:15 AM Ivan Clayson wrote: > Hello everyone, > > We've been experiencing on our quincy CephFS clusters (one 17.2.6 and > another 17.2.7) repeated slow ops with our client kernel mounts > (Ceph 17.2.7 and version 4 Linux kernels on all clients) that seem to > originate fro

[ceph-users] Re: Cephfs error state with one bad file

2024-03-15 Thread Patrick Donnelly
Hi Sake, On Tue, Jan 2, 2024 at 4:02 AM Sake Ceph wrote: > > Hi again, hopefully for the last time with problems. > > We had an MDS crash earlier with the MDS staying in a failed state and used a > command to reset the filesystem (this was wrong, I know now, thanks Patrick > Donnelly for pointing

[ceph-users] Re: MDS subtree pinning

2024-03-15 Thread Patrick Donnelly
Hi Sake, On Fri, Dec 22, 2023 at 7:44 AM Sake Ceph wrote: > > Hi! > > As I'm reading through the documentation about subtree pinning, I was > wondering if the following is possible. > > We've got the following directory structure. > / > /app1 > /app2 > /app3 > /app4 > > Can I pin /app1 t
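For reference, static subtree pinning is done with the ceph.dir.pin extended attribute on the directory itself; a minimal sketch, assuming the filesystem is mounted at /mnt/cephfs and at least two active MDS ranks exist:

  setfattr -n ceph.dir.pin -v 0 /mnt/cephfs/app1   # pin /app1 to MDS rank 0
  setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/app2   # pin /app2 to MDS rank 1
  getfattr -n ceph.dir.pin /mnt/cephfs/app1        # verify the current pin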

[ceph-users] Re: Robust cephfs design/best practice

2024-03-15 Thread Alexander E. Patrakov
Hi Istvan, I would like to add a few notes to what Burkhard mentioned already. First, CephFS has a built-in feature that allows restricting access to a certain directory: ceph fs authorize cephfs client.public-only /public rw This creates a key with the following caps: caps mds = "allow rw pat
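For readers following along, the resulting keyring entry looks roughly like the sketch below; the exact cap strings vary between releases (newer ones add an fsname= clause):

  ceph fs authorize cephfs client.public-only /public rw
  # typical resulting caps:
  #   caps mds = "allow rw fsname=cephfs path=/public"
  #   caps mon = "allow r fsname=cephfs"
  #   caps osd = "allow rw tag cephfs data=cephfs"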

[ceph-users] MDS_CLIENT_LATE_RELEASE, MDS_SLOW_METADATA_IO, and MDS_SLOW_REQUEST errors and slow osd_ops despite hardware being fine

2024-03-15 Thread Ivan Clayson
Hello everyone, We've been experiencing on our quincy CephFS clusters (one 17.2.6 and another 17.2.7) repeated slow ops with our client kernel mounts (Ceph 17.2.7 and version 4 Linux kernels on all clients) that seem to originate from slow ops on osds despite the underlying hardware being fine ...
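For anyone debugging similar symptoms, the usual first stop is the admin socket of an OSD named in the warnings; a hedged sketch, with osd.12 as a placeholder and the commands run on that OSD's host:

  ceph health detail                          # which OSDs and MDS the warnings point at
  ceph daemon osd.12 dump_ops_in_flight       # ops currently stuck in the OSD
  ceph daemon osd.12 dump_historic_slow_ops   # recent ops that exceeded the slow threshold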

[ceph-users] Re: RGW - tracking new bucket creation and bucket usage

2024-03-15 Thread Konstantin Shalygin
Hi, > On 15 Mar 2024, at 01:07, Ondřej Kukla wrote: > > Hello, I’m looking for suggestions on how to track bucket creation over the S3 API > and bucket usage (number of objects and size) of all buckets over time. > > In our RGW setup, we have a custom client panel, where about 85% of > buckets are

[ceph-users] Re: Robust cephfs design/best practice

2024-03-15 Thread Burkhard Linke
Hi, On 15.03.24 08:57, Szabo, Istvan (Agoda) wrote: Hi, I'd like to add cephfs to our production objectstore/block storage cluster so I'd like to collect hands-on experiences, like good to know / be careful / avoid etc., other than the ceph documentation. Just some aspects that might not be obvious ...

[ceph-users] Re: CephFS space usage

2024-03-15 Thread Igor Fedotov
Hi Thorn, so the problem is apparently bound to huge file sizes. I presume they're split into multiple chunks on the Ceph side, hence producing millions of objects, and possibly something is wrong with this mapping. If this pool has no write load at the moment you might want to run the following
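Not necessarily the command Igor was about to paste, but per-pool object counts and usage can be sanity-checked with standard tools; <data-pool> is a placeholder for the CephFS data pool:

  rados df                          # objects and bytes per pool
  ceph df detail                    # STORED vs USED per pool, including replication/EC overhead
  rados -p <data-pool> ls | wc -l   # raw object count (can be slow on large pools)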

[ceph-users] Robust cephfs design/best practice

2024-03-15 Thread Szabo, Istvan (Agoda)
Hi, I'd like to add cephfs to our production objectstore/block storage cluster so I'd like to collect hands-on experiences, like good to know / be careful / avoid etc., other than the ceph documentation. Thank you

[ceph-users] Num values for 3 DC 4+2 crush rule

2024-03-15 Thread Torkil Svensgaard
I was just looking at our crush rules as we need to change them from failure domain host to failure domain datacenter. The replicated ones seem trivial but what about this one for EC 4+2? rule rbd_ec_data { id 0 type erasure step set_chooseleaf_tries 5 step set_c
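For comparison, a commonly cited shape for EC 4+2 spread over three datacenters (two shards per DC) looks roughly like the sketch below; the rule name, id and the "default" root are placeholders:

  rule rbd_ec_data_3dc {
          id 1
          type erasure
          step set_chooseleaf_tries 5
          step set_choose_tries 100
          step take default
          step choose indep 3 type datacenter
          step chooseleaf indep 2 type host
          step emit
  }

The two num values multiply out to 3 x 2 = 6 = k+m shards, so each datacenter holds exactly two of the six shards.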

[ceph-users] Re: [REEF][cephadm] new cluster all pg unknown

2024-03-15 Thread wodel youchi
Hi, Thanks Stefan. Yes, I do have separate port-channel interfaces for the public and cluster networks. I just didn't understand the documentation (which is sometimes not that clear). For me, when you put forward the --cluster_network option, it meant that --mon-ip was on the public network by default

[ceph-users] Re: RGW - tracking new bucket creation and bucket usage

2024-03-15 Thread Janne Johansson
> Now we are using the GetBucketInfo from the AdminOPS api - > https://docs.ceph.com/en/quincy/radosgw/adminops/#id44 with the stats=true > option GET /admin/bucket?stats=1 which returns all buckets with the number of > objects and size we then parse. We also use it for the tracking of newly >
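For anyone without the admin API wired up, the same per-bucket numbers are also available locally via radosgw-admin; a quick sketch ("foo" is a placeholder bucket name):

  radosgw-admin bucket stats                # stats for all buckets, incl. size and num_objects
  radosgw-admin bucket stats --bucket=foo   # stats for a single bucket
  radosgw-admin bucket list                 # just the bucket names, handy for spotting new buckets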

[ceph-users] Re: [REEF][cephadm] new cluster all pg unknown

2024-03-15 Thread Stefan Kooman
On 15-03-2024 08:10, wodel youchi wrote: Hi, I found my error: it was a mismatch between the monitor IP address and the --cluster_network, which were in different subnets. I misunderstood the --cluster_network subnet; I thought that when creating a cluster, the monitor IP designated the pub

[ceph-users] Re: [REEF][cephadm] new cluster all pg unknown

2024-03-15 Thread Stefan Kooman
On 15-03-2024 07:18, wodel youchi wrote: Hi, Note: Firewall is disabled on all hosts. Can you send us the crush rules that are available 1) and also the crush_rule in use for the .mgr pool 2)? Furthermore, I would like to see an overview of the OSD tree 3) and the state of the .mgr PG (normally
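For completeness, the requested pieces of information map onto these commands (same numbering as in the mail):

  ceph osd crush rule dump            # 1) all crush rules
  ceph osd pool get .mgr crush_rule   # 2) the rule used by the .mgr pool
  ceph osd tree                       # 3) the OSD tree
  ceph pg ls-by-pool .mgr             # state of the .mgr PG(s)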

[ceph-users] Re: [REEF][cephadm] new cluster all pg unknown

2024-03-15 Thread wodel youchi
Hi, I found my error: it was a mismatch between the monitor IP address and the --cluster_network, which were in different subnets. I misunderstood the --cluster_network subnet; I thought that when creating a cluster, the monitor IP designated the public network, and if I wanted to separate pu
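In other words, --mon-ip always sits on the public network, and the cluster network is only the extra replication/backfill network; a hedged bootstrap sketch with placeholder subnets (flag spelling may be --cluster-network depending on the cephadm version):

  cephadm bootstrap --mon-ip 10.0.1.10 --cluster-network 192.168.1.0/24
  # or set the networks explicitly after bootstrap:
  ceph config set global public_network 10.0.1.0/24
  ceph config set global cluster_network 192.168.1.0/24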

[ceph-users] PSA: CephFS/MDS config defer_client_eviction_on_laggy_osds

2024-03-15 Thread Venky Shankar
If you are using CephFS on Pacific v16.2.14(+), the MDS config `defer_client_eviction_on_laggy_osds' is enabled by default. This config is used to not evict cephfs clients if OSDs are laggy[1]. However, this can result in a single client holding up the MDS in servicing other clients. To avoid this,
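To check or change the setting on an affected cluster, a minimal sketch:

  ceph config get mds defer_client_eviction_on_laggy_osds         # current value
  ceph config set mds defer_client_eviction_on_laggy_osds false   # opt out of the deferral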