[ceph-users] Multiple object instances with null version id

2023-07-24 Thread Huy Nguyen
Hi, I have a Ceph cluster on v16.2.13. I'm not sure why this happens or how to clean it up:
[2023-07-12 21:23:13 +07] 299B STANDARD null v18 PUT index.txt
[2023-07-12 21:27:54 +07] 299B STANDARD null v17 PUT index.txt
[2023-07-12 21:48:01 +07] 299B STANDARD null v16 PUT index.txt
[2023-0
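A minimal sketch of how one might inspect and remove such null-version entries with the AWS CLI pointed at the RGW endpoint; the bucket name and endpoint below are placeholders, deleting by version id may only address one entry at a time, and it should be tried on a non-production bucket first:

# List all versions of the object; entries written before versioning was
# enabled (or while it was suspended) carry the literal version id "null".
aws --endpoint-url http://rgw.example.com s3api list-object-versions \
    --bucket mybucket --prefix index.txt

# Remove a null-version instance explicitly.
aws --endpoint-url http://rgw.example.com s3api delete-object \
    --bucket mybucket --key index.txt --version-id null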

[ceph-users] inactive PGs looking for a non existent OSD

2023-07-24 Thread Alfredo Rezinovsky
I had a problem with a server: the hardware was completely broken. "ceph orch rm host" hung, even with the force and offline options. I reinstalled another server with the same IP address and then removed the OSDs with: ceph osd purge osd.10; ceph osd purge osd.11. Now I have 0.342% pgs not active with ce
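A rough sketch of how one might track down which PGs are stuck and which (now purged) OSDs they are still waiting for; the PG id below is a placeholder:

# Show PGs that are stuck inactive and the OSDs they map to.
ceph pg dump_stuck inactive

# Query one of the stuck PGs to see which OSDs it is blocked on or
# still probing for (e.g. the purged osd.10 or osd.11).
ceph pg 2.1a query | less

# Confirm the purged OSDs are really gone from the OSD map.
ceph osd tree | grep -E 'osd\.(10|11)'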

[ceph-users] Re: quincy 17.2.6 - write performance continuously slowing down until OSD restart needed

2023-07-24 Thread stachecki . tyler
Mark, I ran into something similar to this recently while testing Quincy... I believe I see what happened here. Based on the user's information, the following non-default option was in use: ceph config set osd bluestore_rocksdb_options compression=kNoCompression,max_write_buffer_number=32,min_wri
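For reference, a hedged sketch of how one could check whether such a non-default RocksDB option is set and revert it to the default; the OSDs need a restart afterwards for the change to take effect:

# Check whether the option is overridden in the cluster config database.
ceph config dump | grep bluestore_rocksdb_options

# Drop the override so OSDs fall back to the built-in default, then
# restart the OSDs so the new RocksDB options are picked up.
ceph config rm osd bluestore_rocksdb_options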

[ceph-users] Re: upload-part-copy gets access denied after cluster upgrade

2023-07-24 Thread motaharesdq
*Edit: sorry for the mistake; "service-user" was a local term for sub-user and I forgot to rephrase it (I even upper-cased it for unknown reasons :)). Basically, the error occurs for UploadPartCopy by any sub-user whose access is granted via bucket policy:* and not via --access=full-access. The problem remai

[ceph-users] Re: RGWs offline after upgrade to Nautilus

2023-07-24 Thread bzieglmeier
Changing my sending email address as something was wrong with the last one; still the OP here. The cluster is generally healthy: not running out of storage space and no pools filling up. As mentioned in the original post, one RGW is able to come online. I've cross-compared just about every file permission, config f
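When comparing the working and failing gateways, the RGW journal is usually the most telling place to look; a sketch, assuming a package-based (non-cephadm) Nautilus install where the unit name follows the usual ceph-radosgw@rgw.<instance> pattern (the instance name below is a placeholder):

# Find the exact RGW unit name on the failing host.
systemctl list-units 'ceph-radosgw@*'

# Follow the journal while starting the gateway to catch the error it dies with.
journalctl -u ceph-radosgw@rgw.myhost.rgw0 -f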

[ceph-users] Regressed tail (p99.99+) write latency for RBD workloads in Quincy (vs. pre-Pacific)?

2023-07-24 Thread Tyler Stachecki
Hi all, Has anyone else noticed any p99.99+ tail latency regression for RBD workloads in Quincy vs. pre-Pacific, i.e., before the kv_onode cache existed? Some notes from what I have seen thus far: * Restarting OSDs temporarily resolves the problem... then as activity accrues over time, the proble
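One way to check whether the onode/kv caches are implicated is to watch their perf counters as latency degrades; a sketch, with osd.0 as a placeholder:

# Dump BlueStore perf counters and pull out anything onode-related; a hit
# ratio that decays as the OSD ages would fit the pattern of restarts
# temporarily fixing things.
ceph tell osd.0 perf dump | jq '.bluestore' | grep -i onode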

[ceph-users] Re: Not all Bucket Shards being used

2023-07-24 Thread J. Eric Ivancich
1. I recommend that you *not* issue another bucket reshard until you figure out what’s going on.
2. Which version of Ceph are you using?
3. Can you issue a `radosgw-admin metadata get bucket:` so we can verify what the current marker is?
4. After you resharded previously, did you get command-line
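For item 3, the commands look roughly like this (the bucket name is a placeholder); the marker/bucket_id and shard count are the fields worth cross-checking:

# Bucket entrypoint metadata: shows the current marker and bucket_id.
radosgw-admin metadata get bucket:mybucket

# Overall stats for the bucket, including the shard count currently in use.
radosgw-admin bucket stats --bucket=mybucket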

[ceph-users] Re: cephadm and kernel memory usage

2023-07-24 Thread Luis Domingues
Of course:

free -h
              total        used        free      shared  buff/cache   available
Mem:          125Gi        96Gi       9.8Gi       4.0Gi        19Gi       7.6Gi
Swap:            0B          0B          0B

Luis Domingues
Proton AG

--- Original Message --- On Monday,

[ceph-users] Re: MDS stuck in rejoin

2023-07-24 Thread Frank Schilder
Hi Xiubo, I seem to have gotten your e-mail twice. It's a very old kclient. It was in that state when I came to work in the morning and I looked at it in the afternoon. I was hoping the problem would clear by itself. It was probably a compute job that crashed it; it's a compute node in our HPC cl
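If the hung kernel client has to be cleared manually, one option (a sketch only; mds01 and the client id are placeholders, and whether session commands respond while the MDS is still in rejoin will vary) is to list and evict its session from the MDS:

# List client sessions known to the MDS; a stale kclient shows up with
# its hostname, mount point and client id.
ceph tell mds.mds01 client ls

# Evict the stuck session by id.
ceph tell mds.mds01 client evict id=12345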

[ceph-users] Re: Does ceph permit the definition of new classes?

2023-07-24 Thread Konstantin Shalygin
Hi, You can definitely add any other class name. k Sent from my iPhone > On 24 Jul 2023, at 16:04, wodel youchi wrote: > > Can I define new device classes in Ceph? I know that there are hdd, ssd and > nvme, but can I define other classes?

[ceph-users] Re: cephadm and kernel memory usage

2023-07-24 Thread Konstantin Shalygin
Hi, Can you paste `free -h` output for these hosts? k Sent from my iPhone > On 24 Jul 2023, at 14:42, Luis Domingues wrote: > > Hi, > > So after looking into OSD memory usage, which seems to be fine, on a > v16.2.13 running with cephadm on el8, it seems that the kernel is using a > lot o

[ceph-users] Re: Failing to restart mon and mgr daemons on Pacific

2023-07-24 Thread Adam King
The logs you probably really want to look at here are the journal logs from the mgr and mon. If you have a copy of the cephadm tool on the host, you can do a "cephadm ls --no-detail | grep systemd" to list out the systemd unit names for the ceph daemons on the host, or just find the systemd un
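A short sketch of the two steps being described, assuming a cephadm-managed host where the unit names embed the cluster fsid; darkside1 is the host from the original post:

# List the systemd unit names cephadm manages for the daemons on this host.
cephadm ls --no-detail | grep systemd

# Read the journal of the failing mon unit; the unit name embeds the
# cluster fsid, which `ceph fsid` prints.
journalctl -e -u "ceph-$(ceph fsid)@mon.darkside1.service"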

[ceph-users] Failing to restart mon and mgr daemons on Pacific

2023-07-24 Thread Renata Callado Borges
Dear all, How are you? I have a cluster on Pacific with 3 hosts, each one with 1 mon, 1 mgr and 12 OSDs. One of the hosts, darkside1, has been out of quorum according to ceph status. Systemd showed 4 services dead: two mons and two mgrs. I managed to systemctl restart one mon and one mg

[ceph-users] Re: Does ceph permit the definition of new classes?

2023-07-24 Thread Alwin Antreich
Hi, July 24, 2023 3:02 PM, "wodel youchi" wrote: > Hi, > > Can I define new device classes in Ceph? I know that there are hdd, ssd and > nvme, but can I define other classes? Certainly. We often use dedicated device classes (e.g. nvme-meta) to separate workloads. Cheers, Alwin PS: this time re
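A minimal sketch of the kind of setup Alwin describes: tagging a device with a custom class and steering a pool onto it with a dedicated CRUSH rule (the OSD, rule and pool names are placeholders):

# Assign a custom device class to an OSD (any existing class must be removed first).
ceph osd crush rm-device-class osd.12
ceph osd crush set-device-class nvme-meta osd.12

# Create a replicated rule restricted to that class and point a pool at it.
ceph osd crush rule create-replicated meta-rule default host nvme-meta
ceph osd pool set mypool crush_rule meta-rule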

[ceph-users] Does ceph permit the definition of new classes?

2023-07-24 Thread wodel youchi
Hi, Can I define new device classes in Ceph? I know that there are hdd, ssd and nvme, but can I define other classes? Regards.

[ceph-users] cephadm and kernel memory usage

2023-07-24 Thread Luis Domingues
Hi, So after looking into OSD memory usage, which seems to be fine, on a v16.2.13 cluster running with cephadm on el8, it seems that the kernel is using a lot of memory.

# smem -t -w -k
Area                           Used      Cache   Noncache
firmware/hardware                 0          0          0
kernel image                      0          0          0
kernel dynamic memory         65.0G      18.6G      46.4G
userspac
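To see what that 46.4G of non-cache kernel dynamic memory is made of, one might break it down further (a sketch; slabtop may need to be installed separately):

# Slab vs. reclaimable vs. unreclaimable vs. vmalloc kernel memory.
grep -i -E 'slab|sreclaimable|sunreclaim|vmalloc' /proc/meminfo

# Largest slab caches, sorted by cache size.
slabtop -o -s c | head -20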

[ceph-users] Re: Adding datacenter level to CRUSH tree causes rebalancing

2023-07-24 Thread Niklas Hambüchen
I can believe the month timeframe for a cluster with multiple large spinners behind each HBA. I’ve witnessed such personally. I do have the numbers for this: My original post showed "1167541260/1595506041 objects misplaced (73.177%)". During my last recovery with Ceph 16.2.7, the recovery sp
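For context, a sketch of the operation being discussed: inserting a datacenter level under the root and moving hosts into it, with norebalance set so the resulting data movement can be started deliberately afterwards (bucket and host names are placeholders):

# Pause automatic rebalancing while restructuring the tree.
ceph osd set norebalance

# Create the datacenter bucket, hang it under the root, and move a host in.
ceph osd crush add-bucket dc1 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move host1 datacenter=dc1

# When ready, let the misplaced objects start moving.
ceph osd unset norebalance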

[ceph-users] Re: mds terminated

2023-07-24 Thread Xiubo Li
On 7/20/23 11:36, dxo...@naver.com wrote: This issue has been closed. If any rook-ceph users see this: when mds replay takes a long time, look at the logs in the mds pod. If it's going well and then abruptly terminates, try describing the mds pod, and if the liveness probe terminated it, try increasing
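The kubectl side of that advice, as a sketch (the namespace, filesystem name "myfs" and labels are placeholders that assume a default Rook install):

# Watch the MDS log during replay.
kubectl -n rook-ceph logs -f deploy/rook-ceph-mds-myfs-a

# If it terminates abruptly, check whether the liveness probe killed it.
kubectl -n rook-ceph describe pod -l app=rook-ceph-mds | grep -A3 -i liveness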

[ceph-users] Re: cephfs - unable to create new subvolume

2023-07-24 Thread Milind Changire
On Fri, Jul 21, 2023 at 9:03 PM Patrick Donnelly wrote: > > Hello karon, > > On Fri, Jun 23, 2023 at 4:55 AM karon karon wrote: > > > > Hello, > > > > I have recently been using cephfs in version 17.2.6 > > I have a pool named "*data*" and a fs "*kube*" > > it was working fine until a few days ago, now I can
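For reference, a sketch of the subvolume call in question plus a couple of checks that usually help when it starts failing; apart from the fs name "kube", the subvolume and group names are placeholders:

# The failing operation: create a subvolume on the "kube" filesystem.
ceph fs subvolume create kube mysubvol --group_name csi

# Sanity checks: does the fs still list its data/metadata pools, and is
# the subvolume group still present?
ceph fs ls
ceph fs subvolumegroup ls kube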

[ceph-users] Re: MDS cache is too large and crashes

2023-07-24 Thread Sake Ceph
Thank you Patrick for responding and fixing the issue! Good to know the issue is known and being worked on :-) > On 21-07-2023 15:59 CEST, Patrick Donnelly wrote: > > > Hello Sake, > > On Fri, Jul 21, 2023 at 3:43 AM Sake Ceph wrote: > > > > At 01:27 this morning I received the first email abou