[ceph-users] Re: Enterprise SSD/NVME

2025-01-10 Thread Anthony D'Atri
Interesting idea, not sure how well a utility could empirically test without a hardcoded list of SKUs, but here are some thoughts. * One difference is PLP - power loss protection. An enterprise drive must IMHO offer this and I don’t know offhand of one that doesn’t. Client / desktop drives o
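A SKU-agnostic proxy for PLP that such a utility could use is single-depth sync-write latency: drives with power-loss protection can acknowledge flushed writes from protected cache and typically show sub-millisecond fsync latency, while client drives often land in the millisecond-to-tens-of-milliseconds range. A minimal fio sketch (device path and runtime are placeholders, and this writes to the raw device, so use a scratch disk only):

  fio --name=plp-probe --filename=/dev/sdX --rw=write --bs=4k \
      --iodepth=1 --numjobs=1 --direct=1 --fsync=1 \
      --runtime=60 --time_based
  # compare the reported sync/fsync latency percentiles across candidate drives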

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi all, here's an update. The MDS got stuck again doing nothing. Could it be blocklisting? The MDS IP address is in the blocklist together with a bunch of others (see blocklist below). Could this have anything to do with my observation of the MDS coming up but not doing anything? Anyways, follow
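For reference, the blocklist can be inspected and individual entries removed from the CLI; a minimal sketch (the address:port/nonce below is a placeholder taken from whatever "ls" prints):

  ceph osd blocklist ls
  ceph osd blocklist rm 192.168.1.10:6800/1234567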

[ceph-users] Enterprise SSD/NVME

2025-01-10 Thread Martin Konold
Hi there, it is well documented that Ceph performance is extremely poor with consumer SSD/NVMe block devices. Recommending enterprise or data center devices is IMHO not sufficient, as these terms are not really standardized. I propose to write a little system program which determines the properties of

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi Bailey, I already set that value very high: # ceph config get mds.ceph-12 mds_beacon_grace 60.00 To no avail. The 15s heartbeat timeout comes from somewhere else. What I observe is that the MDS loads the stray buckets (up to 87Mio DNS/INOS) and as soon as that happened it seems to s

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-10 Thread Adam Emerson
On 10/01/2025, Yuri Weinstein wrote: > This PR https://github.com/ceph/ceph/pull/61306 was cherry-picked > Adam, pls see the run for the Build 4 > > Laura, Adam approves rgw, we are ready for gibba and LRC/sepia upgrades. I hereby approve the RGW run. Thanks and sorry for the last minute fix. __

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Dan van der Ster
Hi Frank, I don't think the blocklists are related. (Those are blocking the previous running instances of the mds on that host, not the current). Your MDS is burning CPU (you see that with top) but it's unresponsive. Any of these will be closer to finding a clue what it's doing: perf top -p to
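A minimal sketch of that kind of inspection, assuming a single ceph-mds process on the host (debug symbols help make the output readable):

  # sample where the MDS spends its CPU time
  perf top -p $(pidof ceph-mds)
  # one-shot backtrace of all threads
  gdb -p $(pidof ceph-mds) --batch -ex 'thread apply all bt'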

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Spencer Macphee
You could try some of the steps here Frank: https://docs.ceph.com/en/quincy/cephfs/troubleshooting/#avoiding-recovery-roadblocks mds_heartbeat_reset_grace is probably the only one really relevant to your scenario. On Fri, Jan 10, 2025 at 1:30 PM Frank Schilder wrote: > Hi all, > > we seem to ha
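A hedged example of raising that grace via the central config (the value is arbitrary and should be reverted once the MDS is stable again):

  ceph config set mds mds_heartbeat_reset_grace 3600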

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-10 Thread Yuri Weinstein
This PR https://github.com/ceph/ceph/pull/61306 was cherry-picked Adam, pls see the run for the Build 4 Laura, Adam approves rgw, we are ready for gibba and LRC/sepia upgrades. Also dev lead pls review/approve Release Notes https://github.com/ceph/ceph/pull/61268 On Thu, Jan 9, 2025 at 8:12 AM A

[ceph-users] Re: Find out num of PGs that would go offline on OSD shutdown

2025-01-10 Thread Robert Sander
Hi, on 1/10/25 at 10:40 Andre Tann wrote: Proxmox VE warns me if I want to shut down an OSD, and some PGs would go offline if I continue. In the warning it tells me the exact number of affected PGs. ceph osd ok-to-stop will show you a list of affected PGs. Regards -- Robert Sander Heinl

[ceph-users] Re: Find out num of PGs that would go offline on OSD shutdown

2025-01-10 Thread Eugen Block
Hi, there's a check for multiple services (mon, osd, mds), here's what an OSD check looks like: ceph osd ok-to-stop 0 -f json | jq { "ok_to_stop": true, "osds": [ 0 ], "num_ok_pgs": 135, "num_not_ok_pgs": 0, "ok_become_degraded": [ "1.0", "2.1", ... Regards, Eugen

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-10 Thread Frédéric Nass
- On 9 Jan 25, at 17:33, Joshua Baergen jbaer...@digitalocean.com wrote: >> I'm wondering about the influence of WAL/DBs collocated on HDDs on OSD >> creation >> time, OSD startup time, peering and osdmap updates, and the role it might >> play >> regarding flapping, when DB IOs compete w

[ceph-users] Re: Find out num of PGs that would go offline on OSD shutdown

2025-01-10 Thread Andre Tann
Hi Eugen, on 10.01.25 at 10:47 Eugen Block wrote: ceph osd ok-to-stop 0 -f json | jq {   "ok_to_stop": true,   "osds": [     0   ],   "num_ok_pgs": 135,   "num_not_ok_pgs": 0,   "ok_become_degraded": [     "1.0",     "2.1", This is exactly it, thanks for the hint, also to Robert wh

[ceph-users] Find out num of PGs that would go offline on OSD shutdown

2025-01-10 Thread Andre Tann
Hi all, Proxmox VE warns me if I want to shut down an OSD, and some PGs would go offline if I continue. In the warning it tells me the exact number of affected PGs. I have now searched a bit, but wasn't able to find out how Proxmox knows the exact number. How can that be done on the CLI? T

[ceph-users] Re: OSDs won't come back after upgrade

2025-01-10 Thread Jorge Garcia
Actually, stupid mistake on my part. I had selinux mode as enforcing. Changed it to disabled, and everything works again. Thanks for the help! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
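For anyone hitting the same thing, a minimal sketch of that change (setenforce only switches to permissive until the next reboot; the config edit makes it persistent):

  setenforce 0
  sed -i 's/^SELINUX=enforcing/SELINUX=disabled/' /etc/selinux/config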

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Dan van der Ster
Hi Frank, Can you try `perf top` to find out what the ceph-mds process is doing with that CPU time? Also Mark's profiler is super useful to find those busy loops: https://github.com/markhpc/uwpmp Cheers, Dan -- Dan van der Ster CTO @ CLYSO Try our Ceph Analyzer -- https://analyzer.clyso.com/ htt

[ceph-users] Re: squid 19.2.1 RC QE validation status

2025-01-10 Thread Laura Flores
Thanks. We can plan the gibba upgrade for Monday. On Fri, Jan 10, 2025 at 4:11 PM Yuri Weinstein wrote: > This PR https://github.com/ceph/ceph/pull/61306 was cherry-picked > Adam, pls see the run for the Build 4 > > Laura, Adam approves rgw, we are ready for gibba and LRC/sepia upgrades. > > Als

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Spencer Macphee
mds_beacon_grace is, perhaps confusingly, not an MDS configuration; it's applied to the MONs. Since you've injected it into the MDS, that is likely why the heartbeat is still failing. This has the effect of having the MDS continue to send beacons to the monitors even when its internal "heartbeat" mechanis
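A minimal sketch of setting it where it is actually evaluated (the value is only an example):

  # the MONs enforce the beacon grace, so set it for them (or globally)
  ceph config set mon mds_beacon_grace 300
  ceph config set global mds_beacon_grace 300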

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Bailey Allison
+1 to this, and the doc mentioned. Just be aware depending on version the heartbeat grace parameter is different, I believe for 16 and below it's the one I mentioned, and it's to be set on the mon level, and for 17 and newer it is what Spencer mentioned. The doc he has provided also mentions s

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi Dan, thanks for your continued help, I really appreciate it. Just to clarify: > Your MDS is burning CPU (you see that with top) but it's unresponsive. Did you mean "is *not* burning CPU"? The MDS is idle - *no* CPU load, yet unresponsive. See below for a more detailed description of observati

[ceph-users] ceph tell throws WARN: the service id you provided does not exist.

2025-01-10 Thread Frank Schilder
Hi all, following the documentation for initiating a file system scrub (pacific: https://docs.ceph.com/en/pacific/cephfs/scrub/), I can't get the syntax with "mds." to work: # ceph tell mds.con-fs2:0 help WARN: the service id you provided does not exist. service id should be one of ceph-09/cep

[ceph-users] Per-Client Quality of Service settings

2025-01-10 Thread Olaf Seibert
Hi! I am trying to find if Ceph has any QoS settings that apply per-client. I would like to be able, say, to have different QoS settings for RBD clients named "nova" and "cinder" and different again from an RGW client named "enduser". While researching this, I found amongst other things, the f

[ceph-users] Re: Ceph Orchestrator ignores attribute filters for SSDs

2025-01-10 Thread Frédéric Nass
Hi Janek, Have you tried looking into the orchestrator's decisions? $ ceph config set mgr mgr/cephadm/log_to_cluster_level debug then $ ceph -W cephadm --watch-debug or look into the active MGR's /var/log/ceph/$(ceph fsid)/ceph.cephadm.log Regards, Frédéric. - On 10 Jan 25, at 13:53, Janek B
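And, assuming the default level is info, to quiet the cephadm cluster log down again afterwards:

  ceph config set mgr mgr/cephadm/log_to_cluster_level info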

[ceph-users] Re: ceph tell throws WARN: the service id you provided does not exist.

2025-01-10 Thread Frank Schilder
Update: it looks like the "help" sub-command is not implemented; it would be nice to have that documented online. The "initiate scrub" command did work though: [root@gnosis ~]# ceph tell mds.con-fs2:0 scrub start / recursive,scrub_mdsdir { "return_code": 0, "scrub_tag": "3b5e0bbc-1c8a-489
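For following a running scrub, a minimal sketch using the same file system name as above:

  ceph tell mds.con-fs2:0 scrub status
  # "scrub abort", "scrub pause" and "scrub resume" are also available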

[ceph-users] Re: Per-Client Quality of Service settings

2025-01-10 Thread Anthony D'Atri
> On Jan 10, 2025, at 7:46 AM, Olaf Seibert wrote: > > Hi! I am trying to find if Ceph has any QoS settings that apply per-client. I > would like to be able, say, to have different QoS settings for RBD clients > named "nova" and "cinder" and different again from an RGW client named > "enduse

[ceph-users] Re: Per-Client Quality of Service settings

2025-01-10 Thread Olaf Seibert
Thanks for your reply, Anthony. On 10.01.25 15:27, Anthony D'Atri wrote: That link https://docs.ceph.com/en/reef/rbd/rbd-config-ref/#qos-settings does have a section that describes per-image (volume) settings, which you should be able to enforce on the OpenStack side. OpenStack / libvirt do h
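A minimal sketch of the per-image knobs from that page (pool/image names and limits are placeholders; note that these limits are enforced client-side by librbd, not by the OSDs):

  rbd config image set volumes/volume-1234 rbd_qos_iops_limit 500
  rbd config image set volumes/volume-1234 rbd_qos_bps_limit 104857600
  rbd config image list volumes/volume-1234 | grep rbd_qos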

[ceph-users] MDSs report oversized cache during forward scrub

2025-01-10 Thread Frank Schilder
Hi all, we started a forward scrub on our 5.x PB ceph file system and observe a massive ballooning of MDS caches. Our status is: # ceph status cluster: id: xxx health: HEALTH_WARN 1 MDSs report oversized cache (muted: MDS_CLIENT_LATE_RELEASE(12d) MDS_CLIENT_

[ceph-users] Ceph Orchestrator ignores attribute filters for SSDs

2025-01-10 Thread Janek Bevendorff
Hi, I'm having a strange problem with the orchestrator. My cluster has the following OSD services configured based on certain attributes of the disks: NAME PORTS  RUNNING  REFRESHED  AGE PLACEMENT ... osd.osd-default-hdd    1351  2m ago 22m label:osd;HOSTPREFIX* osd
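For context, a generic drive-group spec of the kind described above; this is a hypothetical sketch, not the actual spec from this cluster, and the service id, label, host pattern and size filter are placeholders:

  service_type: osd
  service_id: osd-default-ssd
  placement:
    label: osd
    host_pattern: 'HOSTPREFIX*'
  spec:
    data_devices:
      rotational: 0
      size: '1TB:8TB'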

[ceph-users] Re: MDSs report oversized cache during forward scrub

2025-01-10 Thread Frank Schilder
A further observation: after restarting rank 6 due to oversized cache, rank 6 is no longer shown in the task list of ceph status below. Is an instruction for scrub not sticky to the rank or is the status output incorrect? Best regards, = Frank Schilder AIT Risø Campus Bygning 109

[ceph-users] Re: ceph orch upgrade tries to pull latest?

2025-01-10 Thread tobias tempel
Hi Stephan, hi Adam, thank you for reading my questions and for your suggestions. I did what you suggested, but unfortunately in the meantime the cluster in question degraded to the point of unmanageability, because, for reasons incomprehensible to me, cephadm began trying to redeploy and rec

[ceph-users] Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi all, we seem to have a serious issue with our file system, ceph version is pacific latest. After a large cleanup operation we had an MDS rank with 100Mio stray entries (yes, one hundred million). Today we restarted this daemon, which cleans up the stray entries. It seems that this leads to a

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi Patrick and others, thanks for your fast reply. The problem we are in comes from forward scrub ballooning and the memory overuse did not go away even after aborting the scrub. The "official" way to evaluate strays I got from Neha was to restart the rank. I did not expect that the MDS needs

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Spencer Macphee
I had a similar issue some months ago that ended up using around 300 gigabytes of RAM for a similar number of strays. You can get an idea of the strays kicking around by checking the omapkeys of the stray objects in the cephfs metadata pool. Strays are tracked in objects: 600., 601.000
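A minimal sketch of that check; the metadata pool name is an assumption, and heavily loaded stray directories may be fragmented across additional objects beyond the ten listed here:

  for i in $(seq 0 9); do
    echo -n "60${i}.00000000: "
    rados -p cephfs_metadata listomapkeys 60${i}.00000000 | wc -l
  done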

[ceph-users] Re: Slow initial boot of OSDs in large cluster with unclean state

2025-01-10 Thread Joshua Baergen
> > FWIW, having encountered these long-startup issues many times in the > > past on both HDD and QLC OSDs, I can pretty confidently say that > > throwing flash at the problem doesn't make it go away. Fewer issues > > with DB IOs contending with client IOs, but flapping can still occur > > during P

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Patrick Donnelly
Hi Frank, On Fri, Jan 10, 2025 at 12:31 PM Frank Schilder wrote: > > Hi all, > > we seem to have a serious issue with our file system, ceph version is pacific > latest. After a large cleanup operation we had an MDS rank with 100Mio stray > entries (yes, one hundred million). Today we restarted

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Bailey Allison
Hi Frank, Are you able to share any logs from the mds that's crashing? And just to confirm, does the rank go into up:active before eventually OOMing? This sounds familiar-ish, but I'm also recovering after a nearly 24-hour bender of another Ceph-related recovery. Trying to rack my brain for simil

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi all, I seem to have gotten the MDS up to the point that it reports stats. Does this mean anything: 2025-01-10T20:50:25.256+0100 7f87ccd5f700 1 heartbeat_map is_healthy 'MDSRank' had timed out after 15.00954s 2025-01-10T20:50:25.256+0100 7f87ccd5f700 0 mds.beacon.ceph-12 Skipping beacon

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Bailey Allison
Hi Frank, What is the state of the mds currently? We are probably at a point where we do a bit of hoping and waiting for it to come back up. Regards, Bailey Allison Service Team Lead 45Drives, Ltd. 866-594-7199 x868 On 1/10/25 15:51, Frank Schilder wrote: Hi all, I seem to have gotten the MD

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi all, I got the MDS up. However, after quite some time it's sitting with almost no CPU load: top - 21:40:02 up 2:49, 1 user, load average: 0.00, 0.02, 0.34 Tasks: 606 total, 1 running, 247 sleeping, 0 stopped, 0 zombie %Cpu(s): 0.0 us, 0.1 sy, 0.0 ni, 99.9 id, 0.0 wa, 0.0 hi, 0.0

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Bailey Allison
Frank, You mentioned previously a large number of strays on the mds rank. Are you able to check the rank again to see how many strays there are again? We've previously had a similar issue, and once the MDS came back up we had to stat the filesystem to decrease the number of strays, and which
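A minimal sketch of checking the stray counters, assuming the tell/admin-socket interface is responsive (daemon name taken from earlier in the thread):

  ceph tell mds.ceph-12 perf dump | jq '.mds_cache.num_strays'
  # or locally on the MDS host
  ceph daemon mds.ceph-12 perf dump | jq '.mds_cache.num_strays'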

[ceph-users] Re: MDSs report oversized cache during forward scrub

2025-01-10 Thread Alexander Patrakov
Hello Frank, Unfortunately, the ballooning of memory consumed by the MDS is a known issue. Please add a lot of swap space (as a rough estimate, 2 GB of swap per 1 million files stored) to complete the scrub, and ignore the warning. Yes, there were cases on this list where 2 TB of swap was insuffi
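A minimal sketch of adding swap on the MDS host (the size is only an example, scaled to the estimate above; on copy-on-write filesystems a dd-created file may be needed instead of fallocate):

  fallocate -l 256G /var/mds-swap
  chmod 600 /var/mds-swap
  mkswap /var/mds-swap
  swapon /var/mds-swap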

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Frank Schilder
Hi Bailey, thanks for your response. The MDS was actually unresponsive and I had to restart it (ceph tell and ceph daemon commands were hanging, except for "help"). It's currently in clientreplay and loading all the stuff again. I'm really worried that this is the rescue killer: heartbea

[ceph-users] Re: Help needed, ceph fs down due to large stray dir

2025-01-10 Thread Bailey Allison
Hi Frank, The value for that is mds_beacon_grace. Default is 15 but you can jack it up. Apply it to the monitor or global to take effect. Just to clarify too, does the MDS daemon come into up:active ? If it does, are you able to also access that portion of the filesystem in that time? If y