[ceph-users] Re: Something like RAID0 with Ceph

2024-11-18 Thread Frédéric Nass
Hi Christoph, No. Splitting data and distributing it across multiple OSDs without generating parity chunks is not possible in Ceph (k=2,m=0 with erasure coding wouldn't make sense). Either you use replication or erasure coding with at least one coding chunk (m>=1). Regards, Frédéric. - Le 1
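For reference, a minimal sketch of the two options Frédéric contrasts, using hypothetical pool and profile names (testpool, testpool-ec, ec21):

```
# Replicated pool (replicated is the default pool type)
ceph osd pool create testpool
ceph osd pool set testpool size 4

# Erasure-coded pool: at least one coding chunk (m>=1) is required
ceph osd erasure-code-profile set ec21 k=2 m=1 crush-failure-domain=host
ceph osd pool create testpool-ec erasure ec21
```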

[ceph-users] Re: Something like RAID0 with Ceph

2024-11-18 Thread Janne Johansson
On Tue, 19 Nov 2024 at 03:15, Christoph Pleger wrote: > Hello, > Is it possible to have something like RAID0 with Ceph? > That is, when the cluster configuration file contains > > osd pool default size = 4 This means all data is replicated 4 times, in your case one copy per OSD, which also in y
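As a concrete illustration of Janne's point (not from the original post): osd_pool_default_size only sets the default for newly created pools; the replication factor of an existing pool is a per-pool property. A quick sketch (pool name is a placeholder):

```
# Default replication factor used for new pools
ceph config get mon osd_pool_default_size

# Replication factor of an existing pool
ceph osd pool get <pool> size
ceph osd pool set <pool> size 4
```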

[ceph-users] Re: Stray monitor

2024-11-18 Thread Jakub Daniel
I have found these two threads https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/VHZ7IJ7PAL7L2INLSHNVYY7V7ZCXD46G/#TSWERUMAEEGZPSYXG6PSS4YMRXPP3L63 https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/NG5QVRTVCLLYNLK56CSYLIPE4WBFXS5U/#HJDBAJFX27KATC4WV2MKGLVGLN2HTWWD but

[ceph-users] Something like RAID0 with Ceph

2024-11-18 Thread Christoph Pleger
Hello, Is it possible to have something like RAID0 with Ceph? That is, when the cluster configuration file contains osd pool default size = 4 and I have four hosts with one osd drive (all the same size, let's call it size s) per host, is it somehow possible to add four other hosts with one osd

[ceph-users] Re: Pacific: mgr loses osd removal queue

2024-11-18 Thread Frédéric Nass
Hi Eugen, I've removed 12 OSDs with a 'ceph orch osd rm ID --replace' last week on Pacific and even though only 10 OSDs started draining their PGs at a time (the other 2 waiting for an available 'slot', obviously) all 12 OSDs got removed successfully in the end. Cheers, Frédéric. - Le 16
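For reference, the drain progress Frédéric describes can be followed via the orchestrator's removal queue (the OSD ID below is a placeholder):

```
# Queue an OSD for removal, keeping its CRUSH entry for a replacement disk
ceph orch osd rm 12 --replace

# Watch the removal/drain queue (state, remaining PGs, drain started at)
ceph orch osd rm status
```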

[ceph-users] What is the Best stable option for production env in Q4/24 Quincy or Reef?

2024-11-18 Thread Özkan Göksu
Hello! I always track Ceph releases 1-2 versions behind and I used Quincy for all my deployments this year. I’m planning to deploy a new cluster on Ubuntu 22.04 and would like to know if Reef v18.2.4 is stable and ready for production environments in 2025. My use case involves RBD, CephFS, and R

[ceph-users] Re: The effect of changing an osd's class

2024-11-18 Thread Roland Giesler
On 2024/11/17 15:20, Gregory Orange wrote: On 17/11/24 19:44, Roland Giesler wrote: I cannot see any option that allows me to disable mclock... It's not so much disabling mclock as changing the op queue scheduler to use wpq instead of it. https://docs.ceph.com/en/reef/rados/configuration/osd-c
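A minimal sketch of the scheduler switch being discussed, assuming a cluster-wide change via the central config (it can also be set per OSD; the OSD daemons must be restarted for it to take effect):

```
# Switch the OSD op queue scheduler from mclock to wpq
ceph config set osd osd_op_queue wpq

# Verify the setting, then restart the OSD daemons
# (per host or via the orchestrator, as appropriate for your deployment)
ceph config get osd osd_op_queue
```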

[ceph-users] Re: Ceph Octopus packages missing at download.ceph.com

2024-11-18 Thread Gregory Farnum
Octopus should be back; sorry for the inconvenience. That said, everybody should really have upgraded past that by now. :) -Greg On Sun, Nov 17, 2024 at 6:40 AM Tim Holloway wrote: > As to the comings and goings of Octopus from download.ceph.com I cannot > speak. I had enough grief when IBM Red

[ceph-users] Re: Stray monitor

2024-11-18 Thread Eugen Block
I must admit it’s a bit difficult to follow what exactly you did, so I’m just considering this thread to be resolved unless you state otherwise. ;-) Quoting Jakub Daniel: Hi, thank you Eugen and Tim. did you fail the mgr? I think I didn't. Or how exactly did you drain that host?

[ceph-users] Re: Stray monitor

2024-11-18 Thread Jakub Daniel
Hi, thank you Eugen and Tim. > did you fail the mgr? I think I didn't. > Or how exactly did you drain that host? ``` cephadm shell -- ceph orch host drain cephfs-cluster-node-2 cephadm shell -- ceph orch host rm cephfs-cluster-node-2 ``` > `ceph config-key get mgr/cephadm/host.cephfs-cluster-
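To chase down leftover cephadm host entries like the one referenced above, the config-key store can be inspected; a sketch, with the hostname as a placeholder rather than the truncated key from the quote:

```
# List cephadm host entries in the config-key store
ceph config-key ls | grep 'mgr/cephadm/host'

# Inspect and, only if confirmed stale, remove a specific entry
ceph config-key get mgr/cephadm/host.<hostname>
ceph config-key rm mgr/cephadm/host.<hostname>
```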

[ceph-users] Re: constant increase in osdmap epoch

2024-11-18 Thread Eugen Block
I would probably wait until rebalancing has finished, then increase the debug level for the mon leader (not sure which debug level would suffice, maybe gradually increase debug_mon in steps of 5 or so until something comes up). We had to do a similar analysis last year in a customer cluster whe
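A hedged sketch of raising the mon debug level as suggested (the mon name is a placeholder; remember to lower it again when done):

```
# Identify the current mon leader
ceph mon stat            # or: ceph quorum_status | grep -i leader

# Raise debug_mon on that mon at runtime, e.g. to 10, higher if needed
ceph tell mon.<leader> config set debug_mon 10/10

# Reset to the default when finished
ceph tell mon.<leader> config set debug_mon 1/5
```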

[ceph-users] Re: constant increase in osdmap epoch

2024-11-18 Thread Frank Schilder
Hi Eugen. > how much changes do you see? About 1 new map every 5-10 seconds. The time interval varies. > but I would first try to find out what exactly is causing them That's what I'm trying to do. However, I'm out of ideas what to look for. I followed all the cases I could find to no avail. The

[ceph-users] Re: constant increase in osdmap epoch

2024-11-18 Thread Eugen Block
Hi Frank, do you use snapshots a lot? Purging snaps would create a new osdmap as well. Have you checked the debug logs of the mon leader to see what triggers the osdmap changes? I see in our moderately used Pacific cluster "only" around 60 osdmap changes per day (I haven't looked too deep yet)
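One quick way to check whether snapshot removal could be behind the churn (a sketch; the exact field name in the pool detail output varies by release, e.g. removed_snaps vs. removed_snaps_queue):

```
# Pools with pending snapshot removals show them in the detailed pool listing
ceph osd pool ls detail | grep -iE '^pool|removed_snaps'
```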

[ceph-users] Re: done, waiting for purge

2024-11-18 Thread Eugen Block
Hi, I'm not sure if the force flag will help here, but you could try it (you should probably cancel the current operation first with 'ceph orch osd rm stop {ID}' and retry with force). When I had a similar situation last time, I think I just went ahead and purged the OSDs myself to let the o
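For reference, a sketch of the sequence Eugen describes (the OSD ID is a placeholder; only purge manually once the OSD has actually been drained or you accept the resulting data movement):

```
# Cancel the stuck orchestrator removal, then retry with --force
ceph orch osd rm stop <id>
ceph orch osd rm <id> --force

# Or purge the OSD manually: removes it from CRUSH, deletes its auth key
# and its OSD entry
ceph osd purge <id> --yes-i-really-mean-it
```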

[ceph-users] Re: Stray monitor

2024-11-18 Thread Eugen Block
Hi, just to be safe, did you fail the mgr? If not, try 'ceph mgr fail' and see if it still reports that information. It sounds like you didn't clean up your virtual MON after you drained the OSDs. Or how exactly did you drain that host? If you run 'ceph orch host drain {host}' the orchest
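The commands Eugen refers to, for reference (the hostname is a placeholder):

```
# Fail over to a standby mgr so the active one rebuilds its cached state
ceph mgr fail

# Drain a host: the orchestrator schedules removal of all daemons/OSDs on it
ceph orch host drain <host>

# Only remove the host once nothing is left running on it
ceph orch ps <host>
ceph orch host rm <host>
```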

[ceph-users] constant increase in osdmap epoch

2024-11-18 Thread Frank Schilder
Hi all, we observe a problem that has been reported before, but I can't find a resolution. This is related to an earlier thread "failed to load OSD map for epoch 2898146, got 0 bytes" (https://www.spinics.net/lists/ceph-users/msg84485.html). We run a cluster on the latest Octopus release and observe a const
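One way to quantify the churn being described (a hedged sketch, not from the original thread):

```
# Sample the osdmap epoch twice and compare
ceph osd stat            # prints the current osdmap epoch
sleep 60
ceph osd stat

# Or watch it continuously
watch -n 10 'ceph osd dump | head -1'
```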

[ceph-users] Re: The effect of changing an osd's class

2024-11-18 Thread Anthony D'Atri
Glad you’re sorted out. I had a feeling it was a function of not being able to satisfy pool / rule constraints. > On Nov 18, 2024, at 1:58 AM, Roland Giesler wrote: > > On 2024/11/17 18:12, Anthony D'Atri wrote: >> I see 5 OSDs with 0 CRUSH weight, is that intentional? > > Yes, I set the wei

[ceph-users] Re: ceph cluster planning size / disks

2024-11-18 Thread Anthony D'Atri
> > Thanks! Very good links! :) > > I need to subtract from the usable capacity max usable/server count to handle > 1 server failure. Anything else I need to subtract? That buffer for server failure recovery is a good idea and often missed. These days though Ceph is pretty good at detecting
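As a rough worked example of that buffer (purely hypothetical numbers, not from the thread): plan usable capacity as if one server were already gone and keep headroom below the nearfull ratio.

```
# Hypothetical: 6 servers, 100 TB raw each, replica size 3,
# 15% headroom below nearfull; plan as if one server is already lost
awk 'BEGIN { printf "%.0f TB usable\n", (6-1) * 100 / 3 * 0.85 }'
```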

[ceph-users] done, waiting for purge

2024-11-18 Thread Torkil Svensgaard
Hi 18.2.4 We had some hard drives going AWOL due to a failing SAS expander, so I initiated "ceph orch host drain host". After a couple of days I'm now looking at this: " OSD HOST STATE PGS REPLACE FORCE ZAP DRAIN STARTED AT 528 gimpy done, waiting for purge 0
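The table quoted above comes from the orchestrator's removal queue; for reference, it can be re-checked and the drained OSD verified as empty (a sketch using the OSD ID from the quote):

```
# Re-check the removal queue and whether the drained OSD is actually empty
ceph orch osd rm status
ceph osd safe-to-destroy osd.528
```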

[ceph-users] Re: ceph cluster planning size / disks

2024-11-18 Thread Joachim Kraftmayer
You should take into account that the cluster is full at a utilization of 95% and no more client requests can be processed. At 75% you will see a near_full warning in the ceph status. joachim.kraftma...@clyso.com www.clyso.com Hohenzollernstr. 27, 80801 Munich Utting | HR: Augsburg | HRB: 25
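For reference, these thresholds are cluster-wide ratios that can be inspected and, cautiously, adjusted (a sketch; the values shown are the defaults in recent releases: nearfull 0.85, backfillfull 0.90, full 0.95):

```
# Show the currently configured ratios
ceph osd dump | grep ratio

# Adjust them if needed (be careful when raising full_ratio)
ceph osd set-nearfull-ratio 0.85
ceph osd set-backfillfull-ratio 0.90
ceph osd set-full-ratio 0.95
```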

[ceph-users] Re: ceph cluster planning size / disks

2024-11-18 Thread Marc
Thanks! Very good links! :) I need to subtract from the usable capacity max usable/server count to handle 1 server failure. Anything else I need to subtract? > > https://docs.clyso.com/tools/erasure-coding-calculator/ > > > > On Sat, 16 Nov 2024 at 10:04, Marc Schoechlin wrote: >

[ceph-users] Re: The effect of changing an osd's class

2024-11-18 Thread Roland Giesler
On 2024/11/17 18:12, Anthony D'Atri wrote: I see 5 OSDs with 0 CRUSH weight, is that intentional? Yes, I set the weight to 0 to ensure all the PGs are removed from them, since I'm removing those OSDs (worn-out SSDs). I think I found the problem. I had created a CRUSH rule called old_ssd (a
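For reference, a sketch of the weight-to-zero drain Roland describes (the OSD ID is a placeholder):

```
# Set the CRUSH weight to 0 so PGs migrate off the OSD
ceph osd crush reweight osd.<id> 0

# Watch the PG count on that OSD drop to zero before removing it
ceph osd df tree
```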

[ceph-users] Re: Pacific: mgr loses osd removal queue

2024-11-18 Thread Eugen Block
Hi, thanks for chiming in. I believe there's a slot limit of 10 for the queue (at least I recall reading that somewhere a while ago), so that would explain those 10 parallel drains you mention. I also don't have any such issues on customer clusters, that's why I still suspect the drives...