[ceph-users] Re: Ceph Logging Configuration and "Large omap objects found"

2024-08-12 Thread Janek Bevendorff
Thanks all. "ceph log last 10 warn cluster": that outputs nothing for me. Any docs about this? I don't have much to comment about logging, I feel you though. I just wanted to point out that the details about the large omap object should be in the (primary) OSD log, not in the MON log: The me
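
For illustration of that point, the large-omap details can usually be found by searching the primary OSD's own log rather than the MON log. A minimal sketch, assuming default log locations and a non-cephadm layout; the OSD id is a placeholder:

    # on the host of the acting primary OSD of the affected PG
    grep -i "large omap object found" /var/log/ceph/ceph-osd.<id>.log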

[ceph-users] Re: Ceph Logging Configuration and "Large omap objects found"

2024-08-12 Thread Eugen Block
Hi, > ceph log last 10 warn cluster > That outputs nothing for me. Any docs about this? Not any good docs, I'm afraid. At some point I stumbled across 'ceph log last cephadm' and played around a bit to see what else you can get from that. The help command shows some useful information: log
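
For reference, a minimal sketch of the 'ceph log last' forms being discussed (the count, level and channel values below are only examples; check the built-in help on your release for the exact channels and levels available):

    # last 50 cluster-log entries at warning level
    ceph log last 50 warn cluster

    # recent messages from the cephadm channel
    ceph log last 20 info cephadm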

[ceph-users] Re: Ceph Logging Configuration and "Large omap objects found"

2024-08-12 Thread Janek Bevendorff
> That's where the 'ceph log last' commands should help you out, but I don't know why you don't see it, maybe increase the number of lines to display or something? BTW, which ceph version are we talking about here? Reef. I tried 'ceph log last 100 debug cluster' and that gives me the usual DBG

[ceph-users] RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Oliver Freyermuth
Dear Cephalopodians, we've successfully operated a "good old" Mimic cluster with primary RBD images, replicated via journaling to a "backup cluster" with Octopus, for the past years (i.e. one-way replication). We've now finally gotten around to upgrading the cluster with the primary images to Oct
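
As a sketch of how the journal and mirroring state of a single image can be inspected in such a setup (pool and image names are placeholders; assumes journal-based mirroring is enabled on the image):

    rbd info mypool/myimage                            # confirm the journaling feature is set
    rbd journal status --pool mypool --image myimage   # journal positions of the registered clients
    rbd mirror image status mypool/myimage             # replay state as seen by rbd-mirror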

[ceph-users] Re: Please guide us in identifying the cause of the data miss in EC pool

2024-08-12 Thread Best Regards
Hi Frédéric, thanks for your advice and suggestions. The failure to identify the root cause of the data loss has a certain impact on subsequent improvement measures. min_size does indeed need to be changed to K+1. We will also reevaluate the disaster recovery situation to better handle extreme
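
A minimal sketch of the min_size adjustment mentioned above (the pool name and the k=4, m=2 profile are illustrative only; for EC pools the usual recommendation is min_size = k+1):

    ceph osd pool get ec-data min_size
    ceph osd pool set ec-data min_size 5   # k+1 for a k=4, m=2 profile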

[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Ilya Dryomov
On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth wrote: > > Dear Cephalopodians, > > we've successfully operated a "good old" Mimic cluster with primary RBD > images, replicated via journaling to a "backup cluster" with Octopus, for the > past years (i.e. one-way replication). > We've now fina

[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Oliver Freyermuth
On 12.08.24 at 11:09, Ilya Dryomov wrote: On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth wrote: Dear Cephalopodians, we've successfully operated a "good old" Mimic cluster with primary RBD images, replicated via journaling to a "backup cluster" with Octopus, for the past years (i.e. on

[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Ilya Dryomov
On Mon, Aug 12, 2024 at 11:28 AM Oliver Freyermuth wrote: > > Am 12.08.24 um 11:09 schrieb Ilya Dryomov: > > On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth > > wrote: > >> > >> Dear Cephalopodians, > >> > >> we've successfully operated a "good old" Mimic cluster with primary RBD > >> images,

[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Oliver Freyermuth
On 12.08.24 at 12:16, Ilya Dryomov wrote: On Mon, Aug 12, 2024 at 11:28 AM Oliver Freyermuth wrote: On 12.08.24 at 11:09, Ilya Dryomov wrote: On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth wrote: Dear Cephalopodians, we've successfully operated a "good old" Mimic cluster with primar

[ceph-users] Re: Ceph Logging Configuration and "Large omap objects found"

2024-08-12 Thread Eugen Block
I just played a bit more with the 'ceph log last' command; it doesn't have a large retention time, the messages get cleared out quickly, I suppose because they haven't changed. I'll take a closer look at whether and how that can be handled properly. Quoting Janek Bevendorff: That's where the

[ceph-users] Search for a professional service to audit a CephFS infrastructure

2024-08-12 Thread Fabien Sirjean
Hello, In a professional context, I'm looking for someone with strong CephFS expertise to help us audit our infrastructure. We prefer an on-site audit, but are open to working remotely, and can provide any documentation or information required. Please note that we are not currently in a block

[ceph-users] Re: Identify laggy PGs

2024-08-12 Thread Boris
Hmm, will try that. Thanks. On Sat, 10 Aug 2024 at 13:33, Szabo, Istvan (Agoda) <istvan.sz...@agoda.com> wrote: > We are also facing that. Have a look in the logs for reported failed OSDs. I count the occurrences and offline-compact those; it can help for a while. Normally for us compacting
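
A sketch of the compaction approaches referred to above (the OSD id is a placeholder; the offline variant assumes a non-cephadm deployment with the default data path):

    # online compaction of a running OSD (can impact client I/O while it runs)
    ceph tell osd.<id> compact

    # offline compaction with the OSD stopped
    systemctl stop ceph-osd@<id>
    ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-<id> compact
    systemctl start ceph-osd@<id>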

[ceph-users] Cephadm and the "--data-dir" Argument

2024-08-12 Thread Alex Hussein-Kershaw (HE/HIM)
Hi Folks, I'm trying to use the --data-dir argument of cephadm when bootstrapping a Storage Cluster. It looks like exactly what I need: my use case is that I want to put the data files onto a persistent disk, such that I can blow away my VMs while retaining the files. Everything looks good and
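
For context, --data-dir is a global flag of the cephadm binary and is given before the subcommand. A sketch only; the path and MON IP are placeholders, and as the replies below note, the setting is not currently carried through to the cephadm mgr module:

    cephadm --data-dir /mnt/persistent/ceph bootstrap --mon-ip 192.0.2.10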

[ceph-users] Re: Cephadm and the "--data-dir" Argument

2024-08-12 Thread Adam King
Looking through the code it doesn't seem like this will work currently. I found that the --data-dir arg to the cephadm binary was from the initial implementation of the cephadm binary (so early that it was actually called "ceph-daemon" at the time rather than "cephadm") but it doesn't look like tha

[ceph-users] Re: [EXTERNAL] Re: Cephadm and the "--data-dir" Argument

2024-08-12 Thread Alex Hussein-Kershaw (HE/HIM)
Thanks Adam - noted, I expect we can make something else work to meet our needs here. I don't know just how many monsters may be under the bed here - but if it's a fix that's appropriate for someone who doesn't know the Ceph codebase (me) I'd be happy to have a look at implementing a fix. Bes

[ceph-users] Re: [EXTERNAL] Re: Cephadm and the "--data-dir" Argument

2024-08-12 Thread Adam King
I think if it were locked in from bootstrap time it might not be that complicated. We'd just have to store the directory paths in some persistent location the module can access and make the cephadm mgr module use them when calling out to the binary for any further actions. This does have the slight

[ceph-users] Important Community Updates [Ceph Developer Summit, Cephalocon]

2024-08-12 Thread Noah Lehman
Hi Ceph community, I have two important updates to share with you about the Ceph Developer Summit and Cephalocon 2024. *Ceph Developer Summit* The Ceph Developer Summit has been extended until August 20th. Find everything you need to know about the event, including the program and other importan

[ceph-users] Stable and fastest ceph version for RBD cluster.

2024-08-12 Thread Özkan Göksu
Hello folks! I built a cluster in 2020 and it has been working great with Nautilus 14.2.16 for the past 4 years. I have 1000+ RBD volumes for VMs, backed by Samsung MZ7LH3T8HMLT drives. Now I want to upgrade the Ceph version with a fresh installation, and I'd like to get your opinion on which vers

[ceph-users] Re: Stable and fastest ceph version for RBD cluster.

2024-08-12 Thread Mark Nelson
Hi Özkan, I've written a couple of articles that might be helpful: https://ceph.io/en/news/blog/2023/reef-osds-per-nvme/ https://ceph.io/en/news/blog/2023/reef-freeze-rbd-performance/ https://ceph.io/en/news/blog/2023/reef-freeze-rgw-performance/ https://ceph.io/en/news/blog/2024/ceph-a-journey-

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-12 Thread Laura Flores
Hey @Adam King, can you take a look at this tracker? https://tracker.ceph.com/issues/66883#note-26 I summarized the full issue in the last note. I believe it is an orch problem blocking the upgrade tests, and I would like to hear your thoughts. On Fri, Aug 9, 2024 at 9:14 AM Adam King wrote: >