[ceph-users] Re: Help with "27 osd(s) are not reachable" when also "27 osds: 27 up.. 27 in"

2024-10-16 Thread Harry G Coin
Hi Frédéric, All was normal in v18; after 19.2 the problem remains even though the addresses are different: cluster_network global: fc00:1000:0:b00::/64, public_network global: fc00:1002:c7::/64. Also, after rebooting everything in sequence, it only complains that the 27 OSDs that are both up,
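A minimal sketch of checking which networks the OSDs actually registered with, assuming a cephadm-managed cluster and osd.0 as an arbitrary example ID:

    # Configured networks as seen by the OSDs
    ceph config get osd public_network
    ceph config get osd cluster_network

    # Addresses a specific OSD actually bound to (front = public, back = cluster)
    ceph osd metadata 0 | grep -E '"(front|back)_addr"'

    # Full text of the "not reachable" warning
    ceph health detail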

[ceph-users] Re: cephadm bootstrap ignoring --skip-firewalld

2024-10-16 Thread Kozakis, Anestis
Hi Adam, That might explain it. It was installed via dnf on a Fedora 40 machine. Anestis Kozakis Systems Administrator - Multi-Level Security Solutions P: + 61 2 6122 0205 M: +61 4 88 376 339 anestis.koza...@raytheon.com.au Raytheon Australia Cybersecu

[ceph-users] Re: cephadm bootstrap ignoring --skip-firewalld

2024-10-16 Thread Adam King
Where did the copy of cephadm you're using for the bootstrap come from? I'm aware of a bug around that flag (https://tracker.ceph.com/issues/54137), but that fix should have come in some time ago. I've seen some people, especially if they're using the distro's version of the cephadm package, end up w
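If the distro package turns out to be the problem, one hedged workaround is to bootstrap with the standalone cephadm published on download.ceph.com; the release number and mon IP below are placeholders:

    # Fetch the cephadm that matches the release you plan to deploy
    CEPH_RELEASE=19.2.0
    curl --silent --remote-name --location \
        https://download.ceph.com/rpm-${CEPH_RELEASE}/el9/noarch/cephadm
    chmod +x cephadm

    # Bootstrap without touching firewalld
    ./cephadm bootstrap --mon-ip 192.0.2.10 --skip-firewalld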

[ceph-users] Re: cephadm bootstrap ignoring --skip-firewalld

2024-10-16 Thread Kozakis, Anestis
As per below, I sent this a few weeks ago but didn't get a response from anyone. Does anyone have any advice, help, or a solution to the issue where cephadm bootstrap ignores the --skip-firewalld option? Anestis Kozakis Systems Administrator - Multi-Level Security Solutions P: + 61 2 6122 0205 M:

[ceph-users] Re: Ceph orchestrator not refreshing device list

2024-10-16 Thread Eugen Block
Glad to hear it worked out for you! Zitat von Bob Gibson : I’ve been away on vacation and just got back to this. I’m happy to report that manually recreating the OSD with ceph-volume and then adopting it with cephadm fixed the problem. Thanks again for your help Eugen! Cheers, /rjg On S

[ceph-users] Re: Ceph orchestrator not refreshing device list

2024-10-16 Thread Bob Gibson
I’ve been away on vacation and just got back to this. I’m happy to report that manually recreating the OSD with ceph-volume and then adopting it with cephadm fixed the problem. Thanks again for your help Eugen! Cheers, /rjg > On Sep 29, 2024, at 10:40 AM, Eugen Block wrote: > > EXTERNAL EMAI
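For anyone following along, a rough sketch of that recovery path; the device and OSD ID are placeholders, and the exact steps used here may have differed:

    # Recreate the OSD on the host with ceph-volume (legacy style)
    ceph-volume lvm create --data /dev/sdX --osd-id 7

    # Hand the legacy OSD over to cephadm management
    cephadm adopt --style legacy --name osd.7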

[ceph-users] Re: Reef osd_memory_target and swapping

2024-10-16 Thread Anthony D'Atri
> Unfortunately, it's not quite that simple. At least until Mimic, but > potentially later too, there was this behavior that either the OSD's allocator > did not release or the kernel did not reclaim unused pages if there was > sufficient total memory available, which implied pointless swapping.

[ceph-users] Re: Ubuntu 24.04 LTS Ceph status warning

2024-10-16 Thread David Orman
https://bugs.launchpad.net/ubuntu/+source/libpod/+bug/2040483 https://bugs.launchpad.net/ubuntu/+source/containerd-app/+bug/2065423 I wonder if you're running into fallout from the above bugs. I believe a fix should be rolling out soon, according to those bugs. We ran into a multitude of seemingl
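A quick sketch of checking whether a host is still on an affected build; the fixed version numbers live in the Launchpad entries and are not reproduced here:

    # Installed container runtime packages on Ubuntu 24.04
    dpkg -l | grep -E 'podman|containerd'

    # Whether a newer build is already offered by the archive
    apt list --upgradable 2>/dev/null | grep -E 'podman|containerd'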

[ceph-users] Re: Ubuntu 24.04 LTS Ceph status warning

2024-10-16 Thread Eugen Block
Is apparmor configured differently on those hosts? Or is it running only on the misbehaving host? Zitat von Dominique Ramaekers : 'ceph config get mgr container_image' gives quay.io/ceph/ceph@sha256:200087c35811bf28e8a8073b15fa86c07cce85c575f1ccd62d1d6ddbfdc6770a => OK 'ceph health detail
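A rough sketch of the comparison being suggested, run on both a healthy host and the misbehaving one (aa-status comes from apparmor-utils and needs root):

    # Is AppArmor active, and which profiles are loaded?
    systemctl is-active apparmor
    aa-status | head -n 20

    # Profiles that could interfere with containerized Ceph daemons
    aa-status | grep -iE 'containerd|podman|docker'

    # Recent kernel-side denials, if any
    journalctl -k --since "1 hour ago" | grep -i 'apparmor.*denied'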

[ceph-users] Re: Reef osd_memory_target and swapping

2024-10-16 Thread Frank Schilder
> If you set appropriate OSD memory targets, set kernel swappiness to > something like 10-20, and properly pin your OSDs in a system with >1 NUMA > node so that they're evenly distributed across NUMA nodes, your kernel will > not swap because it simply has no reason to. Unfortunately, it's not quite
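For reference, a hedged sketch of the tuning described above; the 8 GiB target, the swappiness value, and osd.3 are illustrative, not recommendations from this thread:

    # Cap OSD memory use (value in bytes; 8 GiB is only an example)
    ceph config set osd osd_memory_target 8589934592

    # Make the kernel less eager to swap
    sysctl -w vm.swappiness=10
    echo 'vm.swappiness = 10' > /etc/sysctl.d/90-swappiness.conf

    # Optionally pin a single OSD to a NUMA node instead of relying on auto-affinity
    ceph config set osd.3 osd_numa_node 0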

[ceph-users] Re: Ubuntu 24.04 LTS Ceph status warning

2024-10-16 Thread Dominique Ramaekers
I pulled the v19 image and cleaned up the unused images. The problem remains. So on host hvs004 I entered the commands 'cephadm gather-facts' and 'cephadm ceph-volume lvm list' and I got sensible output without errors. Could it be a Python issue with the encoding of JSON input? But why this happens
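One small experiment to test the encoding theory directly on hvs004 is to pipe the same cephadm output through Python's own JSON parser; a failure here would point at the data rather than at the mgr module:

    cephadm gather-facts | python3 -m json.tool > /dev/null && echo facts-ok
    cephadm ceph-volume lvm list --format json | python3 -m json.tool > /dev/null && echo lvm-ok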

[ceph-users] Re: "ceph orch" not working anymore

2024-10-16 Thread Malte Stroem
Hi Laimis, that did not work; ceph orch still does not work. Best, Malte On 16.10.24 14:12, Malte Stroem wrote: Thank you, Laimis. And you got the same error message? That's strange. In the meantime I am checking for connected clients. No Kubernetes or CephFS, but RGWs. Best, Malte On

[ceph-users] Re: osd won't start

2024-10-16 Thread Erwin Bogaard
Thanks, I set the logging to 20 for all three, and had to hard-kill the VM (ceph-osd won't stop in any way). Now the log ends in the following: 2024-10-16T14:23:11.855+0200 7f276dc0d640 20 bluefs _replay 0x119f5: op_dir_link db/019175.sst to 18791 2024-10-16T14:23:11.855+0200 7f276dc0d640 20 b

[ceph-users] Re: osd won't start

2024-10-16 Thread Igor Fedotov
Hi Erwin, you might want to increase the OSD logging level to see what's happening. I would suggest setting debug-bdev, debug-bluefs and debug-bluestore to 10 (or even 20). But be cautious - this can result in a huge log... Thanks, Igor On 10/16/2024 3:01 PM, Erwin Bogaard wrote: Hi, we're expe
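A sketch of raising those levels for a single OSD (osd.12 is a placeholder); either set them centrally so the daemon picks them up at startup, or pass them on the command line for a one-off foreground run:

    ceph config set osd.12 debug_bdev 20
    ceph config set osd.12 debug_bluefs 20
    ceph config set osd.12 debug_bluestore 20

    # one-off foreground run with the same levels
    ceph-osd -f --cluster ceph --id 12 \
        --debug-bdev=20 --debug-bluefs=20 --debug-bluestore=20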

[ceph-users] Re: "ceph orch" not working anymore

2024-10-16 Thread Malte Stroem
Thank you, Laimis. And you got the same error message? That's strange. In the meantime I am checking for connected clients. No Kubernetes or CephFS, but RGWs. Best, Malte On 16.10.24 14:01, Laimis Juzeliūnas wrote: Hi Malte, We have faced this recently when upgrading to Squid from latest

[ceph-users] Re: "ceph orch" not working anymore

2024-10-16 Thread Laimis Juzeliūnas
Hi Malte, We have faced this recently when upgrading to Squid from latest Reef. As a temporary workaround we disabled the balancer with ‘ceph balancer off’ and restarted mgr daemons. We are suspecting older clients (from Kubernetes RBD mounts as well as CephFS mounts) on servers with incompatib
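Sketched out, the workaround looks roughly like this; 'ceph mgr fail' with no argument hands control to a standby mgr, which is one way to restart the active one when the orchestrator itself is unusable:

    # Stop the balancer module from producing new plans
    ceph balancer off
    ceph balancer status

    # Bounce the active mgr so its modules reload
    ceph mgr fail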

[ceph-users] osd won't start

2024-10-16 Thread Erwin Bogaard
Hi, we're experiencing issues with a few OSDs. They had a crash and now won't start anymore. Nothing seems wrong with them, but they keep hanging with apparent 100% I/O wait on the machine when starting the OSD. This is on Ceph 18.2.4. This is the log (edited a bit, as it's too long): 2024-10-1

[ceph-users] "ceph orch" not working anymore

2024-10-16 Thread Malte Stroem
Hi there, I have a three-node cluster with the latest Ceph 18 and cephadm as the orchestrator. One node is dead, so the cluster is not healthy but working. Now: Working: cephadm ls and other commands with cephadm... However: ceph orch... does not work at all: Error ENOENT: No orchestrator confi
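When 'ceph orch' returns "Error ENOENT: No orchestrator configured", the backend setting itself is worth a look; a minimal sketch, which only helps if the module or backend really was lost (that may not be the root cause here):

    # Is the cephadm mgr module enabled, and is a backend selected?
    ceph mgr module ls | grep -i cephadm
    ceph orch status

    # Re-enable and re-select the cephadm backend if needed
    ceph mgr module enable cephadm
    ceph orch set backend cephadm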

[ceph-users] Re: Reef osd_memory_target and swapping

2024-10-16 Thread Tyler Stachecki
On Tue, Oct 15, 2024, 1:38 PM Anthony D'Atri wrote: > > > > On Oct 15, 2024, at 1:06 PM, Dave Hall wrote: > > > > Hello. > > > > I'm seeing the following in the Dashboard -> Configuration panel > > for osd_memory_target: > > > > Default: > > 4294967296 > > > > Current Values: > > osd: 979765943
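A hedged sketch of where those Dashboard numbers come from; on cephadm deployments the per-daemon values are often written by the autotuner rather than set by hand, and osd.0 below is just an example:

    # Value applied to the osd section (falls back to the 4 GiB default)
    ceph config get osd osd_memory_target

    # Is cephadm autotuning per-daemon targets?
    ceph config get osd osd_memory_target_autotune

    # What one daemon is actually running with
    ceph config show osd.0 osd_memory_target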

[ceph-users] Re: Help with "27 osd(s) are not reachable" when also "27 osds: 27 up.. 27 in"

2024-10-16 Thread Frédéric Nass
Hi Harry, Do you have a 'cluster_network' set to the same subnet as the 'public_network', as in the issue [1]? It doesn't make much sense to set up a cluster_network when it's not different from the public_network. Maybe that's what triggers the OSD_UNREACHABLE recently coded here [2] (even thoug