Hi,
I haven't done this in production yet either, but in a test cluster I
threw away that config-key and it just got regenerated. So I suppose
one could try that without any big risk.
Just a note, this should also work (get instead of dump):
ceph config-key get mgr/cephadm/host.ceph-osd31.
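For reference, the related commands look roughly like this (the key name is
the one from this thread; adjust it for your host, and note that config-key rm
is the "throw away" step mentioned above):

  ceph config-key dump | grep mgr/cephadm/host
  ceph config-key get mgr/cephadm/host.ceph-osd31
  ceph config-key rm mgr/cephadm/host.ceph-osd31    # got regenerated in my test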
Hi all
We successfully deployed a stretched cluster and all is working fine. But is it
possible to assign the active MDS services in one DC and the standby-replay in
the other?
We're running 18.2.4, deployed via cephadm. Using 4 MDS servers with 2 active
MDS on pinned ranks and 2 in standby-replay
This is a common error on my system (Pacific).
It appears that there is internal confusion as to where the crash
support stuff lives - whether it's new-style (cephadm-administered, under
/var/lib/ceph/<fsid>) or legacy style (/var/lib/ceph). One way to fake it
out was to manually create a minimal c
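For anyone checking which layout a host actually has, something like this
(with <fsid> replaced by your cluster fsid) shows both candidate locations:

  ls /var/lib/ceph/crash            # legacy location
  ls /var/lib/ceph/<fsid>/crash     # cephadm-managed location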
No, unfortunately this needs to be done at a higher level and is not
included in Ceph right now. Rook may be able to do this, but I don't think
cephadm does.
Adam, is there some way to finagle this with pod placement rules (i.e.,
tagging nodes as mds and mds-standby, and then assigning special mds co
Yes, with Rook this is possible by adding zone anti-affinity for the MDS
pods.
Travis
On Tue, Oct 29, 2024 at 3:35 PM Gregory Farnum wrote:
> No, unfortunately this needs to be done at a higher level and is not
> included in Ceph right now. Rook may be able to do this, but I don't think
> cephadm does.
I was running into that as well. Setting
`osd_mclock_override_recovery_settings` [1] to true allowed me to manage
osd_max_backfills again and get recovery going again. It's on
my to-do list to understand mClock profiles, but resizing PGs was a
nightmare with it. Changing to override the
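For reference, the settings I mean can be applied roughly like this (the
backfill value here is just an example, not a recommendation):

  ceph config set osd osd_mclock_override_recovery_settings true
  ceph config set osd osd_max_backfills 4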
Take care when reading the output of "ceph osd metadata". When you are
running the OSD as an administered service, it's running in a container,
and a container behaves much like a miniature VM in this respect. So, for
example, it may report your OS as "CentOS Stream 8" even if your actual
machine is running Ubuntu.
The big
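As a quick illustration (osd.0 is just an example id, and field names can
vary a bit between releases), comparing the reported distro with the
container image usually makes this obvious:

  ceph osd metadata 0 | grep -E 'distro|container_image|hostname'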
Hey Cephers,
I was investigating another issue when I stumbled across this. I am
not sure if this is "as intended" or faulty. This is a cephadm cluster
on Reef 18.2.4, containerized with Docker.
The ceph-crash module states that it can't find its key and that it can't
access RADOS.
Pre-
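In case it's useful, a rough way to check for and (re)create the key, assuming
the usual client.crash.<hostname> naming that cephadm uses:

  ceph auth ls | grep crash
  ceph auth get-or-create client.crash.$(hostname) mon 'profile crash' mgr 'profile crash'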
Hi,
I'm not aware of any service settings that would allow that.
You'll have to monitor each MDS's state and restart any non-local active MDS
to reverse the roles.
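A very rough sketch of what that check could boil down to (with a hypothetical
filesystem name 'cephfs'):

  ceph fs status cephfs        # shows which MDS daemon holds each active rank
  ceph mds fail cephfs:0       # fail rank 0 so a standby(-replay) daemon takes over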
Regards,
Frédéric.
- On 29 Oct 24, at 14:06, Sake Ceph c...@paulusma.eu wrote:
> Hi all
> We successfully deployed a stretched cluster
I hope someone on the development team can shed some light on this. I will
search the tracker to see if someone else has made a request about this.
> On 29-10-2024 16:02 CET, Frédéric Nass wrote:
>
>
> Hi,
>
> I'm not aware of any service settings that would allow that.
>
> You'll have to monitor each MDS's state and restart any non-local active MDS
> to reverse the roles.
Tim,
Thank you for your guidance. Your points are completely understood. It
was more that I couldn't figure out why the Dashboard was telling me that
the destroyed OSD was still using /dev/sdi when the physical disk with that
serial number was at /dev/sdc, and when another OSD was also reporting
But you don't get to choose which one is active and which one is standby, as
these are states that swap over time, not configurations - or do you?
I mean, there's no way to tell Rook 'I want this one to be active, preferably'
and have the Rook operator monitor the MDSs and restart the non-local one if
Hi all,
We used 'rados bench' to test 4k object read and write operations.
Our cluster is Pacific: one node, 11 BlueStore OSDs; DB and WAL share the block
device. The block device is an HDD.
1. testing 4k write with command 'rados bench 120 write -t 16 -b 4K -p
rep3datapool --run-name 4kreadwrite --n
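For comparison, a full write-then-sequential-read cycle usually looks roughly
like this (pool name taken from this thread; --no-cleanup keeps the written
objects so the seq test has data to read):

  rados bench -p rep3datapool 120 write -b 4096 -t 16 --no-cleanup
  rados bench -p rep3datapool 120 seq -t 16
  rados -p rep3datapool cleanup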
The good Mr. Nelson and others may have more to contribute, but a few thoughts:
* Running for 60 or 120 seconds isn’t quantitative: rados bench typically
exhibits a clear ramp-up; watch the per-second stats.
* Suggest running for 10 minutes, three times in a row and averaging the results
* How m
rep3datapool pg_num is 512; the average number of PG replicas per OSD is 139.
Scrubs, the balancer, and the PG autoscaler were disabled.
RAM is 128 GB; swap is 0.
From: Anthony D'Atri
Date: 2024-10-30 12:03
To: Louisa
CC: ceph-users
Subject: Re: [ceph-users] why performance difference between 'rados bench seq'
and '