[ceph-users] Re: How to speed up OSD deployment process

2024-11-08 Thread Tim Holloway
We're not optimistic. It's possible that Ceph will run this stuff in serial no matter how you submit it. But here's some useful stuff on getting Ansible to run things in parallel: https://toptechtips.github.io/2023-06-26-ansible-parallel/ https://docs.ansible.com/ansible/latest/playbook_guide/pl

[ceph-users] Re: How to speed up OSD deployment process

2024-11-08 Thread YuFan Chen
Hi Tim, Yes, I use Ansible and an OSD spec YAML to deploy the Ceph cluster, and I use LVM to manage the HDDs and NVMe SSDs. The OSD spec looks like this: --- service_type: osd service_id: osd.hybrid placement: label: 'osd' data_devices: paths: - /dev/hybrid/bdev01 ... - /dev/hybrid/bdev32 d

[ceph-users] multisite sync issue with bucket sync

2024-11-08 Thread Christopher Durham
I have a 2-site multisite configuration on cdnh 18.2.4 on EL9. After system updates, we discovered that a particular bucket had several thousand objects missing, which the other side had. Newly created objects were being replicated just fine. I decided to 'restart' syncing that bucket. Here is

[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-08 Thread Laura Flores
I finished reviewing the rados runs: https://tracker.ceph.com/projects/rados/wiki/QUINCY#httpstrackercephcomissues68643 @Adam King I filed a new tracker, https://tracker.ceph.com/issues/68880, for the orchestrator. It looks like a test issue, but can you take a look? I am checking with @Radoslaw

[ceph-users] Re: How to speed up OSD deployment process

2024-11-08 Thread Tim Holloway
I've worked with systems much smaller than that where I would have LOVED to get everything up in only an hour. Kids these days. 1. Have you tried using a spec file? Might help, might not. 2. You could always do the old "&" Unix shell operator for asynchronous commands. I think you could get An
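As a rough illustration of the "&" idea in item 2, here is a minimal Python sketch that launches several commands at once and waits for all of them. The helper name `run_parallel` and the placeholder commands are hypothetical, not part of Ceph or cephadm, and as noted elsewhere in this thread Ceph may still serialize the work internally.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_parallel(commands, max_workers=4):
    """Run each command (a list of argv strings) concurrently.

    Returns the exit codes in the same order as `commands`.
    """
    def run_one(argv):
        # check=False: report the exit code instead of raising on failure.
        return subprocess.run(argv, check=False).returncode

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in input order, not completion order.
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    # Placeholder commands only (assumption); substitute the real
    # per-host invocations you would otherwise background with "&".
    demo = [[sys.executable, "-c", "print('host-%d done')" % i] for i in range(3)]
    print(run_parallel(demo))
```

Threads are enough here because each worker only waits on a child process; a non-zero entry in the returned list marks a command that failed.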

[ceph-users] Re: failed to load OSD map for epoch 2898146, got 0 bytes

2024-11-08 Thread Frank Schilder
Hi Dan, I have collected a 134M log file (11M compressed) of the startup with debug_osd=20/20. Do you have access to the upload area of the ceph-devs (the ceph-post-file destination)? If not, any preferred way I can send it to you? To execute the ceph-objectstore-tool mount command it looks lik

[ceph-users] Re: OSD META Capacity issue of rgw ceph cluster

2024-11-08 Thread Frédéric Nass
You may be facing this "BlueFS files take too much space" bug [1]. Have a look at the figures in the PR [2]. The fix hasn't been merged to Reef yet. Regards, Frédéric. [1] https://tracker.ceph.com/issues/68385 [2] [ https://github.com/ceph/ceph/pull

[ceph-users] Re: 1 stray daemon(s) not managed by cephadm

2024-11-08 Thread Eugen Block
Tim, not everything is a double osd ;-) The message is about a stray mon daemon on a host called ceph-osd3. Check out ‘ceph orch ls mon --export’ to see where the monitors are supposed to be running. Was that an adopted cluster or freshly built with cephadm? Is a different mon daemon running s

[ceph-users] How to speed up OSD deployment process

2024-11-08 Thread YuFan Chen
Hi, I’m setting up a 6-node Ceph cluster using Ceph Squid. Each node is configured with 32 OSDs (32 HDDs and 8 NVMe SSDs for db_devices). I’ve created an OSD service specification and am using cephadm to apply the configuration. The deployment of all 192 OSDs takes about an hour to complete. How

[ceph-users] Re: 1 stray daemon(s) not managed by cephadm

2024-11-08 Thread Tim Holloway
Check the /var/lib/ceph directory on host ceph-osd3. If there is an osd.3 directory there, and a /var/lib/ceph/{fsid}/osd.3 directory, then you are a member of the schizophrenic OSD club. Congratulations, your membership badge and certificate of membership will be arriving shortly. I think you

[ceph-users] Strange container restarts?

2024-11-08 Thread Jan Marek
Hello, we have a Ceph cluster consisting of 12 hosts, and every host has 12 NVMe "disks". On most of these hosts (9 of 12) the logs show errors; see the attached file. We tried to investigate this problem and found the following: 1) On every host there is only one OSD. Thus it's not a problem in ver

[ceph-users] Re: quincy v17.2.8 QE Validation status

2024-11-08 Thread Nizamudeen A
Hi Yuri, I see this one is failing the Jenkins check: https://github.com/ceph/ceph/pull/59142 So we can ignore that and cherry-pick the other 2 dashboard PRs for now, since they are the important ones. Regards, Nizam On Thu, 7 Nov 2024, 03:13 Yuri Weinstein, wrote: > Hi Nizam > > https://tracker.cep

[ceph-users] Re: osd removal leaves 'stray daemon'

2024-11-08 Thread Frédéric Nass
Hi, I added some more logs to the bug tracker [1]. Could this be related to the 60s (hard coded) limit in def _check_for_strays(self) [2]? Regards, Frédéric. [1] https://tracker.ceph.com/issues/67018 [2] https://github.com/ceph/ceph/blob/f55fc4599a6c0da0f4bd2f3ecd2122e603ad94dd/src/pybind/mgr/