[ceph-users] Re: Error ENOENT: Module not found - ceph orch commands stopped working

2024-11-12 Thread Torkil Svensgaard
On 12-11-2024 09:29, Eugen Block wrote: > Hi Torkil, Hi Eugen > this sounds suspiciously like https://tracker.ceph.com/issues/67329 > Do you have the same (or similar) stack trace in the mgr log pointing to osd_remove_queue? > You seem to have removed some OSDs, which would fit the description as we

[ceph-users] Re: Ceph Reef 16 pgs not deep scrub and scrub

2024-11-12 Thread Eugen Block
Hi, what's your cluster status? Is there recovery/remapping going on? That can block (deep-)scrubs which you can allow during recovery (osd_scrub_during_recovery). There have been several threads about this warning on this list, for example [0]. You should find plenty of information in th
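For reference, a minimal sketch of those checks and the knob mentioned above, assuming a recent release; revert the setting once recovery has finished:

    ceph -s                                   # any recovery/backfill or remapped PGs?
    ceph health detail | grep -i scrub        # which PGs are overdue
    # allow (deep-)scrubs to run while recovery is in progress
    ceph config set osd osd_scrub_during_recovery true
    # revert afterwards
    ceph config set osd osd_scrub_during_recovery false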

[ceph-users] Error ENOENT: Module not found - ceph orch commands stopped working

2024-11-12 Thread Torkil Svensgaard
Hi, 18.2.4 here. After failing over the active manager, ceph orch commands seem to have stopped working. There's this in the mgr log: " 2024-11-12T08:16:30.136+ 7f1b2d887640 0 log_channel(audit) log [DBG] : from='client.2088861125 -' entity='client.admin' cmd=[{"prefix": "orch osd rm status"
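As a hedged first-pass check when orch commands fail with "Error ENOENT: Module not found", one can verify whether the cephadm module is still loaded and look at its recent log entries (standard Ceph CLI; exact output will vary):

    ceph mgr module ls | grep -A 2 cephadm    # is the module listed/enabled?
    ceph log last cephadm                     # recent cephadm log entries from the mgr
    ceph crash ls-new                         # any unarchived mgr/module crashes?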

[ceph-users] Re: Error ENOENT: Module not found - ceph orch commands stopped working

2024-11-12 Thread Eugen Block
Hi Torkil, this sounds suspiciously like https://tracker.ceph.com/issues/67329. Do you have the same (or similar) stack trace in the mgr log pointing to osd_remove_queue? You seem to have removed some OSDs, which would fit the description as well... Regards, Eugen Quoting Torkil Svensgaard

[ceph-users] Re: Cephadm Drive upgrade process

2024-11-12 Thread Eugen Block
Hi, > It would be nice if we could just copy the content to the new drive and go from there. That's exactly what we usually do: we add a new drive and 'pvmove' the contents of the failing drive. The worst thing so far is that the orchestrator still thinks it's /dev/sd{previous_letter}, but I
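A rough sketch of that pvmove-based swap, with hypothetical device and VG names (/dev/sdx failing, /dev/sdy replacement, ceph-db-vg holding the block.db LVs); the move runs online, so the OSDs can stay up:

    pvcreate /dev/sdy                 # prepare the replacement SSD
    vgextend ceph-db-vg /dev/sdy      # add it to the VG backing the block.db LVs
    pvmove /dev/sdx /dev/sdy          # migrate all extents off the failing PV
    vgreduce ceph-db-vg /dev/sdx      # drop the old PV from the VG
    pvremove /dev/sdx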

[ceph-users] Re: Error ENOENT: Module not found - ceph orch commands stopped working

2024-11-12 Thread Eugen Block
I think the Reef backport will be available in the next point release (18.2.5). Squid should already have it, if I'm not mistaken. But I'm not sure if you want to upgrade just to mitigate this issue. You can extract the faulty key: ceph config-key get mgr/cephadm/osd_remove_queue > osd_remove
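A sketch of that workaround, with a hypothetical file name; the exact field to strip from the JSON is described in the tracker issue above:

    ceph config-key get mgr/cephadm/osd_remove_queue > osd_remove_queue.json
    # edit the JSON and remove the field the module fails to deserialize
    ceph config-key set mgr/cephadm/osd_remove_queue -i osd_remove_queue.json
    ceph mgr fail                     # restart the active mgr so cephadm reloads the key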

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Frédéric Nass
Yep, we've been using RocksDB compression with Pacific for a few months. It helped a lot. Since we're talking about overspilling... Despite using bluestore_volume_selection_policy=use_some_extra with resharded RocksDB databases, we can still observe many OSDs overspilling from time to time (approximatel
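To see whether an OSD is spilling over, and as a hedged sketch of how RocksDB compression can be switched on (the annex option is an assumption for recent releases; on older ones the compression setting may need to be appended to bluestore_rocksdb_options instead):

    ceph health detail | grep -i spill            # overall spillover warnings
    ceph daemon osd.0 bluefs stats                # per-OSD DB/slow usage, if available (hypothetical id)
    # assumption: bluestore_rocksdb_options_annex is available in this release
    ceph config set osd bluestore_rocksdb_options_annex "compression=kLZ4Compression"
    # restart the OSD, then compact so existing SSTs are rewritten compressed
    ceph orch daemon restart osd.0
    ceph tell osd.0 compact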

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Anthony D'Atri
Yes, it improves the dynamic where only ~3, 30, 300, etc. GB of DB space can be used, and thus mitigates spillover. Previously a, say, 29GB DB device/partition would be like 85% unused. With recent releases one can also turn on DB compression, which should have a similar benefit. > On Nov 12,

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Alexander Patrakov
Hello Frédéric, The advice regarding 30/300 GB DB sizes is no longer valid. Since Ceph 15.2.8, due to the new default (bluestore_volume_selection_policy = use_some_extra), BlueStore no longer wastes the extra capacity of the DB device. On Tue, Nov 12, 2024 at 5:52 PM Frédéric Nass wrote: > > > > -
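A quick way to confirm what a given cluster is actually using (the OSD id is hypothetical); on releases where use_some_extra is already the default, this should simply report that value:

    ceph config get osd bluestore_volume_selection_policy
    ceph config show osd.0 bluestore_volume_selection_policy
    # set it explicitly if a pre-15.2.8 default is still in effect
    ceph config set osd bluestore_volume_selection_policy use_some_extra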

[ceph-users] Re: Strange container restarts?

2024-11-12 Thread Eugen Block
I don't see OSD-related exec_died messages in Pacific, but on Quincy they are also logged. But I can simply trigger them with a 'cephadm ls', so it's just the regular check, no need to worry about that. They're not triggered though if you only run 'cephadm ls --no-detail', but one would have to
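To reproduce this on a host, one could run the two variants by hand and then search the host's journal for the matching events (a generic sketch, not an exact log location):

    cephadm ls --no-detail > /dev/null     # light listing; per the thread, no exec_died expected
    cephadm ls > /dev/null                 # full listing; expect matching exec_died events afterwards
    journalctl --since "5 minutes ago" | grep -i exec_died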

[ceph-users] Re: Error ENOENT: Module not found - ceph orch commands stopped working

2024-11-12 Thread Torkil Svensgaard
On 12-11-2024 09:55, Eugen Block wrote: I think the Reef backport will be available in the next point release (18.2.5). Squid should already have it, if I'm not mistaken. But I'm not sure if you want to upgrade just to mitigate this issue. You can extract the faulty key: ceph config-key get

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Roland Giesler
On 2024/11/12 04:54, Alwin Antreich wrote: > Hi Roland, > On Mon, Nov 11, 2024, 20:16 Roland Giesler wrote: >> I have ceph 17.2.6 on a proxmox cluster and want to replace some SSDs that are end of life. I have some spinners that have their journals on SSD. Each spinner has a 50GB SSD LVM partition
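For the actual move, a hedged sketch using ceph-volume's migrate subcommand (available in 17.2.x), with hypothetical OSD id, fsid, and LV names; create the new LV on the replacement SSD first and stop the OSD for the duration of the copy:

    lvcreate -L 50G -n db-osd12 ceph-db-new          # new 50GB DB LV on the replacement SSD
    systemctl stop ceph-osd@12
    ceph-volume lvm migrate --osd-id 12 --osd-fsid <osd-fsid> --from db --target ceph-db-new/db-osd12
    systemctl start ceph-osd@12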

[ceph-users] Re: 9 out of 11 missing shards of shadow object in ERC 8:3 pool.

2024-11-12 Thread Eugen Block
Hi Robert, thanks for the update, it's great that the issue is resolved. Quoting Robert Kihlberg: Thanks Josh and Eugen, I did not manage to trace this object to an S3 object. Instead I read all files in the suspected S3 bucket and actually hit a bad one. Since we had a known good mirror I

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Anthony D'Atri

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Frédéric Nass
- On 12 Nov 24, at 8:51, Roland Giesler rol...@giesler.za.net wrote: > On 2024/11/12 04:54, Alwin Antreich wrote: >> Hi Roland, >> >> On Mon, Nov 11, 2024, 20:16 Roland Giesler wrote: >> >>> I have ceph 17.2.6 on a proxmox cluster and want to replace some SSDs >>> that are end of life. I

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Frédéric Nass
Hello Alexander, Thank you for clarifying this point. The documentation was not very clear about the 'improvements'. Does that mean that in the latest releases overspilling no longer occurs between the two thresholds of 30GB and 300GB? Meaning block.db can be 80GB in size without overspilling,

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Alexander Patrakov
Yes, that is correct. On Tue, Nov 12, 2024 at 8:51 PM Frédéric Nass wrote: > > Hello Alexander, > > Thank you for clarifying this point. The documentation was not very clear > about the 'improvements'. > > Does that mean that in the latest releases overspilling no longer occurs > between the tw

[ceph-users] Re: Move block.db to new ssd

2024-11-12 Thread Frédéric Nass
Hi Anthony, Did the RocksDB sharding end up improving the overspilling situation related to the level thresholds? I had only anticipated that it would reduce the impact of compaction. We resharded our OSDs' RocksDBs a long time ago (after upgrading to Pacific IIRC) and I think we could still
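For anyone wanting to check or (re)apply sharding on an existing OSD, a sketch with a hypothetical OSD id and a non-containerized data path; the sharding spec shown is the documented default for recent releases (check bluestore_rocksdb_cfs on your version), and the OSD must be stopped while ceph-bluestore-tool runs:

    systemctl stop ceph-osd@7
    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-7 show-sharding
    ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-7 \
        --sharding "m(3) p(3,0-12) O(3,0-13)=block_cache={type=binned_lru} L P" reshard
    systemctl start ceph-osd@7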