[ceph-users] Re: Bug with Cephadm module osd service preventing orchestrator start

2024-08-19 Thread Eugen Block
Hi, what is the output of this command? ceph config-key get mgr/cephadm/osd_remove_queue I just tried to cancel a drain on a small 18.2.4 test cluster; it went well, though. After scheduling the drain, the mentioned key looks like this: # ceph config-key get mgr/cephadm/osd_remove_queue
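On a cluster with a single drain scheduled, that key holds a JSON list of removal entries. A rough sketch of its shape, with field names assumed from the cephadm source rather than quoted from this thread:

    ceph config-key get mgr/cephadm/osd_remove_queue
    # Roughly (one object per scheduled removal; values here are hypothetical):
    # [{"osd_id": 5, "started": true, "draining": true, "stopped": false,
    #   "replace": false, "force": false, "zap": false, "hostname": "host1",
    #   "drain_started_at": "2024-08-19T10:00:00.000000Z"}]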

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Giovanna Ratini
Hello Eugen,

root@kube-master02:~# k ceph -s
Info: running 'ceph' command with args: [-s]
  cluster:
    id: 3a35629a-6129-4daf-9db6-36e0eda637c7
    health: HEALTH_WARN
      32 pgs not deep-scrubbed in time
      32 pgs not scrubbed in time
  services:
    mon: 3 daemons, qu

[ceph-users] Re: memory leak in mds?

2024-08-19 Thread Dario Graña
Thank you Frédéric and Venky for your answers. I will try to do some tests before changing the production environment. On Mon, Aug 19, 2024 at 8:53 AM Venky Shankar wrote: > [cc Xiubo] > > On Fri, Aug 16, 2024 at 8:10 PM Dario Graña wrote: > > > > Hi all, > > We’re experiencing an issue with Ce

[ceph-users] Re: memory leak in mds?

2024-08-19 Thread Dario Graña
I was trying to downgrade the *ceph-common* package for a client running Alma 9, the same OS we use in production. I attempted to install the *ceph-common-17.2.6* package, since the trouble began with 17.2.7, but I hit a dependency problem: nothing provides libthrift-0.14.0.so()(64bit) needed by ceph-c
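A generic way to track down where the missing library could come from, assuming dnf on Alma 9 (the thread does not name the providing repo):

    # Ask dnf which package, if any, provides the missing soname:
    dnf provides 'libthrift-0.14.0.so()(64bit)'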

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-19 Thread Venky Shankar
Hi Brad, On Fri, Aug 16, 2024 at 8:59 AM Brad Hubbard wrote: > > On Thu, Aug 15, 2024 at 11:50 AM Brad Hubbard wrote: > > > > On Tue, Aug 6, 2024 at 6:33 AM Yuri Weinstein wrote: > > > > > > Details of this release are summarized here: > > > > > > https://tracker.ceph.com/issues/67340#note-1 >

[ceph-users] Prometheus and "404" error on console

2024-08-19 Thread Tim Holloway
Although I'm seeing this in Pacific, it appears to be a perennial issue with no well-documented solution. The dashboard home screen is flooded with popups saying "404 - Not Found Could not reach Prometheus's API on http://ceph1234.mydomain.com:9095/api/v1" If I were a slack-jawed PHB casually wan
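A first check worth making: verify which Prometheus URL the dashboard module has configured and repoint it if stale. A minimal sketch, with the URL assumed from the popup above:

    # Show the URL the dashboard currently queries:
    ceph dashboard get-prometheus-api-host
    # Repoint it at the live Prometheus instance:
    ceph dashboard set-prometheus-api-host http://ceph1234.mydomain.com:9095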

[ceph-users] Re: Prometheus and "404" error on console

2024-08-19 Thread Daniel Brown
I’ve seen similar. Have been wondering if it would be possible to either set up a LoadBalancer or something like "keepalived" to provide a "VIP" which could move between nodes to support the dashboard (and prometheus, Grafana, etc). I do see notes about HA Proxy in the docs, but haven’t gott
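A minimal keepalived sketch for such a VIP; the interface name, address, and priority are hypothetical and would need to match the actual nodes:

    # /etc/keepalived/keepalived.conf (use state BACKUP and a lower priority on standbys)
    vrrp_instance CEPH_DASHBOARD {
        state MASTER
        interface eth0
        virtual_router_id 51
        priority 100
        advert_int 1
        virtual_ipaddress {
            192.168.1.100/24    # the VIP clients use for the dashboard
        }
    }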

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Eugen Block
What happens when you disable snaptrimming entirely (ceph osd set nosnaptrim)? So the load on your cluster seems low, but are the OSDs heavily utilized? Have you checked iostat? Quoting Giovanna Ratini: Hello Eugen, root@kube-master02:~# k ceph -s Info: running 'ceph' command with arg
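A sketch of that check sequence (iostat comes from the sysstat package; the 5-second interval is arbitrary):

    # Pause snapshot trimming cluster-wide, observe, then re-enable:
    ceph osd set nosnaptrim
    iostat -x 5          # watch per-device utilization on the OSD hosts
    ceph osd unset nosnaptrim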

[ceph-users] cephadm module fails to load with "got an unexpected keyword argument"

2024-08-19 Thread Alex Sanderson
Hi everyone, I recently upgraded from Quincy to Reef v18.2.4 and my dashboard and mgr systems have been broken since. Since the upgrade I have been slowly removing and zapping OSDs that still had the 64k "bluestore_bdev_block_size", and decided to have a look at the dashboard problem. I restarted

[ceph-users] Re: cephadm module fails to load with "got an unexpected keyword argument"

2024-08-19 Thread Eugen Block
Hi, there's a tracker issue [0] for that. I was assisting with the same issue in a different thread [1]. Thanks, Eugen [0] https://tracker.ceph.com/issues/67329 [1] https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/SRJPC5ZYTPXF63AKGIIOA2LLLBBWCIT4/ Quoting Alex Sanderson
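The workaround discussed in the thread, sketched here with assumptions: the queue is a JSON list and the unparseable field is original_weight (as reported in the tracker); adapt to what the key actually contains.

    ceph config-key get mgr/cephadm/osd_remove_queue > queue.json
    # Strip the field the module cannot parse:
    jq 'map(del(.original_weight))' queue.json > queue_fixed.json
    ceph config-key set mgr/cephadm/osd_remove_queue -i queue_fixed.json
    ceph mgr fail        # restart the mgr so cephadm reloads the queue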

[ceph-users] Re: Bug with Cephadm module osd service preventing orchestrator start

2024-08-19 Thread Eugen Block
There's a tracker issue for this: https://tracker.ceph.com/issues/67329 Quoting Eugen Block: Hi, what is the output of this command? ceph config-key get mgr/cephadm/osd_remove_queue I just tried to cancel a drain on a small 18.2.4 test cluster; it went well, though. After scheduling

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Giovanna Ratini
Hello Eugen, yes, the load is not too high for now. I stopped snaptrim and this is the output now. No changes in the queue.

root@kube-master02:~# k ceph -s
Info: running 'ceph' command with args: [-s]
  cluster:
    id: 3a35629a-6129-4daf-9db6-36e0eda637c7
    health: HEALTH_WARN
      n

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Eugen Block
There's a lengthy thread [0] where several approaches are proposed. The worst is an OSD recreation, but that's the last resort, of course. What are the current values for these configs? ceph config get osd osd_pg_max_concurrent_snap_trims ceph config get osd osd_max_trimming_pgs Maybe decrea
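Both knobs can be tuned at runtime. A hedged example; the value 4 is arbitrary, and which direction to move them depends on whether the goal is faster trimming or lower OSD load:

    # Defaults are 2 for both; raising them trims faster at the cost of client I/O:
    ceph config set osd osd_pg_max_concurrent_snap_trims 4
    ceph config set osd osd_max_trimming_pgs 4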

[ceph-users] Re: squid release codename

2024-08-19 Thread Yehuda Sadeh-Weinraub
On Sat, Aug 17, 2024 at 9:12 AM Anthony D'Atri wrote: > > > It's going to wreak havoc on search engines that can't tell when > > someone's looking up Ceph versus the long-established Squid Proxy. > > Search engines are way smarter than that, and I daresay that people are far > more likely to search

[ceph-users] Re: squid release codename

2024-08-19 Thread Anthony D'Atri
> On Aug 19, 2024, at 9:45 AM, Yehuda Sadeh-Weinraub wrote: > > Originally I remember also suggesting "banana" (after bananaslug) [1], > imagine how much worse it could have been. Solidigm could have been Stodesic or Velostate ;)

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Giovanna Ratini
Hello Eugen,

root@kube-master02:~# k ceph config get osd osd_pg_max_concurrent_snap_trims
Info: running 'ceph' command with args: [config get osd osd_pg_max_concurrent_snap_trims]
2
root@kube-master02:~# k ceph config get osd osd_max_trimming_pgs
Info: running 'ceph' command with args: [config

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-19 Thread Adam King
https://tracker.ceph.com/issues/67583 didn't reproduce across 10 reruns: https://pulpito.ceph.com/lflores-2024-08-16_00:04:51-upgrade:quincy-x-squid-release-distro-default-smithi/. Given the original failure was just "Unable to find image 'quay.io/ceph/grafana:9.4.12' locally" which doesn't look ve
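A quick manual check on one of the affected test nodes, assuming podman is the container runtime there:

    # Verify the image is actually pullable from the registry:
    podman pull quay.io/ceph/grafana:9.4.12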

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-19 Thread Laura Flores
Thanks @Adam King! @Yuri Weinstein, the upgrade suites are approved. On Mon, Aug 19, 2024 at 9:28 AM Adam King wrote: > https://tracker.ceph.com/issues/67583 didn't reproduce across 10 reruns > https://pulpito.ceph.com/lflores-2024-08-16_00:04:51-upgrade:quincy-x-squid-release-distro-default-s

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-19 Thread Yuri Weinstein
We need approval from Guillaume. Laura, and gibba upgrade. On Mon, Aug 19, 2024 at 7:31 AM Laura Flores wrote: > Thanks @Adam King ! > > @Yuri Weinstein the upgrade suites are approved. > > On Mon, Aug 19, 2024 at 9:28 AM Adam King wrote: > >> https://tracker.ceph.com/issues/67583 didn't repro

[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-19 Thread Laura Flores
I can do the gibba upgrade after everything's approved. On Mon, Aug 19, 2024 at 9:47 AM Yuri Weinstein wrote: > We need approval from Guillaume > > Laura, and gibba upgrade. > > On Mon, Aug 19, 2024 at 7:31 AM Laura Flores wrote: > >> Thanks @Adam King ! >> >> @Yuri Weinstein the upgrade suite

[ceph-users] CLT meeting notes August 19th 2024

2024-08-19 Thread Adam King
- [travisn] Arm64 OSDs crashing on v18.2.4, need a fix in v18.2.5
  - https://tracker.ceph.com/issues/67213
  - tcmalloc issue, solved by rebuilding the gperftools package
  - Travis to reach out to Rongqi Sun about the issue
  - moving away from tcmalloc would probably cause perform

[ceph-users] Re: Prometheus and "404" error on console

2024-08-19 Thread Tim Holloway
Since I use keepalived, I can affirm with virtual certainty that keepalived could do stuff like that, although it may involve using a special IP address that keepalived would aim at the preferred server instance. But that's not the problem here, as "404" means that the server is up, but it sneers at

[ceph-users] Re: Bug with Cephadm module osd service preventing orchestrator start

2024-08-19 Thread Benjamin Huth
Thank you so much for the help! Thanks to the issue you linked and the other guy you replied to with the same issue, I was able to edit the config-key and get my orchestrator back. Sorry for not checking the issues as well as I should have, that's my bad there. On Mon, Aug 19, 2024 at 6:12 AM Euge

[ceph-users] Re: The snaptrim queue of PGs has not decreased for several days.

2024-08-19 Thread Giovanna Ratini
Hello Eugen, yesterday, after stopping and restarting snaptrim, the queue decreased a little and then remained stuck; it didn't grow and didn't shrink. Is that good or bad? On 19.08.2024 at 15:43, Eugen Block wrote: There's a lengthy thread [0] where several approaches are proposed. The worst is a
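One way to see whether the queue is moving at all (prefix with the same kubectl wrapper as above if running under Rook):

    # PGs actively trimming vs. waiting to trim:
    ceph pg ls snaptrim
    ceph pg ls snaptrim_wait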