[ceph-users] Re: 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Stefan Kooman
On 14-09-2023 03:27, Xiubo Li wrote: < - snip --> Hi Stefan, Yeah, as I recall I have seen something like this only once before, in the cephfs qa tests together with other issues, but I thought it wasn't the root cause so I didn't spend time on it. Just went through the k

[ceph-users] Re: Rebuilding data resiliency after adding new OSD's stuck for so long at 5%

2023-09-13 Thread Sake
Which version do you use? Quincy currently has incorrect values for its new IOPS scheduler; this will be fixed in the next release (hopefully soon). But there are workarounds, please check the mailing list about this; I'm in a hurry, so I can't point directly to the correct post. Best regards, Sake On
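
Sake doesn't name the exact post, so for reference only: the workaround usually discussed for the Quincy mClock defaults looks roughly like the sketch below. The option names are real Quincy settings, but the 350 IOPS figure is purely illustrative and not taken from this thread.

```
# Inspect the current mClock profile and the measured HDD IOPS capacity
ceph config get osd osd_mclock_profile
ceph config get osd osd_mclock_max_capacity_iops_hdd

# Favour recovery/backfill over client I/O while the cluster is rebuilding
ceph config set osd osd_mclock_profile high_recovery_ops

# If the auto-measured IOPS capacity looks unrealistic for the drives,
# override it (350 is only a placeholder value)
ceph config set osd osd_mclock_max_capacity_iops_hdd 350
```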

[ceph-users] Re: Rebuilding data resiliency after adding new OSD's stuck for so long at 5%

2023-09-13 Thread sharathvuthpala
Hi, We have HDD disks. Today, after almost 36 hours, Rebuilding Data Resiliency is at 58% and still going on. The good thing is it is not stuck at 5%. Does it take this long to complete the rebuilding resiliency process whenever there is maintenance in the cluster?

[ceph-users] Re: CEPH zero iops after upgrade to Reef and manual read balancer

2023-09-13 Thread Mosharaf Hossain
Hello Laura I have created the tracker and you can find it on https://tracker.ceph.com/issues/62836 Please find the OSD map as below on the cluster *root@cph2n1:/# ceph -s* cluster: id: e5f5ec6e-0b1b-11ec-adc5-35a84c0db1fb health: HEALTH_WARN 65 pgs not deep-scrubbed in t

[ceph-users] Re: 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Xiubo Li
On 9/13/23 20:58, Ilya Dryomov wrote: On Wed, Sep 13, 2023 at 9:20 AM Stefan Kooman wrote: Hi, Since the 6.5 kernel addressed the issue with regards to regression in the readahead handling code... we went ahead and installed this kernel for a couple of mail / web clusters (Ubuntu 6.5.1-060501

[ceph-users] Re: CEPH zero iops after upgrade to Reef and manual read balancer

2023-09-13 Thread Laura Flores
Link the tracker on this list if you have it. You can create one under the RADOS project: https://tracker.ceph.com/projects/rados Thanks, Laura On Wed, Sep 13, 2023 at 4:35 PM Laura Flores wrote: > Hi Mosharaf, > > Can you please create a tracker issue and attach a copy of your osdmap? > Also,

[ceph-users] Re: Awful new dashboard in Reef

2023-09-13 Thread Marc
Hmmm, I think I like this capacity card, much better than the one I am currently using ;) > > We have some screenshots in a blog post we did a while back: > https://ceph.io/en/news/blog/2023/landing-page/ > and also in the documentation: > https://docs.ceph.com/en/latest/mgr/dashboard/#overview-

[ceph-users] Re: CEPH zero iops after upgrade to Reef and manual read balancer

2023-09-13 Thread Laura Flores
Hi Mosharaf, Can you please create a tracker issue and attach a copy of your osdmap? Also, please include any other output that characterizes the slowdown in client I/O operations you're noticing in your cluster. I can take a look once I have that information. Thanks, Laura On Wed, Sep 13, 2023
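
One way to capture what Laura is asking for, as a sketch (the file names are arbitrary):

```
# Export the current osdmap and a plain-text decode of it for the tracker
ceph osd getmap -o osdmap.bin
osdmaptool osdmap.bin --print > osdmap.txt

# The textual osd dump is often useful alongside it
ceph osd dump > osd-dump.txt
```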

[ceph-users] Re: Rebuilding data resiliency after adding new OSD's stuck for so long at 5%

2023-09-13 Thread ceph
Hi, As long as you see changes and "recovery" it will make progress, so I guess you just have to wait... What kind of disks did you add? HTH, Mehmet On 12 September 2023 20:37:56 MESZ, sharathvuthp...@gmail.com wrote: >We have a user-provisioned instance (Bare Metal Installation) of OpenShift
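
For watching that progress from wherever the ceph CLI is available, a minimal sketch (generic status commands, not specific to this thread):

```
# Overall cluster state, including recovery/backfill rates and misplaced objects
ceph -s

# Compact PG-state summary with recovery throughput
ceph pg stat

# Re-run periodically to see whether the percentages keep moving
watch -n 10 ceph -s
```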

[ceph-users] Re: Ceph services failing to start after OS upgrade

2023-09-13 Thread Robert Sander
On 12.09.23 14:51, hansen.r...@live.com.au wrote: I have a ceph cluster running on my proxmox system and it all seemed to upgrade successfully however after the reboot my ceph-mon and my ceph-osd services are failing to start or are crashing by the looks of it. You should ask that question o

[ceph-users] Rebuilding data resiliency after adding new OSD's stuck for so long at 5%

2023-09-13 Thread sharathvuthpala
We have a user-provisioned instance (Bare Metal Installation) of an OpenShift cluster running on version 4.12, and we are using OpenShift Data Foundation as the Storage System. Earlier we had 3 disks attached to the storage system and 3 OSDs were available in the cluster. Today, while adding additio
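
Since this is ODF rather than a hand-rolled cluster, a sketch of how one might watch the underlying Ceph recovery, assuming the rook-ceph-tools toolbox deployment has been enabled in the openshift-storage namespace (names may differ per install):

```
# Open a shell in the toolbox pod (assumes the rook-ceph-tools deployment exists)
oc -n openshift-storage rsh deploy/rook-ceph-tools

# Inside the toolbox:
ceph -s          # recovery progress, misplaced/degraded object percentages
ceph osd tree    # confirm the newly added OSDs are up and in
```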

[ceph-users] Ceph services failing to start after OS upgrade

2023-09-13 Thread hansen . ross
Hi There, I have a ceph cluster running on my proxmox system and it all seemed to upgrade successfully; however, after the reboot my ceph-mon and ceph-osd services are failing to start, or are crashing by the looks of it. ``` ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy
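
A sketch of where to look first, assuming the usual systemd unit names on Proxmox (the mon id is normally the node name; osd.0 is a placeholder):

```
# Why is the monitor failing?
systemctl status ceph-mon@$(hostname).service
journalctl -b -u ceph-mon@$(hostname) --no-pager | tail -n 50

# Same for an OSD (replace 0 with a real OSD id on this node)
systemctl status ceph-osd@0.service
journalctl -b -u ceph-osd@0 --no-pager | tail -n 50
```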

[ceph-users] Re: 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Ilya Dryomov
On Wed, Sep 13, 2023 at 4:49 PM Stefan Kooman wrote: > > On 13-09-2023 14:58, Ilya Dryomov wrote: > > On Wed, Sep 13, 2023 at 9:20 AM Stefan Kooman wrote: > >> > >> Hi, > >> > >> Since the 6.5 kernel addressed the issue with regards to regression in > >> the readahead handling code... we went ahe

[ceph-users] Re: 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Stefan Kooman
On 13-09-2023 14:58, Ilya Dryomov wrote: On Wed, Sep 13, 2023 at 9:20 AM Stefan Kooman wrote: Hi, Since the 6.5 kernel addressed the issue with regards to regression in the readahead handling code... we went ahead and installed this kernel for a couple of mail / web clusters (Ubuntu 6.5.1-060

[ceph-users] Questions about PG auto-scaling and node addition

2023-09-13 Thread Christophe BAILLON
Hello, We have a cluster with 21 nodes, each having 12 x 18TB, and 2 NVMe for db/wal. We need to add more nodes. The last time we did this, the PGs remained at 1024, so the number of PGs per OSD decreased. Currently, we are at 43 PGs per OSD. Does auto-scaling work correctly in Ceph versio
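
A sketch of how one might check what the autoscaler intends to do before adding the new nodes (pool names and the 2048 figure are placeholders, not from this thread):

```
# What the autoscaler thinks each pool should have, and whether it is enabled
ceph osd pool autoscale-status
ceph osd pool get <pool-name> pg_autoscale_mode

# Enable autoscaling per pool, or raise pg_num manually if it stays off
ceph osd pool set <pool-name> pg_autoscale_mode on
ceph osd pool set <pool-name> pg_num 2048
```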

[ceph-users] Re: 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Ilya Dryomov
On Wed, Sep 13, 2023 at 9:20 AM Stefan Kooman wrote: > > Hi, > > Since the 6.5 kernel addressed the issue with regards to regression in > the readahead handling code... we went ahead and installed this kernel > for a couple of mail / web clusters (Ubuntu 6.5.1-060501-generic > #202309020842 SMP PR

[ceph-users] Re: Awful new dashboard in Reef

2023-09-13 Thread Nizamudeen A
Hey Marc, We have some screenshots in a blog post we did a while back: https://ceph.io/en/news/blog/2023/landing-page/ and also in the documentation: https://docs.ceph.com/en/latest/mgr/dashboard/#overview-of-the-dashboard-landing-page Regards, On Wed, Sep 13, 2023 at 5:59 PM Marc wrote: > Scr

[ceph-users] Re: Awful new dashboard in Reef

2023-09-13 Thread Marc
Screen captures please. Not everyone is installing the default ones. > > We are collecting these feedbacks. For a while we weren't focusing on the > mobile view > of the dashboard. If there are users using those, we'll look into it as > well. Will let everyone know > soon with the improvements in

[ceph-users] Re: cannot create new OSDs - ceph version 17.2.6 (810db68029296377607028a6c6da1ec06f5a2b27) quincy (stable)

2023-09-13 Thread Igor Fedotov
Hi Martin, it looks like you're using custom OSD settings. Namely: - bluestore_allocator set to bitmap (which is fine) - bluestore_min_alloc_size set to 128K The latter is apparently out of sync with bluefs_shared_alloc_size (set to 64K by default), which causes the assertion at some point du
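
For anyone hitting the same assert, a sketch of how one might compare the two values Igor mentions (osd.0 is a placeholder; note that bluestore_min_alloc_size only takes effect when an OSD is created, so the value baked into an existing OSD may differ from the current config and shows up in the OSD startup log):

```
# Values currently in the config database
ceph config get osd bluestore_min_alloc_size
ceph config get osd bluefs_shared_alloc_size

# Values the running daemon sees (run on the host where osd.0 lives)
ceph daemon osd.0 config get bluestore_min_alloc_size
ceph daemon osd.0 config get bluefs_shared_alloc_size
```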

[ceph-users] CEPH zero iops after upgrade to Reef and manual read balancer

2023-09-13 Thread Mosharaf Hossain
Hello Folks We've recently performed an upgrade on our Cephadm cluster, transitioning from Ceph Quincy to Reef. However, following the manual implementation of a read balancer in the Reef cluster, we've experienced a significant slowdown in client I/O operations within the Ceph cluster, affecting
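
If the read-balancer mappings are suspected, a sketch of how one might inspect and roll one back; this assumes the Reef pg-upmap-primary plumbing and a placeholder pgid of 1.0, and is not taken from this thread:

```
# List any primary re-mappings present in the osdmap
ceph osd dump | grep -i pg_upmap_primar

# Remove a single mapping to test whether the mappings relate to the slowdown
# (1.0 is a placeholder pgid)
ceph osd rm-pg-upmap-primary 1.0
```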

[ceph-users] Re: MDS crash after Disaster Recovery

2023-09-13 Thread Eugen Block
Hi, I would try to finish the upgrade first and bring all daemons to the same ceph version before trying any recovery. Was it a failed upgrade attempt? Can you please share 'ceph -s', 'ceph versions' and 'ceph orch upgrade status'? Quoting Sasha BALLET: Hi, I'm struggling with a C
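
For reference, the diagnostics Eugen is asking for:

```
# Cluster health, daemon versions, and the state of the cephadm upgrade
ceph -s
ceph versions
ceph orch upgrade status
```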

[ceph-users] 6.5 CephFS client - ceph_cap_reclaim_work [ceph] / ceph_con_workfn [libceph] hogged CPU

2023-09-13 Thread Stefan Kooman
Hi, Since the 6.5 kernel addressed the issue with regards to regression in the readahead handling code... we went ahead and installed this kernel for a couple of mail / web clusters (Ubuntu 6.5.1-060501-generic #202309020842 SMP PREEMPT_DYNAMIC Sat Sep 2 08:48:34 UTC 2023 x86_64 x86_64 x86_6
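
A sketch of how one might narrow down which client and session are involved; these are generic diagnostics (not Stefan's exact steps), assuming debugfs is available on the client and that mds.<name> stands in for an active MDS daemon:

```
# On an affected client: per-mount capability counts via debugfs
cat /sys/kernel/debug/ceph/*/caps

# Which kernel worker threads are burning CPU
ps -eLo pid,comm,pcpu --sort=-pcpu | head -n 15

# On the cluster side: how many caps each client session holds
ceph tell mds.<name> session ls
```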