[ceph-users] Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
Hello cephers, we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to get it up again. We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby). It started during the night, I got the first HEALTH_WARN emails saying: HEALTH_WARN --- New --- [WARN] MDS_CLIENT_RECA
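[A minimal triage sketch for this kind of situation, assuming the filesystem name cephfs used later in the thread; the MDS daemon name in the last command is a placeholder, not taken from the message:

  # overall cluster state and the detailed health warnings
  ceph -s
  ceph health detail
  # which MDS ranks are active, replaying or failed, and which standbys exist
  ceph fs status cephfs
  # any recorded metadata damage on a specific (placeholder) MDS daemon
  ceph tell mds.cephfs.node01.xyzabc damage ls
]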

[ceph-users] Re: How to change default osd reweight from 1.0 to 0.5

2024-06-19 Thread Sinan Polat
Are the weights correctly set? So 1.6 for a 1.6TB disk, 1.0 for a 1TB disk, and so on. > On 19 Jun 2024 at 08:32, Jan Marquardt wrote the following: > > >> Our ceph cluster uses 260 osds. >> The highest osd usage is 87%, but the lowest is under 40%. >> We consider low
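[For reference, a short sketch of checking and adjusting CRUSH weights so they track raw capacity; osd.42 and the 1.6 value are illustrative placeholders, not taken from the thread:

  # per-OSD CRUSH weight, override reweight and utilization
  ceph osd df tree
  # set the CRUSH weight of an example OSD to match a 1.6 TB device
  ceph osd crush reweight osd.42 1.6
]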

[ceph-users] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Xiubo Li
Hi Dietmar, On 6/19/24 15:43, Dietmar Rieder wrote: Hello cephers, we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to get it up again. We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby). It started during the night, I got the first HEALTH_WARN emails sa

[ceph-users] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Joachim Kraftmayer
Hi Dietmar, have you already blocked all cephfs clients? Joachim *Joachim Kraftmayer* CEO | p: +49 89 2152527-21 | e: joachim.kraftma...@clyso.com a: Loristr. 8 | 80335 Munich | Germany | w: https://clyso.com | Utting a. A. | HR: Augsburg | HRB 25866 | USt. ID: DE275430677 On Wed, 19 June 20

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
Hi Joachim, I suppose that setting the filesystem down will block all clients: ceph fs set cephfs down true, right? Dietmar On 6/19/24 10:02, Joachim Kraftmayer wrote: Hi Dietmar, have you already blocked all cephfs clients? Joachim *Joachim Kraftmayer* CEO | p: +49 89 2152527-21 | e: joac
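[For context, a short sketch of taking the filesystem down and verifying it, assuming the filesystem name cephfs from the thread:

  # mark the filesystem down: MDS ranks are failed, so no client I/O is served
  ceph fs set cephfs down true
  # confirm the state
  ceph fs status cephfs
  # bring it back once recovery is done
  ceph fs set cephfs down false
]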

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
I did that after I saw that it seems to be more severe; see my action "log" below. Dietmar On 6/19/24 10:05, Dietmar Rieder wrote: Hi Joachim, I suppose that setting the filesystem down will block all clients: ceph fs set cephfs down true, right? Dietmar On 6/19/24 10:02, Joachi

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
Hi Xiubo, On 6/19/24 09:55, Xiubo Li wrote: Hi Dietmar, On 6/19/24 15:43, Dietmar Rieder wrote: Hello cephers, we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to get it up again. We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby). It started th

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Xiubo Li
On 6/19/24 16:13, Dietmar Rieder wrote: Hi Xiubo, [...] 0> 2024-06-19T07:12:39.236+ 7f90fa912700 -1 *** Caught signal (Aborted) **  in thread 7f90fa912700 thread_name:md_log_replay  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2) reef (stable)  1: /lib64/libpthre

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
On 6/19/24 10:30, Xiubo Li wrote: On 6/19/24 16:13, Dietmar Rieder wrote: Hi Xiubo, [...] 0> 2024-06-19T07:12:39.236+ 7f90fa912700 -1 *** Caught signal (Aborted) **  in thread 7f90fa912700 thread_name:md_log_replay  ceph version 18.2.2 (531c0d11a1c5d39fbfe6aa8a521f023abf3bf3e2)

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Stefan Kooman
Hi, On 19-06-2024 11:15, Dietmar Rieder wrote: Please follow https://docs.ceph.com/en/nautilus/cephfs/disaster-recovery-experts/#disaster-recovery-experts. OK, when I run the cephfs-journal-tool I get an error: # cephfs-journal-tool journal export backup.bin Error ((22) Invalid argument) My
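[The reply is cut off in the preview, so this is an assumption about the likely cause rather than Stefan's actual answer: with multiple filesystems or multiple active MDS ranks, cephfs-journal-tool needs the rank spelled out explicitly. A sketch, assuming the filesystem name cephfs and rank 0:

  # export the metadata journal of rank 0 of the cephfs filesystem
  cephfs-journal-tool --rank=cephfs:0 journal export backup.bin
  # inspect it before attempting any recovery steps
  cephfs-journal-tool --rank=cephfs:0 journal inspect
]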

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
Hi, On 6/19/24 12:14, Stefan Kooman wrote: Hi, On 19-06-2024 11:15, Dietmar Rieder wrote: Please follow https://docs.ceph.com/en/nautilus/cephfs/disaster-recovery-experts/#disaster-recovery-experts. OK, when I run the cephfs-journal-tool I get an error: # cephfs-journal-tool journal export

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
On 6/19/24 11:15, Dietmar Rieder wrote: On 6/19/24 10:30, Xiubo Li wrote: On 6/19/24 16:13, Dietmar Rieder wrote: Hi Xiubo, [...] 0> 2024-06-19T07:12:39.236+ 7f90fa912700 -1 *** Caught signal (Aborted) **  in thread 7f90fa912700 thread_name:md_log_replay  ceph version 18.2.2 (5

[ceph-users] multisite sync policy in reef 18.2.2

2024-06-19 Thread Christopher Durham
hi, I want to know whether the multisite sync policy can handle the following scenario: 1. bidirectional/symmetric replication of all buckets with names beginning 'devbuckets-*' 2. replication of all prefixes in those buckets, EXCEPT, say, 'tmp/' The documentation is not clear on this and I want
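[As a point of reference, not an answer to the wildcard/exclusion question itself, a hedged sketch of how a symmetric sync policy with a prefix filter is normally expressed; the group/pipe IDs and the bucket name devbuckets-example are placeholders, the prefix filter shown is include-style, and whether wildcard bucket names or prefix exclusion are honoured is exactly the open question:

  # zonegroup-level policy: allow replication, mirror symmetrically between all zones
  radosgw-admin sync group create --group-id=group1 --status=allowed
  radosgw-admin sync group flow create --group-id=group1 --flow-id=flow-mirror \
      --flow-type=symmetrical --zones='*'
  radosgw-admin sync group pipe create --group-id=group1 --pipe-id=pipe1 \
      --source-zones='*' --source-bucket='*' --dest-zones='*' --dest-bucket='*'
  # bucket-level policy: enable sync for one bucket, restricted to a prefix
  radosgw-admin sync group create --bucket=devbuckets-example \
      --group-id=devbuckets-group --status=enabled
  radosgw-admin sync group pipe create --bucket=devbuckets-example \
      --group-id=devbuckets-group --pipe-id=pipe1 \
      --source-zones='*' --dest-zones='*' --prefix='data/'
]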

[ceph-users] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Patrick Donnelly
Hi Dietmar, On Wed, Jun 19, 2024 at 3:44 AM Dietmar Rieder wrote: > > Hello cephers, > > we have a degraded filesystem on our ceph 18.2.2 cluster and I'd need to > get it up again. > > We have 6 MDS daemons (3 active, each pinned to a subtree, 3 standby) > > It started during the night, I got the f

[ceph-users] Re: How to change default osd reweight from 1.0 to 0.5

2024-06-19 Thread Anthony D'Atri
I’ve thought about this strategy in the past. I think you might enter a cron job to reset any OSDs at 1.0 to 0.5, but really the balancer module or JJ balancer is a better idea than old-style reweight. > On Jun 19, 2024, at 2:22 AM, 서민우 wrote: > > Hello~ > > Our ceph cluster uses 260 osd
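[A rough, untested sketch of what such a cron job could look like; it assumes jq is available and simply applies an override reweight of 0.5 to every OSD still at 1.0. As noted above, the balancer module is almost certainly the better option:

  #!/bin/sh
  # find OSDs whose override reweight is still 1.0 and drop them to 0.5
  ceph osd df -f json | jq -r '.nodes[] | select(.reweight == 1) | .id' |
  while read -r id; do
      ceph osd reweight "$id" 0.5
  done
  # the built-in balancer is usually preferable to manual reweights:
  # ceph balancer mode upmap && ceph balancer on
]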

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-19 Thread Dietmar Rieder
Hi Patrick, thanks for your message, see my comments below. (BTW it seems that there is an issue with the ceph mailing list, my previous message did not go through yet, so this may be redundant) On 6/19/24 17:27, Patrick Donnelly wrote: Hi Dietmar, On Wed, Jun 19, 2024 at 3:44 AM Dietmar Ried

[ceph-users] ceph rgw zone create fails EINVAL

2024-06-19 Thread Matthew Vernon
Hi, I'm running cephadm/reef 18.2.2. I'm trying to set up multisite. I created realm/zonegroup/master zone OK (I think!), edited the zonegroup json to include hostnames. I have this spec file for the secondary zone: rgw_zone: codfw rgw_realm_token: "SECRET" placement: label: "rgw" [I get
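[Laid out as a file rather than flattened into one line, the quoted spec would presumably look like the sketch below (field names copied from the message, token elided as in the original), fed to the rgw manager module's zone create entry point:

  # zone-spec.yaml
  rgw_zone: codfw
  rgw_realm_token: "SECRET"
  placement:
    label: "rgw"

  # then, on the secondary site:
  ceph rgw zone create -i zone-spec.yaml
]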

[ceph-users] Re: ceph rgw zone create fails EINVAL

2024-06-19 Thread Adam King
I think this is at least partially a code bug in the rgw module. Where it's actually failing in the traceback is generating the return message for the user at the end, because it assumes `created_zones` will always be a list of strings and that seems to not be the case in any error scenario. That c

[ceph-users] Re: Monitoring

2024-06-19 Thread adam.ther
Hello, On this topic, I was trying to use Zabbix for alerting. Is there a way to make the API Key used in the dashboard not expire after a period? Regards, Adam On 6/18/24 09:12, Anthony D'Atri wrote: I don't, I have the fleetwide monitoring / observability systems query ceph_exporter and
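[If the key in question is the dashboard's REST API token (an assumption; the preview doesn't say), the dashboard module exposes a TTL setting for its JWT tokens that could be raised rather than made non-expiring, e.g.:

  # raise the dashboard JWT token lifetime (value in seconds; 86400 is one day)
  ceph dashboard set-jwt-token-ttl 86400
]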