[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-25 Thread Dhairya Parmar
Hi Ivan, This looks similar to the issue [0] that we're already addressing at [1]. Essentially, some out-of-sync event led the client to use inodes that the MDS wasn't aware of/isn't tracking, hence the crash. It'd be really helpful if you could provide us with more logs.
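For context, a minimal sketch of how verbose MDS logs are usually captured for a report like this, assuming a cephadm-managed cluster; the debug levels shown are common suggestions for MDS triage, not something specified in this thread:

    # Raise MDS logging before reproducing the crash (very chatty; revert afterwards)
    ceph config set mds debug_mds 20
    ceph config set mds debug_ms 1
    # ...reproduce the failover/crash, then collect the ceph-mds.*.log files
    # from /var/log/ceph/ on the MDS hosts before reverting:
    ceph config rm mds debug_mds
    ceph config rm mds debug_ms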

[ceph-users] Re: Incomplete PGs. Ceph Consultant Wanted

2024-06-25 Thread Frédéric Nass
Hello Wesley, I couldn't find any tracker related to this, and since min_size=1 has been involved in many critical situations with data loss, I created this one: https://tracker.ceph.com/issues/66641 Regards, Frédéric. - On 17 Jun 24, at 19:14, Wesley Dillingham w...@wesdillingham.com wrote
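As background, a quick sketch of how to audit pools for this setting and raise it; the pool name is a placeholder:

    # Show each pool's replication settings (look for min_size 1)
    ceph osd pool ls detail
    # A common baseline for 3-replica pools is min_size 2
    ceph osd pool set <pool-name> min_size 2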

[ceph-users] cephadm does not recreate OSD

2024-06-25 Thread Luis Domingues
Hello all. After a disk was replaced, we see that cephadm does not recreate the OSD. Going all the way back to the pvs command, I ended up at this issue: https://tracker.ceph.com/issues/62862 and this PR: https://github.com/ceph/ceph/pull/53500. The PR is unfortunately closed. Is this a non-bug? I tried to repl
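For anyone hitting this, a hedged sketch of the usual workaround, assuming the replacement disk still carries stale LVM metadata that makes cephadm consider it unavailable; host and device path are placeholders:

    # Check whether cephadm sees the new disk as available
    ceph orch device ls <host> --refresh
    # Wipe leftover LVM/partition state so the OSD service spec can reuse it
    ceph orch device zap <host> /dev/sdX --force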

[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-25 Thread Ivan Clayson
Hi Dhairya, Thank you for your rapid reply. I tried recovering the dentries for the file just before the crash I mentioned before and then splicing the transactions from the journal, which seemed to remove that issue for that inode but resulted in the MDS crashing on the next inode in the journal
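For context, a sketch of the recover-then-splice procedure being described, assuming rank 0 of the filesystem; always export the journal first, and treat the filesystem name and inode filter as placeholders:

    # Back up the journal before any destructive operation
    cephfs-journal-tool --rank=<fs_name>:0 journal export backup.bin
    # Write recoverable dentries from the journal back into the backing store
    cephfs-journal-tool --rank=<fs_name>:0 event recover_dentries summary
    # Erase the journal events that reference the problematic inode
    cephfs-journal-tool --rank=<fs_name>:0 event splice --inode=<ino> summary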

[ceph-users] Re: Lot of spams on the list

2024-06-25 Thread Alain Péan
On 24/06/2024 at 19:15, Anthony D'Atri wrote: * Subscription is now moderated * The three worst spammers (you know who they are) have been removed * I've deleted tens of thousands of crufty mail messages from the queue. The list should work normally now. Working on the backlog of held messages

[ceph-users] Re: CephFS MDS crashing during replay with standby MDSes crashing afterwards

2024-06-25 Thread Dhairya Parmar
On Tue, Jun 25, 2024 at 6:38 PM Ivan Clayson wrote: > Hi Dhairya, > > Thank you for your rapid reply. I tried recovering the dentries for the > file just before the crash I mentioned before and then splicing the > transactions from the journal which seemed to remove that issue for that > inode bu

[ceph-users] Re: ceph rgw zone create fails EINVAL

2024-06-25 Thread Matthew Vernon
On 24/06/2024 21:18, Matthew Vernon wrote: 2024-06-24T17:33:26.880065+00:00 moss-be2001 ceph-mgr[129346]: [rgw ERROR root] Non-zero return from ['radosgw-admin', '-k', '/var/lib/ceph/mgr/ceph-moss-be2001.qvwcaq/keyring', '-n', 'mgr.moss-be2001.qvwcaq', 'realm', 'pull', '--url', 'https://apus.
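For debugging, a sketch of re-running the failing step by hand to surface radosgw-admin's full error output, using the same keyring and mgr identity shown in the log line; the realm name, endpoint, and credentials are placeholders:

    radosgw-admin -k /var/lib/ceph/mgr/ceph-moss-be2001.qvwcaq/keyring \
      -n mgr.moss-be2001.qvwcaq \
      realm pull --url=https://<master-zone-endpoint> \
      --access-key=<system-user-key> --secret=<system-user-secret>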

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-25 Thread Dietmar Rieder
Hi Patrick, Xiubo and List, finally we managed to get the filesystem repaired and running again! YEAH, I'm so happy!! Big thanks for your support, Patrick and Xiubo! (Would love to invite you for a beer!) Please see some comments and (important?) questions below: On 6/25/24 03:14, Patrick Do

[ceph-users] Re: Ceph Leadership Team Weekly Minutes 2024-06-17

2024-06-25 Thread Satoru Takeuchi
On Tue, Jun 18, 2024 at 5:49, Laura Flores wrote: > > Need to update the OS Recommendations doc to represent the latest supported > distros > - https://docs.ceph.com/en/latest/start/os-recommendations/#platforms > - PR from Zac to be reviewed by the CLT: https://github.com/ceph/ceph/pull/58092 > > arm64 CI check ready to be

[ceph-users] Re: [EXTERN] Re: Urgent help with degraded filesystem needed

2024-06-25 Thread Dietmar Rieder
...sending this also to the list and Xiubo (they were accidentally removed from the recipients)... On 6/25/24 21:28, Dietmar Rieder wrote: Hi Patrick, Xiubo and List, finally we managed to get the filesystem repaired and running again! YEAH, I'm so happy!! Big thanks for your support Patrick and Xiubo!

[ceph-users] OSD service specs in mixed environment

2024-06-25 Thread Torkil Svensgaard
Hi We have a bunch of HDD OSD hosts with DB/WAL on PCI NVMe, either 2 x 3.2TB or 1 x 6.4TB. We used to have 4 SSDs per node for journals before bluestore, and those have been repurposed for an SSD pool (wear level is fine). We've been using the following service specs to avoid the PCI NVMe de
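The spec itself is cut off in this digest; as illustration only, a hedged sketch of an OSD service spec that pairs rotational data devices with the large NVMe DB devices while leaving the smaller SSDs alone. The service_id, host_pattern, and size threshold are assumptions, not the spec from this thread:

    cat > osd-spec.yaml <<'EOF'
    service_type: osd
    service_id: hdd-db-on-nvme
    placement:
      host_pattern: '*'
    spec:
      data_devices:
        rotational: 1        # HDDs only
      db_devices:
        size: '3TB:'         # only devices >= 3TB, i.e. the 3.2TB/6.4TB NVMe
    EOF
    ceph orch apply -i osd-spec.yaml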