[ceph-users] Re: MDS crashing on startup

2025-01-15 Thread Frank Schilder
Hi Dan, we finally managed to get everything up and collect debug info. It's ceph-posted since all files exceeded the limit for attachments. A quick overview of the most important findings is here: https://imgur.com/a/RF7ExSP. Please note that I started a new thread to reduce clutter: "MDS hung

[ceph-users] Re: MDS crashing on startup

2025-01-14 Thread Eugen Block
Tuesday, January 14, 2025 9:11 PM To: Dan van der Ster Cc: ceph-users@ceph.io Subject: [ceph-users] Re: MDS crashing on startup Hi Dan, celebrating too early. Applying our tuned profile results in: # sudo -u ceph ulimit unlimited # sysctl fs.file-max fs.file-max = 26234859 Still, the MDS aborts

[ceph-users] Re: MDS crashing on startup

2025-01-14 Thread Frank Schilder
pus Bygning 109, rum S14 From: Frank Schilder Sent: Tuesday, January 14, 2025 9:11 PM To: Dan van der Ster Cc: ceph-users@ceph.io Subject: [ceph-users] Re: MDS crashing on startup Hi Dan, celebrating too early. Applying our tuned profile results in: # sudo

[ceph-users] Re: MDS crashing on startup

2025-01-14 Thread Frank Schilder
Hi Dan, celebrating too early. Applying our tuned profile results in: # sudo -u ceph ulimit unlimited # sysctl fs.file-max fs.file-max = 26234859 Still, the MDS aborts in exactly the same way: -88> 2025-01-14T14:57:54.511-0500 7f8a88613700 0 log_channel(cluster) log [DBG] : reconnect by cl

[ceph-users] Re: MDS crashing on startup

2025-01-14 Thread Frank Schilder
Hi Dan, thanks a ton! Now I feel really stupid. I'm "a bit" under stress, so I forgot our ceph tuned profile. Thanks for reminding me and even more for providing such pointers even though I should know better on my own. Why is the message about " open file descriptions limit reached sd = " not

[ceph-users] Re: MDS crashing on startup

2025-01-14 Thread Dan van der Ster
Hi Frank, That abort looks like this: } else if (r == -EMFILE || r == -ENFILE) { lderr(msgr->cct) << __func__ << " open file descriptions limit reached sd = " << listen_socket.fd() << " errno " << r << " " << cpp_strerror(r) << dendl; if (++a
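
For readers following the thread: the branch quoted above fires on two different errno values. EMFILE means the per-process descriptor limit (RLIMIT_NOFILE) was hit, while ENFILE means the system-wide limit (fs.file-max on Linux) was hit. The sketch below is not taken from Ceph or from this thread; the helper names are hypothetical. It shows one way to tell the two cases apart and to read the open-files limit that applies to an already running process from the text of /proc/<pid>/limits, which can differ from what a login shell reports (assumption: a Linux host).

```python
import errno
import resource


def classify_fd_error(err: int) -> str:
    """Map an errno value to the kind of descriptor limit it indicates."""
    if err == errno.EMFILE:
        return "per-process limit (RLIMIT_NOFILE) reached"
    if err == errno.ENFILE:
        return "system-wide limit (fs.file-max on Linux) reached"
    return "not a file-descriptor limit error"


def own_nofile_limits():
    """Return (soft, hard) RLIMIT_NOFILE of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def parse_proc_limits(text: str):
    """Extract (soft, hard) 'Max open files' from /proc/<pid>/limits text.

    Values are returned as strings because the kernel may print 'unlimited'.
    Returns None if the line is not present.
    """
    for line in text.splitlines():
        if line.startswith("Max open files"):
            fields = line.split()
            # fields: ['Max', 'open', 'files', <soft>, <hard>, 'files']
            return fields[3], fields[4]
    return None


if __name__ == "__main__":
    # Limits of this Python process, not of any Ceph daemon.
    print(own_nofile_limits())
    # Hypothetical usage: replace "self" with the PID of the daemon to inspect.
    with open("/proc/self/limits") as f:
        print(parse_proc_limits(f.read()))
```

Reading /proc/<pid>/limits for the daemon's PID checks the limit the process actually runs with, which a service manager may set independently of the shell's ulimit.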