[ceph-users] Re: ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-06-09 Thread Emmanuel Jaep
if any (upper level) directory returns something with > for the getfattr commands. Or maybe someone documented using setfattr > for your cephfs, maybe in the command history? > > [1] > > https://docs.ceph.com/en/quincy/cephfs/multimds/#setting-subtree-partitioning-policies >

[ceph-users] Pacific - mds How to know how many sequences still to be replayed

2023-05-29 Thread Emmanuel Jaep
Hi, I just restarted one of our mds servers. I can find some "progress" in logs as below: mds.beacon.icadmin006 Sending beacon up:replay seq 461 mds.beacon.icadmin006 received beacon reply up:replay seq 461 rtt 0 How can I know how long the sequence is (i.e., when the node will have finished replaying)?
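
The two beacon lines quoted in this message can be picked apart with a small parser. What follows is only a minimal sketch, assuming the log lines have exactly the shape shown in the snippet; the name `parse_beacon` and the regular expression are hypothetical and not part of Ceph. It extracts what the log prints (daemon name, state, beacon seq, rtt) and makes no claim about how much of the journal remains to be replayed.

```python
import re

# Assumed line shape, taken from the two log lines quoted in the message:
#   mds.beacon.<name> Sending beacon <state> seq <n>
#   mds.beacon.<name> received beacon reply <state> seq <n> rtt <t>
BEACON_RE = re.compile(
    r"mds\.beacon\.(?P<name>\S+)\s+"
    r"(?P<kind>Sending beacon|received beacon reply)\s+"
    r"(?P<state>up:[\w-]+)\s+seq\s+(?P<seq>\d+)"
    r"(?:\s+rtt\s+(?P<rtt>[\d.]+))?"
)


def parse_beacon(line):
    """Return the fields of one beacon log line, or None if it does not match."""
    m = BEACON_RE.search(line)
    if m is None:
        return None
    rtt = m.group("rtt")
    return {
        "name": m.group("name"),
        "kind": m.group("kind"),
        "state": m.group("state"),
        "seq": int(m.group("seq")),
        # rtt is only present on the reply line
        "rtt": float(rtt) if rtt is not None else None,
    }


sent = parse_beacon("mds.beacon.icadmin006 Sending beacon up:replay seq 461")
reply = parse_beacon(
    "mds.beacon.icadmin006 received beacon reply up:replay seq 461 rtt 0"
)
```

Watching the `state` field change away from `up:replay` across successive lines is one way to notice that the daemon has moved on; the `seq` value is reported as printed, without interpretation.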

[ceph-users] Pacific - MDS behind on trimming

2023-05-26 Thread Emmanuel Jaep
Hi, lately, we have had some issues with our MDSs (Ceph version 16.2.10 Pacific). Some of them are related to an MDS being behind on trimming. I checked the documentation and found the following information (https://docs.ceph.com/en/pacific/cephfs/health-messages/): > CephFS maintains a metadata j

[ceph-users] Re: Troubleshooting "N slow requests are blocked > 30 secs" on Pacific

2023-05-26 Thread Emmanuel Jaep
> > On Wed, May 24, 2023 at 5:10 PM Emmanuel Jaep > wrote: > >> Absolutely! :-) >> >> root@icadmin011:/tmp# ceph --cluster floki daemon mds.icadmin011 dump >> cache /tmp/dump.txt >> root@icadmin011:/tmp# ll >> total 48 >> drwxrwx

[ceph-users] Re: ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-05-25 Thread Emmanuel Jaep
es down all its daemons will also > be unavailable. But we used this feature in an older version > (customized Nautilus) quite successfully in a customer cluster. > There are many things to consider here, just wanted to share a couple > of thoughts. > > Regards,

[ceph-users] Re: ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-05-25 Thread Emmanuel Jaep
> be unavailable. But we used this feature in an older version > > (customized Nautilus) quite successfully in a customer cluster. > > There are many things to consider here, just wanted to share a couple > > of thoughts. > > > > Regards, > > Eugen > > >

[ceph-users] Re: ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-05-24 Thread Emmanuel Jaep
So I guess, I'll end up doing: ceph fs set cephfs max_mds 4 ceph fs set cephfs allow_standby_replay true On Wed, May 24, 2023 at 4:13 PM Hector Martin wrote: > Hi, > > On 24/05/2023 22.02, Emmanuel Jaep wrote: > > Hi Hector, > > > > thank you very much for the

[ceph-users] Re: ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-05-24 Thread Emmanuel Jaep
more than twice the data 2. the load will be more than twice as high Am I correct? Emmanuel On Wed, May 24, 2023 at 2:31 PM Hector Martin wrote: > On 24/05/2023 21.15, Emmanuel Jaep wrote: > > Hi, > > > > we are currently running a ceph fs cluster at the following

[ceph-users] ceph Pacific - MDS activity freezes when one the MDSs is restarted

2023-05-24 Thread Emmanuel Jaep
Hi, we are currently running a ceph fs cluster at the following version: MDS version: ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) The cluster is composed of 7 active MDSs and 1 standby MDS: RANK STATE MDS ACTIVITY DNSINOS DIRS CAPS 0

[ceph-users] Training on ceph fs

2023-05-24 Thread Emmanuel Jaep
Hi, I inherited a ceph fs cluster. Even though I have years of experience in systems management, I fail to fully grasp its logic. From what I found on the web, the documentation is either too "high level" or too detailed. Do you know any good resources to get fully acquainted with cep

[ceph-users] Re: Troubleshooting "N slow requests are blocked > 30 secs" on Pacific

2023-05-24 Thread Emmanuel Jaep
same machine that you > are looking for /tmp/dump.txt, since the file is created on the system > which has that daemon running. > > > On Wed, May 24, 2023 at 2:16 PM Emmanuel Jaep > wrote: > >> Hi Milind, >> >> you are absolutely right. >> >> The

[ceph-users] Re: Troubleshooting "N slow requests are blocked > 30 secs" on Pacific

2023-05-24 Thread Emmanuel Jaep
c2-systemd-resolved.service-KYHd7f systemd-private-18c17b770fc24c48a0507b8faa1c0ec2-systemd-timesyncd.service-1Qtj5i Do you have any hint? Best, Emmanuel On Wed, May 24, 2023 at 10:30 AM Milind Changire wrote: > Emmanuel, > You probably missed the "daemon"

[ceph-users] Troubleshooting "N slow requests are blocked > 30 secs" on Pacific

2023-05-24 Thread Emmanuel Jaep
Hi, we are running a cephfs cluster with the following version: ceph version 16.2.10 (45fa1a083152e41a408d15505f594ec5f1b4fe17) pacific (stable) Several MDSs are reporting slow requests: HEALTH_WARN 4 MDSs report slow requests [WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests mds.icadmin011
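
The health output quoted in this message follows a `[SEVERITY] CODE: summary` pattern, which makes it easy to pull the warning codes out of a saved report. This is only a minimal sketch, assuming the text looks like the two lines visible in the snippet; `parse_health_checks` and the sample string are hypothetical, and the per-daemon detail lines that follow in the real output are ignored here.

```python
import re

# Assumed shape of a health-check line, based on the line quoted in the message:
#   [WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests
HEALTH_CHECK_RE = re.compile(
    r"^\[(?P<severity>[A-Z]+)\]\s+(?P<code>[A-Z0-9_]+):\s+(?P<summary>.+)$",
    re.MULTILINE,
)


def parse_health_checks(text):
    """Return (severity, code, summary) for every health-check line in text."""
    return [
        (m.group("severity"), m.group("code"), m.group("summary").strip())
        for m in HEALTH_CHECK_RE.finditer(text)
    ]


# Sample rebuilt from the first two lines of the quoted output only.
sample = (
    "HEALTH_WARN 4 MDSs report slow requests\n"
    "[WRN] MDS_SLOW_REQUEST: 4 MDSs report slow requests\n"
)
checks = parse_health_checks(sample)
```

Keeping the result as plain tuples makes it straightforward to filter on a code such as `MDS_SLOW_REQUEST` when comparing reports taken at different times.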

[ceph-users] Re: Unable to restart mds - mds crashes almost immediately after finishing recovery

2023-05-05 Thread Emmanuel Jaep
e MDS daemons ? > > Thanks > > On 5/3/23 16:01, Emmanuel Jaep wrote: > > Hi, > > > > I just inherited a ceph storage. Therefore, my level of confidence with > the tool is certainly less than ideal. > > > > We currently have an mds server that refuses to come back

[ceph-users] Re: MDS crash on FAILED ceph_assert(cur->is_auth())

2023-05-03 Thread Emmanuel Jaep
Hi, did you finally figure out what happened? I do have the same behavior and we can't get the mds to start again... Thanks, Emmanuel

[ceph-users] Unable to restart mds - mds crashes almost immediately after finishing recovery

2023-05-03 Thread Emmanuel Jaep
Hi, I just inherited a ceph storage. Therefore, my level of confidence with the tool is certainly less than ideal. We currently have an mds server that refuses to come back online. While reviewing the logs, I can see that, upon mds start, the recovery goes well: ``` -10> 2023-05-03T08:12:43.