Hi all,

We have seen this several times: some of our MDS daemons stop reporting stats
for no obvious reason. A rolling restart of the affected MDS daemons resolves
it, but restarting an active MDS can cause up to several minutes of downtime,
so we don’t want to do this routinely.

The client count and MDS version info are also missing from “ceph fs status”
and the web dashboard, and the Prometheus metrics are affected too. However,
“ceph tell mds.cephfs.gpu018.ovxvoz session ls” reports the correct client
sessions.

Also, the new "cephfs-top" does not work for us; it only shows a lot of N/A. I
don't know whether that is related.

Apart from these, the actual metadata operations seem to work fine.

How can I identify the root cause? Is this a known bug?

Thanks,
Weiwen Hu

$ ceph fs status
cephfs - 0 clients
======
RANK      STATE               MDS              ACTIVITY     DNS    INOS   DIRS   CAPS
 0        active      cephfs.gpu018.ovxvoz  Reqs:    0 /s     0      0      0      0
 1        active      cephfs.gpu006.ddpekw  Reqs:    0 /s     0      0      0      0
1-s   standby-replay  cephfs.gpu023.aetiph  Evts:    0 /s     0      0      0      0
0-s   standby-replay  cephfs.gpu024.rpfbnh  Evts:   69 /s  2242k  2242k  11.5k     0
          POOL              TYPE     USED  AVAIL
   cephfs.cephfs.meta     metadata   127G   523G
   cephfs.cephfs.data       data    74.6T  15.8T
 cephfs.cephfs.data_ssd     data       0    785G
cephfs.cephfs.data_mixed    data    8768G   523G
VERSION                                                                          DAEMONS
None                                                                             cephfs.gpu018.ovxvoz, cephfs.gpu006.ddpekw, cephfs.gpu023.aetiph
ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)  cephfs.gpu024.rpfbnh

Note the many zeros, and that three of the four MDS daemons are missing
version info.
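
To spot the affected daemons without reading the table by eye, something like
the following might help. It is only a rough sketch: it assumes the status can
be obtained as JSON (for example via "ceph fs status --format json") and that
the JSON has an "mdsmap" list whose entries carry "name", "dns", "inos",
"dirs" and "caps" fields. Those field names are assumptions and should be
checked against the real output. The sample data is hand-written to resemble
the table above, not captured from the cluster.

```python
import json

# Assumed field names; verify them against your own JSON output.
COUNTERS = ("dns", "inos", "dirs", "caps")


def silent_mds(status):
    """Return names of MDS daemons whose counters are all zero."""
    silent = []
    for mds in status.get("mdsmap", []):
        if all(mds.get(key, 0) == 0 for key in COUNTERS):
            silent.append(mds.get("name", "<unknown>"))
    return silent


# Hand-written sample that approximates the table above (hypothetical layout).
sample = json.loads("""
{
  "mdsmap": [
    {"rank": 0, "state": "active", "name": "cephfs.gpu018.ovxvoz",
     "dns": 0, "inos": 0, "dirs": 0, "caps": 0},
    {"rank": 1, "state": "active", "name": "cephfs.gpu006.ddpekw",
     "dns": 0, "inos": 0, "dirs": 0, "caps": 0},
    {"rank": 1, "state": "standby-replay", "name": "cephfs.gpu023.aetiph",
     "dns": 0, "inos": 0, "dirs": 0, "caps": 0},
    {"rank": 0, "state": "standby-replay", "name": "cephfs.gpu024.rpfbnh",
     "dns": 2242000, "inos": 2242000, "dirs": 11500, "caps": 0}
  ]
}
""")

for name in silent_mds(sample):
    print(name)
```

An MDS that is genuinely idle could also show all zeros, so a hit here is only
a hint and should be cross-checked, for example against "session ls" on that
daemon.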
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
