Hi John,

"ceph mds metadata mds1" produced "Error ENOENT:", while querying the metadata of mds2 and mds3 worked as expected. It seems that only the active MDS cannot be queried by the Ceph MGR.
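For reference, this is roughly how I plan to double-check the mgr's auth caps per your suggestion below. Just a sketch: I'm assuming the active mgr's key is named mgr.mon1 after the host it runs on, and that the second command matches the usual default mgr caps.

```
# dump the caps currently assigned to the mgr key (key name assumed to be mgr.mon1)
ceph auth get mgr.mon1

# if the mon cap is not "allow profile mgr", reset the caps to the usual mgr defaults
ceph auth caps mgr.mon1 mon 'allow profile mgr' osd 'allow *' mds 'allow *'
```

If the caps already look correct, then I suppose it is not an auth issue on this cluster.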
I also misspoke earlier: the spam is not going to syslog, it is in the ceph-mgr log itself. Sorry for the confusion.

# ceph -s
  cluster:
    id:     b63f4ca1-f5e1-4ac1-a6fc-5ab70c65864a
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3
    mgr: mon1(active), standbys: mon2, mon3
    mds: cephfs-1/1/1 up {0=mds1=up:active}, 2 up:standby
    osd: 14 osds: 14 up, 14 in
    rgw: 3 daemons active

  data:
    pools:   10 pools, 248 pgs
    objects: 583k objects, 2265 GB
    usage:   6816 GB used, 6223 GB / 13039 GB avail
    pgs:     247 active+clean
             1   active+clean+scrubbing+deep

  io:
    client: 115 kB/s rd, 759 kB/s wr, 22 op/s rd, 24 op/s wr

# ceph mds metadata mds1
Error ENOENT:

# ceph mds metadata mds2
{
    "addr": "10.100.100.115:6800/1861195236",
    "arch": "x86_64",
    "ceph_version": "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)",
    "cpu": "Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz",
    "distro": "ubuntu",
    "distro_description": "Ubuntu 16.04.4 LTS",
    "distro_version": "16.04",
    "hostname": "mds2",
    "kernel_description": "#1 SMP PVE 4.13.13-40 (Fri, 16 Feb 2018 09:51:20 +0100)",
    "kernel_version": "4.13.13-6-pve",
    "mem_swap_kb": "2048000",
    "mem_total_kb": "2048000",
    "os": "Linux"
}

# ceph mds metadata mds3
{
    "addr": "10.100.100.116:6800/4180418633",
    "arch": "x86_64",
    "ceph_version": "ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)",
    "cpu": "Intel(R) Xeon(R) CPU E31240 @ 3.30GHz",
    "distro": "ubuntu",
    "distro_description": "Ubuntu 16.04.4 LTS",
    "distro_version": "16.04",
    "hostname": "mds3",
    "kernel_description": "#1 SMP PVE 4.13.16-47 (Mon, 9 Apr 2018 09:58:12 +0200)",
    "kernel_version": "4.13.16-2-pve",
    "mem_swap_kb": "4096000",
    "mem_total_kb": "2048000",
    "os": "Linux"
}

Kind regards,

Charles Alva
Sent from Gmail Mobile


On Tue, Apr 24, 2018 at 4:29 PM, John Spray <jsp...@redhat.com> wrote:
> On Fri, Apr 20, 2018 at 11:29 AM, Charles Alva <charlesa...@gmail.com> wrote:
> > Marc,
> >
> > Thanks.
> >
> > The mgr log spam occurs even without the dashboard module enabled. I never
> > checked the ceph-mgr log before because the ceph cluster is always healthy.
> > Based on the ceph-mgr logs in syslog, the spam occurred long before and
> > after I enabled the dashboard module.
> >
> >> # ceph -s
> >>   cluster:
> >>     id:     xxx
> >>     health: HEALTH_OK
> >>
> >>   services:
> >>     mon: 3 daemons, quorum mon1,mon2,mon3
> >>     mgr: mon1(active), standbys: mon2, mon3
> >>     mds: cephfs-1/1/1 up {0=mds1=up:active}, 2 up:standby
> >>     osd: 14 osds: 14 up, 14 in
> >>     rgw: 3 daemons active
> >>
> >>   data:
> >>     pools:   10 pools, 248 pgs
> >>     objects: 546k objects, 2119 GB
> >>     usage:   6377 GB used, 6661 GB / 13039 GB avail
> >>     pgs:     248 active+clean
> >>
> >>   io:
> >>     client: 25233 B/s rd, 1409 kB/s wr, 6 op/s rd, 59 op/s wr
> >
> > My ceph-mgr log is spammed with the following every second. This happens on 2
> > separate Ceph 12.2.4 clusters.
>
> (I assume that the mon, mgr and mds are all 12.2.4)
>
> The "failed to return metadata" part is kind of mysterious. Do you
> also get an error if you try to do "ceph mds metadata mds1" by hand?
> (that's what the mgr is trying to do).
>
> If the metadata works when using the CLI by hand, you may have an
> issue with the mgr's auth caps, check that its mon caps are set to
> "allow profile mgr".
>
> The "unhandled message" part is from a path where the mgr code is
> ignoring messages from services that don't have any metadata (I think
> this is actually a bug, as we should be considering these messages as
> handled even if we're ignoring them).
>
> John
>
> >> # less +F /var/log/ceph/ceph-mgr.mon1.log
> >>
> >> ...
> >> 2018-04-20 06:21:18.782861 7fca238ff700  1 mgr send_beacon active
> >> 2018-04-20 06:21:19.050671 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897d1c00 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:19.051047 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:20.050889 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897eac00 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:20.051351 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:20.784455 7fca238ff700  1 mgr send_beacon active
> >> 2018-04-20 06:21:21.050968 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897d0d00 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:21.051441 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:22.051254 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897ec100 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:22.051704 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:22.786656 7fca238ff700  1 mgr send_beacon active
> >> 2018-04-20 06:21:23.051235 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897d0400 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:23.051712 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:24.051353 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897e6000 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:24.051971 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:24.788228 7fca238ff700  1 mgr send_beacon active
> >> 2018-04-20 06:21:25.051642 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897d1900 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:25.052182 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> 2018-04-20 06:21:26.051641 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf89835600 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >> 2018-04-20 06:21:26.052169 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >> ...
> >
> > Kind regards,
> >
> > Charles Alva
> > Sent from Gmail Mobile
> >
> > On Fri, Apr 20, 2018 at 10:57 AM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
> >>
> >> Hi Charles,
> >>
> >> I am more or less responding to your syslog issue. I don't have the
> >> experience on cephfs to give you a reliable advice. So lets wait for the
> >> experts to reply.
> >> But I guess you have to give a little more background info, like:
> >>
> >> This happened to a running cluster, you didn't apply any changes to?
> >> Looks like your dashboard issue is not related to "1 mgr finish mon
> >> failed to return metadata for mds.mds1"
> >>
> >>
> >> -----Original Message-----
> >> From: Charles Alva [mailto:charlesa...@gmail.com]
> >> Sent: Friday, 20 April 2018 10:33
> >> To: Marc Roos
> >> Cc: ceph-users
> >> Subject: Re: [ceph-users] Ceph 12.2.4 MGR spams syslog with "mon failed
> >> to return metadata for mds"
> >>
> >> Hi Marc,
> >>
> >> I'm using CephFS and mgr could not get the metadata of the mds. I
> >> enabled the dashboard module and every time I visit the ceph filesystem
> >> page, it got internal error 500.
> >>
> >> Kind regards,
> >>
> >> Charles Alva
> >> Sent from Gmail Mobile
> >>
> >>
> >> On Fri, Apr 20, 2018 at 9:24 AM, Marc Roos <m.r...@f1-outsourcing.eu> wrote:
> >>
> >>     Remote syslog server, and buffering writes to the log?
> >>
> >>     Actually this is another argument to fix logging to syslog a bit,
> >>     because the default syslog is also set to throttle and group the
> >>     messages like:
> >>
> >>     Mar 9 17:59:35 db1 influxd: last message repeated 132 times
> >>
> >>     https://www.mail-archive.com/ceph-users@lists.ceph.com/msg45025.html
> >>
> >>
> >>     -----Original Message-----
> >>     From: Charles Alva [mailto:charlesa...@gmail.com]
> >>     Sent: Friday, 20 April 2018 8:08
> >>     To: ceph-users@lists.ceph.com
> >>     Subject: [ceph-users] Ceph 12.2.4 MGR spams syslog with "mon failed
> >>     to return metadata for mds"
> >>
> >>     Hi All,
> >>
> >>     Just noticed on 2 Ceph Luminous 12.2.4 clusters, Ceph mgr spams the
> >>     syslog with lots of "mon failed to return metadata for mds" every
> >>     second.
> >>
> >>     ```
> >>     2018-04-20 06:06:03.951412 7fca238ff700  1 mgr send_beacon active
> >>     2018-04-20 06:06:04.934477 7fca14809700  0 ms_deliver_dispatch: unhandled message 0x55bf897f0a00 mgrreport(mds.mds1 +24-0 packed 214) v5 from mds.0 10.100.100.114:6800/4132681434
> >>     2018-04-20 06:06:04.934937 7fca25102700  1 mgr finish mon failed to return metadata for mds.mds1: (2) No such file or directory
> >>     ```
> >>
> >>     How to fix this issue? Or disable it completely to reduce disk IO
> >>     and increase SSD life span?
> >>
> >>     Kind regards,
> >>
> >>     Charles Alva
> >>     Sent from Gmail Mobile
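PS: Since only the active MDS is missing its metadata, one thing I may try next is bouncing mds1 so it re-registers with the mons, on the assumption (just a guess on my part) that the daemon's metadata gets resent when it rejoins. A rough sketch:

```
# fail rank 0 (currently held by mds1) so one of the standbys takes over
ceph mds fail 0

# then restart the old daemon on the mds1 host so it comes back as a standby
systemctl restart ceph-mds@mds1
```

If the mgr log spam stops after that, it would point at stale or missing metadata on the mon side rather than an auth problem.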