Hi,
> Like last time, after I restarted all five MONs, the store size
> decreased and everything went back to normal. I also had to restart MGRs
> and MDSs afterwards. This starts looking like a bug to me.
In our case, we had real database corruption in RocksDB that
caused version counter
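If a full MON restart is what shrinks the store, a manual compaction may
achieve the same without restarting; a minimal sketch, assuming the cluster
is healthy enough to accept the command:
  # Trigger an online RocksDB compaction on one MON
  ceph tell mon.$(hostname -s) compact
  # Or have every MON compact its store at startup (ceph.conf, [mon] section)
  mon_compact_on_start = true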
We just had the same problem again after a power outage that took out
62% of our cluster and three out of five MONs. Once everything was back
up, the MONs started lagging and piling up slow ops while the MON store
was growing to double-digit gigabytes. It was so bad that I couldn't
even list the
Since the full cluster restart and disabling logging to syslog, it's not
a problem any more (for now).
Unfortunately, just disabling clog_to_monitors didn't have the desired
effect when I tried it yesterday. But I also believe that it is somehow
related. I could not find any specific reason for
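For reference, the settings discussed above can be flipped at runtime; a
sketch, assuming a release with the centralized config database:
  # Stop daemons from forwarding their log entries to the MONs
  ceph config set global clog_to_monitors false
  # Stop daemon and cluster logging to syslog
  ceph config set global log_to_syslog false
  ceph config set global mon_cluster_log_to_syslog false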
On Thu, Feb 25, 2021 at 08:58:01PM +0100, Janek Bevendorff wrote:
> On the first MON, the command doesn’t even return, but I was able to
> get a dump from the one I restarted most recently. The oldest ops
> look like this:
>
> {
> "description": "log(1000 entries from seq 17876238 at
> 2021-02-25T15:13:20.306487+0100)",
> On 25. Feb 2021, at 22:17, Dan van der Ster wrote:
>
> Also did you solve your log spam issue here?
> https://tracker.ceph.com/issues/49161
> Surely these things are related?
No. But I noticed that DBG log spam only happens when log_to_syslog is enabled.
systemd is smart enough to avoid fi
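To verify what a running MON actually has set, a sketch assuming admin
socket access on the MON host:
  ceph daemon mon.$(hostname -s) config show | grep -E 'syslog|clog'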
Also did you solve your log spam issue here?
https://tracker.ceph.com/issues/49161
Surely these things are related?
You might need to share more complete logs from the cluster, mon, osd, mds,
and mgr so that we can help get to the bottom of this.
-- dan
On Thu, Feb 25, 2021 at 10:04 PM Janek Bevendorff
wrote:
Thanks, I’ll try that tomorrow.
> On 25. Feb 2021, at 21:59, Dan van der Ster wrote:
>
> Maybe the debugging steps in that insights tracker can be helpful
> anyway: https://tracker.ceph.com/issues/39955
>
> -- dan
>
> On Thu, Feb 25, 2021 at 9:27 PM Janek Bevendorff
> wrote:
>>
>> Thanks for the tip, but I do not have degraded PGs and the module is
>> already disabled.
Maybe the debugging steps in that insights tracker can be helpful
anyway: https://tracker.ceph.com/issues/39955
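If the store really is bloated by one kind of record, counting keys per
prefix can narrow it down; a sketch using ceph-monstore-tool against the
default data path (run it on a stopped MON or a copy of the store, since it
opens the RocksDB directly):
  ceph-monstore-tool /var/lib/ceph/mon/ceph-$(hostname -s) dump-keys \
    | awk '{print $1}' | sort | uniq -c | sort -rn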
-- dan
On Thu, Feb 25, 2021 at 9:27 PM Janek Bevendorff
wrote:
>
> Thanks for the tip, but I do not have degraded PGs and the module is already
> disabled.
>
>
> On 25. Feb 2021, at 21:17, Seena Fallah wrote:
Thanks for the tip, but I do not have degraded PGs and the module is already
disabled.
> On 25. Feb 2021, at 21:17, Seena Fallah wrote:
>
> I had the same problem in my cluster and it was because of the insights mgr
> module that was storing lots of data in RocksDB because my cluster was
> degraded.
I had the same problem in my cluster and it was because of the insights mgr
module that was storing lots of data in RocksDB because my cluster was
degraded.
If you have degraded PGs, try disabling the insights module.
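For completeness, the corresponding commands, as a sketch:
  # Check whether any PGs are degraded
  ceph pg stat
  # Disable the insights module
  ceph mgr module disable insights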
On Thu, Feb 25, 2021 at 11:40 PM Dan van der Ster
wrote:
> > "source": "osd.104...
Nothing special is going on on that OSD as far as I can tell, and the OSD
number of each op is different.
The config isn’t entirely default, but we have been using it successfully for
quite a while. It basically just redirects everything to journald so that we
don’t have log creep. I reverted it nonetheless.
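For context, such a journald redirect plausibly looks like the following
ceph.conf fragment (a sketch, not necessarily our exact config); under
systemd, anything written to stderr lands in the journal:
  [global]
  log_to_file = false
  mon_cluster_log_to_file = false
  log_to_stderr = true
  err_to_stderr = true
  mon_cluster_log_to_stderr = true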
> "source": "osd.104...
What's happening on that OSD? Is it something new which corresponds to when
your mon started growing? Are other OSDs also flooding the mons with logs?
I'm mobile so can't check... Are those logging configs the defaults? If not,
revert to the defaults...
BTW do your mons ha
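One way to check both questions, as a sketch (a systemd deployment is
assumed for the journalctl line):
  # The most recent cluster log entries the MONs received, with their sources
  ceph log last 100
  # What osd.104 itself has been emitting lately
  journalctl -u ceph-osd@104 --since '1 hour ago' | tail -n 50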
Thanks, Dan.
On the first MON, the command doesn’t even return, but I was able to get a dump
from the one I restarted most recently. The oldest ops look like this:
{
"description": "log(1000 entries from seq 17876238 at
2021-02-25T15:13:20.306487+0100)",
"initiat
ceph daemon mon.`hostname -s` ops
That should show you the accumulating ops.
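If the dump is large, sorting by initiation time makes the oldest ops easy
to spot; a sketch assuming jq and the usual ops-dump layout:
  ceph daemon mon.$(hostname -s) ops \
    | jq '.ops | sort_by(.initiated_at) | .[:5] | .[].description'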
-- dan
On Thu, Feb 25, 2021, 8:23 PM Janek Bevendorff <
janek.bevendo...@uni-weimar.de> wrote:
> Hi,
>
> All of a sudden, we are experiencing very concerning MON behaviour. We
> have five MONs and all of them have tho
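A sketch for keeping an eye on the store size while this happens, assuming
the default data paths:
  # Per-MON store size on disk
  du -sh /var/lib/ceph/mon/*/store.db
  # Ceph's own view; a MON_DISK_BIG warning appears once a store grows too large
  ceph health detail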