Hi Frank,
It's possible that certain parameters you modified at some point, which may
have helped the MDS to start up, are now slowing down its operation or
preventing it from going further. In that case, resetting these parameters to
their default values could help. Just a thought.
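
A minimal sketch of how such a reset could be prepared, assuming the overrides live in the central config database and that `ceph config dump --format json` returns entries with `section` and `name` fields. The helper name `build_reset_commands` and the sample values are hypothetical and not taken from this thread. It only prints the `ceph config rm` commands for review and does not run them:

```python
import json


def build_reset_commands(config_dump_json, section="mds"):
    """Build one `ceph config rm` argv list per override set for `section`.

    Matches the section itself (e.g. "mds") and daemon-specific
    sections below it (e.g. "mds.a"). Nothing is executed here.
    """
    commands = []
    for entry in json.loads(config_dump_json):
        who = entry.get("section", "")
        name = entry.get("name", "")
        if name and (who == section or who.startswith(section + ".")):
            commands.append(["ceph", "config", "rm", who, name])
    return commands


# Hypothetical sample in the assumed shape of `ceph config dump --format json`.
sample = json.dumps([
    {"section": "mds", "name": "mds_beacon_grace", "value": "600"},
    {"section": "osd", "name": "osd_max_backfills", "value": "2"},
    {"section": "mds.a", "name": "mds_cache_memory_limit", "value": "8589934592"},
])

for argv in build_reset_commands(sample):
    print(" ".join(argv))
```

Removing an override with `ceph config rm` makes the option fall back to its default, so the printed commands can be applied one at a time while watching the MDS.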
Another th
Hi Eugen,
as promised, here is the result. Unfortunately, increasing this parameter does not
seem to help. It was worth a try, though. I will keep the MDS running and check again
tomorrow.
It's really annoying that it doesn't come back. Following the reports of other
people who were in a similar situation it sh
Hi Eugen,
thanks and yes, let's try one thing at a time. I will report back.
Best regards,
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
From: Eugen Block
Sent: Saturday, January 11, 2025 10:39 PM
To: Frank Schilder
Cc: ceph-users
Personally, I would only try one change at a time and wait for a
result. Otherwise it can get difficult to tell what exactly helped and
what did not.
I have not worked with auth_service_ticket_ttl yet, so I can only
refer to the docs here:
When the Ceph Storage Cluster sends a ticket for auth
Hi Eugen,
thanks for your reply! It's a long shot, but worth trying. I will give it a go.
Since you are following: I also observed cephx timeouts. I'm considering
increasing the TTL for auth tickets. Do you think auth_service_ticket_ttl
(default 3600) is the right parameter? If so, can I just c
But why do you need to disable selinux for the service to work? You
shouldn't have an issue.
On Fri, Jan 10, 2025, 6:20 PM Jorge Garcia wrote:
> Actually, stupid mistake on my part. I had selinux mode as enforcing.
> Changed it to disabled, and everything works again. Thanks for the
> help!
Hi Frank,
not sure if this has already been mentioned, but this one has a 60-second
timeout:
mds_beacon_mon_down_grace
ceph config help mds_beacon_mon_down_grace
mds_beacon_mon_down_grace - tolerance in seconds for missed MDS
beacons to monitors
(secs, advanced)
Default: 60
Can updat
And another small piece of information:
Needed to do another restart. This time I managed to capture the approximate
length of the period for which the MDS is up and responsive after loading the
cache (it reports stats). It's pretty much exactly 60 seconds. This smells like
a timeout. Is there a
Hi all,
my hopes are down again. The MDS might look busy, but I'm not sure it's doing
anything interesting. I now see a lot of these in the log (stripped the
heartbeat messages):
2025-01-11T12:35:50.712+0100 7ff888375700 -1 monclient: _check_auth_rotating
possible clock skew, rotating keys expir
Hi all,
new update: after I got some sleep following the final MDS restart, the MDS is
doing something! It is still unresponsive, but it shows a CPU load of
150-200%, and I really, really hope that this is the trimming of stray items.
I will try to find out if I get perf to work inside the container