Dear cephers,

I was doing some maintenance yesterday involving shutdown/power-up cycles of 
ceph servers. With the last server I ran into a problem. The server runs an MDS 
and a couple of OSDs. After the reboot, the MDS joined the MDS cluster without 
problems, but the OSDs didn't come up. This was 1 out of 12 servers and I had 
no such problems with the other 11. I also observed that "ceph status" was 
responding very slowly.

Upon further inspection, I found that 2 of my 3 MONs (the leader and a 
peon) were running at 100% CPU. Client I/O continued, probably because the 
last cluster map remained valid. On our node performance monitoring, I could 
see that the 2 busy MONs were generating extraordinary network traffic.

This state lasted for over one hour. After the MONs settled down, the OSDs 
finally joined as well and everything went back to normal.

The only other time I have seen similar behaviour was when I restarted a MON on 
an empty disk and the re-sync was extremely slow due to a too-large value of 
mon_sync_max_payload_size. This time, however, I'm pretty sure it was MON-client 
communication; see below.

Are there any settings similar to mon_sync_max_payload_size that could 
influence the responsiveness of the MONs in this way?
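
For reference, this is roughly how such a setting can be inspected and lowered; 
the value shown is only an example, not a recommendation, and mon.ceph-01 stands 
for whichever MON you query:

  # show the value the running MON actually uses (admin socket on the MON host)
  ceph daemon mon.ceph-01 config get mon_sync_max_payload_size

  # lower it cluster-wide via the config database
  ceph config set mon mon_sync_max_payload_size 4096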

Why do I suspect it is MON-client communication? In our monitoring, I do not 
see the huge number of packets sent by the MONs arriving at any other ceph 
daemon. They seem to be distributed over the client nodes, but since we have a 
large number of client nodes (>550), the traffic is hidden in the background 
network noise. A second clue is that I have had such extended lock-ups before 
and, whenever I checked, I only observed them when the leader held a large 
share of the client sessions.
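
One way to check this on the busy MON host would be to look at which peers its 
TCP connections belong to. A rough sketch, assuming the MONs listen on the 
default v1 port 6789 (as is the case on Mimic):

  # count established connections to the MON, grouped by peer host
  ss -tn state established '( sport = :6789 )' \
    | awk 'NR>1 {print $4}' | cut -d: -f1 | sort | uniq -c | sort -rn | head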

For example, yesterday the client session count per MON was:

ceph-01: 1339 (leader)
ceph-02:  189 (peon)
ceph-03:  839 (peon)
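
In case anyone wants to check their own distribution, something like the 
following should give a comparable per-MON count (the layout of the sessions 
dump differs between releases, so the grep is only an approximation):

  # run on each MON host against its admin socket
  ceph daemon mon.ceph-01 sessions | grep -c client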

I usually restart the leader when such a critical distribution occurs. As long 
as the leader has the fewest client sessions, I never observe this problem.
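
The restart itself is nothing special; on the leader's host I simply bounce the 
MON daemon, assuming the standard systemd unit naming:

  systemctl restart ceph-mon@ceph-01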

Ceph version is 13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic 
(stable).

Thanks for any clues!

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14