I tried something else, but the result is not really satisfying. I edited the keepalived.conf files that had no peers at all or only one peer, so that they are now all identical. After restarting the daemons, only one virtual IP is assigned, the daemons now communicate with each other, and I see messages like these:

Master received advert from 192.168.168.112 with same priority 80 but higher IP address than ours
Entering BACKUP STATE
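
For reference, this is roughly what the relevant vrrp_instance block in keepalived.conf looks like now, shown for one host (everything except the 192.168.168.112 address from the log above is a placeholder from my setup, and the cephadm-generated file contains a few more options):

vrrp_instance VI_0 {
    state BACKUP
    priority 80                      # same priority on all hosts, per the log above
    interface eth0                   # placeholder, whatever NIC should hold the virtual IP
    virtual_router_id 50
    advert_int 1
    unicast_src_ip 192.168.168.111   # this host's own address
    unicast_peer {
        192.168.168.112              # the other keepalived hosts, as Robert suggested
        192.168.168.113
    }
    virtual_ipaddress {
        192.168.168.200/24 dev eth0  # placeholder virtual IP
    }
}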

So the keepalived instances see each other now, which is good. But powering off the machine with the active NFS daemon doesn't produce the expected result: although keepalived assigns the virtual IP to a different host, the failed NFS daemon is redeployed on the third node, so the virtual IP and the NFS server end up on different hosts and mounting is not possible.

To prevent that from happening, I reduced the number of hosts for both the nfs and the ingress service to two, and that seems to work as expected (after modifying the keepalived.conf again). But all in all, the keepalive_only option seems to involve a bit too much manual work at this point.
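
In case someone wants to reproduce the two-host setup, the specs now look something like this (the cluster name ebl-nfs-cephfs is the real one from the export below, but the hosts and the virtual IP are placeholders):

service_type: nfs
service_id: ebl-nfs-cephfs
placement:
  hosts:
    - ceph01
    - ceph02
spec:
  port: 2049
  virtual_ip: 192.168.168.200        # placeholder VIP, ganesha binds to it directly
---
service_type: ingress
service_id: nfs.ebl-nfs-cephfs
placement:
  hosts:
    - ceph01
    - ceph02
spec:
  backend_service: nfs.ebl-nfs-cephfs
  virtual_ip: 192.168.168.200/24     # same placeholder VIP, with prefix length
  keepalive_only: true               # deploy only keepalived, no haproxy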

And just a side note: I don't see any connected client even though I am writing data into the NFS export. Both the dashboard and the CLI show no clients:

ceph nfs export info ebl-nfs-cephfs /nfsovercephfs
{
  "access_type": "RW",
  "clients": [],
  "cluster_id": "ebl-nfs-cephfs",
...

I only see the active NFS daemon itself as a CephFS client.
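
If anyone wants to cross-check that on their own cluster, the CephFS client sessions can be listed on the MDS, for example (the MDS daemon name is just a placeholder):

# overview including the number of connected CephFS clients
ceph fs status

# list the individual client sessions on the active MDS;
# the nfs daemon shows up here as a regular client
ceph tell mds.cephfs.ceph01.abcdef client ls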


Quoting Eugen Block <ebl...@nde.ag>:

Thanks, I removed the ingress service and redeployed it, with the same result. The interesting part is that the configs are identical to the previous deployment, i.e. the same peers (or missing peers) as before.
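
For the record, the redeploy was essentially just removing and re-applying the ingress spec, along these lines (the exact service name and spec file name are placeholders):

ceph orch rm ingress.nfs.ebl-nfs-cephfs
ceph orch apply -i ingress-nfs.yaml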

Quoting Robert Sander <r.san...@heinlein-support.de>:

On 3/25/25 at 18:55, Eugen Block wrote:
Okay, so I don't see anything in the keepalived logs about the instances communicating with each other. The config files are almost identical: no difference in priority, but they do differ in unicast_peer. ceph03 has no unicast_peer entry at all, ceph02 has only ceph03 in there, while ceph01 has both of the other hosts in its unicast_peer entry. That's weird, isn't it?

They should each have the other two hosts as unicast_peers.
There must have been a glitch in the service generation. Maybe you should try to remove the service and deploy it anew?

Regards
--
Robert Sander
Linux Consultant

Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

https://www.heinlein-support.de

Tel: +49 30 405051 - 0
Fax: +49 30 405051 - 19

District Court (Amtsgericht) Berlin-Charlottenburg - HRB 220009 B
Managing Director: Peer Heinlein - Registered office: Berlin

