Fellow cephalopods,

I'm trying to get quick, seamless NFS failover working on my four-node Ceph cluster.

I followed the instructions here:

but testing shows that failover doesn't happen. When I placed node 2 ("san2") in maintenance mode, the NFS service shut down:

Aug 24 14:19:03 san2 
24/08/2023 04:19:03 : epoch 64b8af5a : san2 : ganesha.nfsd-8[Admin] do_shutdown 
:MAIN :EVENT :Removing all exports.
Aug 24 14:19:13 san2 bash[3235994]: time="2023-08-24T14:19:13+10:00" level=warning 
msg="StopSignal SIGTERM failed to stop container 
ceph-e2f1b934-ed43-11ec-80fa-04421a1a1d66-nfs-xcpnfs-1-0-san2-datsvq in 10 seconds, resorting to 
Aug 24 14:19:13 san2 bash[3235994]: 
Aug 24 14:19:13 san2 
 Main process exited, code=exited, status=137/n/a
Aug 24 14:19:14 san2 
 Failed with result 'exit-code'.
Aug 24 14:19:14 san2 systemd[1]: Stopped Ceph nfs.xcpnfs.1.0.san2.datsvq for 

And that's it. The ingress IP didn't move.

Stranger still, the cluster seems to have placed the ingress IP on node 1 ("san1"), yet it appears to be using the NFS service on node 2.

Do I need to tie the NFS service more tightly to the keepalived and haproxy services, or do I need to expand the ingress services to refer to multiple NFS services?
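
For readers unfamiliar with the setup: a cephadm ingress service (haproxy plus keepalived) in front of an NFS cluster is usually described by a service spec of roughly the following shape. This is only an illustrative sketch, not the configuration from this cluster. The virtual IP, ports and placement count are placeholder assumptions; only the name xcpnfs is taken from the daemon name in the log above.

```yaml
# Illustrative cephadm ingress spec (placeholder values, not this cluster's config)
service_type: ingress
service_id: nfs.xcpnfs           # assumed id, derived from the NFS cluster name in the log
placement:
  count: 2                       # assumption: hosts running haproxy + keepalived
spec:
  backend_service: nfs.xcpnfs    # the NFS service that haproxy forwards to
  frontend_port: 2049            # port clients connect to on the virtual IP
  monitor_port: 9049             # haproxy status port (placeholder)
  virtual_ip: 192.0.2.10/24      # placeholder address from a documentation range
```

The backend_service field is what links the ingress service to the NFS service, so it is the field the question above is really about.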

Thank you.



Thorne Lawler - Senior System Administrator
*DDNS* | ABN 76 088 607 265
First registrar certified ISO 27001-2013 Data Security Standard ITGOV40172
P +61 499 449 170

