I couldn't get it to work properly either. Although a different host gets the virtual IP assigned, and I can even ls the mounted directory and remount it, I can't write anything. Only after the host comes back from maintenance do the write requests continue. That doesn't really feel like highly available, tbh.

Quoting "Devin A. Bougie" <devin.bou...@cornell.edu>:

Thanks, Alexander! We’re running fully-updated AlmaLinux 9.5 on both the servers and the clients.

I thought we’d give the Ceph NFS service a try, but certainly have more experience with pacemaker / corosync (and standalone NFS servers). I guess we’ll go that route unless anyone else has any ideas.

And just one more quick update: after some offline exchanges we removed the count limit and now have multiple ingress.nfs.cephfs service instances. That hasn't changed the behavior with respect to losing one of the backend nfs.cephfs daemons, however.

———
[root@cephman1 ~]# ceph orch ls --service_name=ingress.nfs.cephfs --export
service_type: ingress
service_id: nfs.cephfs
service_name: ingress.nfs.cephfs
placement:
  label: _admin
spec:
  backend_service: nfs.cephfs
  first_virtual_router_id: 50
  frontend_port: 2049
  monitor_port: 9049
  virtual_ip: virtual_ip/prefix

[root@cephman1 ~]# ceph orch ls --service_name=nfs.cephfs --export
service_type: nfs
service_id: cephfs
service_name: nfs.cephfs
placement:
  label: _admin
spec:
  port: 12049
———

Thanks again,
Devin

On Apr 22, 2025, at 7:33 PM, Alexander Patrakov <patra...@gmail.com> wrote:

Hello Devin,

An important additional detail is missing: which OS is used as a client?

And yes, my default recommendation would be to move the NFS server out
of the Ceph cluster.

On Wed, Apr 23, 2025 at 6:29 AM Devin A. Bougie
<devin.bou...@cornell.edu> wrote:

Hello,

We’ve found that if we lose one of the nfs.cephfs service daemons in our cephadm 19.2.2 cluster, all NFS traffic is blocked until either:
- the down nfs.cephfs daemon is restarted
- or we reconfigure the placement of the nfs.cephfs service to exclude the affected host. After this, the ingress.nfs.cephfs service is automatically reconfigured and everything resumes
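For reference, the second workaround amounts to reapplying the nfs service spec with an explicit host list that omits the failed node. A rough sketch (the host names are placeholders, not from our cluster), applied with `ceph orch apply -i nfs-cephfs.yaml`:

```yaml
# Hypothetical reapplied spec: pin nfs.cephfs to the surviving
# hosts only (cephman2/cephman3 are placeholder names), replacing
# the original "label: _admin" placement that included the dead host.
service_type: nfs
service_id: cephfs
placement:
  hosts:
    - cephman2
    - cephman3
spec:
  port: 12049
```

Once the failed host is back, the placement can be reverted to the label-based spec.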

Our current setup follows the "HIGH-AVAILABILITY NFS" documentation, which gives us an ingress.nfs.cephfs service with the haproxy and keepalived daemons and an nfs.cephfs service for the actual NFS daemons. This service was deployed using: ceph nfs cluster create cephfs "label:_admin" --ingress --virtual_ip virtual_ip

And then we updated the ingress.nfs.cephfs service to only deploy a single service (which in this case, results in two daemons on a single host).

This gives us the following:
———
[root@cephman1 ~]# ceph orch ls --service_name=ingress.nfs.cephfs --export
service_type: ingress
service_id: nfs.cephfs
service_name: ingress.nfs.cephfs
placement:
  count: 1
  label: _admin
spec:
  backend_service: nfs.cephfs
  first_virtual_router_id: 50
  frontend_port: 2049
  monitor_port: 9049
  virtual_ip: virtual_ip/prefix

[root@cephman1 ~]# ceph orch ls --service_name=nfs.cephfs --export
service_type: nfs
service_id: cephfs
service_name: nfs.cephfs
placement:
  label: _admin
spec:
  port: 12049
———

Can anyone show us the config for a true "HA" NFS service where they can lose any single host without impacting access to the NFS export from clients? I would expect to be able to lose the host running the ingress.nfs.cephfs service, and have it automatically restarted on a different host. Likewise, I would expect to be able to lose an nfs.cephfs daemon without impacting access to the export.

Or should we be taking a completely different approach and move our NFS service out of Ceph and into our pacemaker / corosync cluster?

Sorry if this sounds redundant to questions I’ve previously asked, but we’ve reconfigured things a little and it feels like we’re getting closer with each attempt.

Many thanks,
Devin
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



--
Alexander Patrakov



