Hi Eugen,

I’m not sure if this helps, and I would greatly appreciate any suggestions for improving our setup, but so far we’ve had good luck with our service deployed using:

ceph nfs cluster create cephfs "label:_admin" --ingress --virtual_ip virtual_ip
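(For anyone who wants to reproduce the placement change described just below: one way to do it is to export the spec the orchestrator generated, edit it, and re-apply it. A rough sketch, with the file name being just an example:

ceph orch ls --service_name=nfs.cephfs --export > nfs.cephfs.yaml
# edit the placement section, e.g. set it to "label: osd"
ceph orch apply -i nfs.cephfs.yaml
)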
And then we manually updated the nfs.cephfs service this created to place the nfs daemons on our OSD nodes. This gives us the following:

———
service_type: ingress
service_id: nfs.cephfs
service_name: ingress.nfs.cephfs
placement:
  label: _admin
spec:
  backend_service: nfs.cephfs
  first_virtual_router_id: 50
  frontend_port: 2049
  monitor_port: 9049
  virtual_ip: virtual_ip/prefix

service_type: nfs
service_id: cephfs
service_name: nfs.cephfs
placement:
  label: osd
spec:
  port: 12049
———

Given that we have 5 dedicated management / admin nodes and 5 separate OSD nodes, we then have:

———
[root@cephadmin1 ~]# ceph orch ls --service_name=ingress.nfs.cephfs
NAME                PORTS                 RUNNING  REFRESHED  AGE  PLACEMENT
ingress.nfs.cephfs  virtual_ip:2049,9049  10/10    8m ago     10w  label:_admin

[root@cephadmin1 ~]# ceph orch ls --service_name=nfs.cephfs
NAME        PORTS    RUNNING  REFRESHED  AGE  PLACEMENT
nfs.cephfs  ?:12049  5/5      8m ago     7w   label:osd
———

At least during testing, failover seemed to work properly. We’re still very new to Ceph, however, so we would greatly appreciate knowing if anyone sees any problems with this setup or has suggestions for improvement. For example, we’re still unsure whether it would be better to have the ingress.nfs.cephfs and nfs.cephfs services running on the same nodes, whether one or both should be running on the dedicated OSD nodes, etc.

Thanks!
Devin

> On Mar 25, 2025, at 11:18 AM, Eugen Block <ebl...@nde.ag> wrote:
>
> Thanks, Adam.
> I just tried it with 3 keepalive daemons and one nfs daemon, it doesn't really work because all three hosts have the virtual IP assigned, preventing my client from mounting. So this doesn't really work as a workaround, it seems. I feel like the proper solution would be to include keepalive in the list of RESCHEDULE_FROM_OFFLINE_HOSTS_TYPES.
>
> Zitat von Adam King <adk...@redhat.com>:
>
>> Which daemons get moved around like that is controlled by https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/utils.py#L30, which appears to only include nfs and haproxy, so maybe this keepalive-only case was missed in that sense. I do think that you could alter the placement of the ingress service to just match all the hosts and it ought to work, though. The only reason we require specifying a set of hosts with a count field lower than the number of hosts matching the placement for nfs is that ganesha has no transparent state migration. Keepalive, on the other hand, should work fine with "extra" daemons deployed in order to be highly available.
>>
>> On Tue, Mar 25, 2025 at 10:06 AM Malte Stroem <malte.str...@gmail.com> wrote:
>>
>>> Hi Eugen,
>>>
>>> yes, for me it's kind of a "test setting" for small setups.
>>>
>>> Doc says:
>>>
>>> Setting --ingress-mode keepalive-only deploys a simplified ingress service that provides a virtual IP with the nfs server directly binding to that virtual IP and leaves out any sort of load balancing or traffic redirection. This setup will restrict users to deploying only 1 nfs daemon as multiple cannot bind to the same port on the virtual IP.
>>>
>>> Best,
>>> Malte
>>>
>>>
>>> On 25.03.25 13:46, Eugen Block wrote:
>>> > Yeah, it seems to work without the "keepalive-only" flag, at least from a first test. So keepalive-only is not working properly, it seems? Should I create a tracker for that or am I misunderstanding its purpose?
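For what it's worth, a quick way to watch what is happening during this kind of failover test is to check where the orchestrator has placed the keepalived and nfs daemons and which host currently holds the virtual IP. A small sketch using standard commands (the IP below is the one from Eugen's spec further down):

ceph orch ps --daemon_type keepalived
ceph orch ps --daemon_type nfs
# on each host, see whether it currently holds the virtual IP:
ip -brief addr show | grep 192.168.168.114

If keepalived is not failed over, the host holding the virtual IP and the host running the nfs daemon will differ, which matches what Eugen describes.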
>>> >
>>> > Zitat von Malte Stroem <malte.str...@gmail.com>:
>>> >
>>> >> Hi Eugen,
>>> >>
>>> >> try omitting
>>> >>
>>> >> --ingress-mode keepalive-only
>>> >>
>>> >> like this
>>> >>
>>> >> ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03" --ingress --virtual_ip "192.168.168.114/24"
>>> >>
>>> >> Best,
>>> >> Malte
>>> >>
>>> >> On 25.03.25 13:25, Eugen Block wrote:
>>> >>> Thanks for your quick response. The specs I pasted are actually the result of deploying a nfs cluster like this:
>>> >>>
>>> >>> ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03" --ingress --virtual_ip 192.168.168.114 --ingress-mode keepalive-only
>>> >>>
>>> >>> I can try redeploying it via the dashboard, but I don't have a lot of confidence that it will work differently with a failover.
>>> >>>
>>> >>> Zitat von Malte Stroem <malte.str...@gmail.com>:
>>> >>>
>>> >>>> Hi Eugen,
>>> >>>>
>>> >>>> try deploying the NFS service like this:
>>> >>>>
>>> >>>> https://docs.ceph.com/en/latest/mgr/nfs/
>>> >>>>
>>> >>>> Some had only success deploying it via the dashboard.
>>> >>>>
>>> >>>> Best,
>>> >>>> Malte
>>> >>>>
>>> >>>> On 25.03.25 13:02, Eugen Block wrote:
>>> >>>>> Hi,
>>> >>>>>
>>> >>>>> I'm re-evaluating NFS again, testing on a virtual cluster with 18.2.4. For now, I don't need haproxy, so I use "keepalive_only: true" as described in the docs [0]. I first create the ingress service, wait for it to start, then create the nfs cluster. I've added the specs at the bottom.
>>> >>>>>
>>> >>>>> I can mount the export with the virtual IP. Then I just shut down the VM where the nfs service was running; the orchestrator successfully starts a nfs daemon elsewhere, but the keepalive daemon is not failed over. So mounting or accessing the export is impossible, of course. And after I power up the offline host again, nothing is "repaired": keepalive and nfs run on different servers until I intervene manually. This doesn't seem to work as expected. Is this a known issue (I couldn't find anything on the tracker)? I have my doubts, but maybe it works better with haproxy? Or am I missing something in my configuration?
>>> >>>>> I haven't tried with a newer release yet. I'd appreciate any comments.
>>> >>>>>
>>> >>>>> Thanks,
>>> >>>>> Eugen
>>> >>>>>
>>> >>>>> ---snip---
>>> >>>>> service_type: ingress
>>> >>>>> service_id: nfs.ebl-nfs-cephfs
>>> >>>>> service_name: ingress.nfs.ebl-nfs-cephfs
>>> >>>>> placement:
>>> >>>>>   count: 1
>>> >>>>>   hosts:
>>> >>>>>   - ceph01
>>> >>>>>   - ceph02
>>> >>>>>   - ceph03
>>> >>>>> spec:
>>> >>>>>   backend_service: nfs.ebl-nfs-cephfs
>>> >>>>>   first_virtual_router_id: 50
>>> >>>>>   keepalive_only: true
>>> >>>>>   monitor_port: 9049
>>> >>>>>   virtual_ip: 192.168.168.114/24
>>> >>>>>
>>> >>>>> service_type: nfs
>>> >>>>> service_id: ebl-nfs-cephfs
>>> >>>>> service_name: nfs.ebl-nfs-cephfs
>>> >>>>> placement:
>>> >>>>>   count: 1
>>> >>>>>   hosts:
>>> >>>>>   - ceph01
>>> >>>>>   - ceph02
>>> >>>>>   - ceph03
>>> >>>>> spec:
>>> >>>>>   port: 2049
>>> >>>>>   virtual_ip: 192.168.168.114
>>> >>>>> ---snip---
>>> >>>>>
>>> >>>>> [0] https://docs.ceph.com/en/reef/cephadm/services/nfs/#nfs-with-virtual-ip-but-no-haproxy

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io