> I just tried it with 3 keepalive daemons and one nfs daemon, it
> doesn't really work because all three hosts have the virtual IP
> assigned, preventing my client from mounting. So this doesn't really
> work as a workaround, it seems.
That's a bit surprising. The keepalive daemons are meant to communicate with
each other and have only one of them maintain the VIP while the others remain
in a standby state. Do you see anything in the logs about them communicating
with each other? Also, in the keepalived conf for each keepalive daemon, do
you see one set with a higher "priority" than the others?

On Tue, Mar 25, 2025 at 11:18 AM Eugen Block <ebl...@nde.ag> wrote:

> Thanks, Adam.
> I just tried it with 3 keepalive daemons and one nfs daemon, it
> doesn't really work because all three hosts have the virtual IP
> assigned, preventing my client from mounting. So this doesn't really
> work as a workaround, it seems. I feel like the proper solution would
> be to include keepalive in the list of
> RESCHEDULE_FROM_OFFLINE_HOSTS_TYPES.
>
> Zitat von Adam King <adk...@redhat.com>:
>
> > Which daemons get moved around like that is controlled by
> > https://github.com/ceph/ceph/blob/main/src/pybind/mgr/cephadm/utils.py#L30,
> > which appears to only include nfs and haproxy, so maybe this
> > keepalive-only case was missed in that sense. I do think that you could
> > alter the placement of the ingress service to just match all the hosts
> > and it ought to work, though. The only reason we require specifying a
> > set of hosts with a count field lower than the number of hosts matching
> > the placement for nfs is that ganesha has no transparent state
> > migration. Keepalive, on the other hand, should work fine with "extra"
> > daemons deployed in order to be highly available.
> >
> > On Tue, Mar 25, 2025 at 10:06 AM Malte Stroem <malte.str...@gmail.com>
> > wrote:
> >
> >> Hi Eugen,
> >>
> >> yes, for me it's kind of a "test setting" for small setups.
> >>
> >> The doc says:
> >>
> >> Setting --ingress-mode keepalive-only deploys a simplified ingress
> >> service that provides a virtual IP with the nfs server directly binding
> >> to that virtual IP and leaves out any sort of load balancing or traffic
> >> redirection. This setup will restrict users to deploying only 1 nfs
> >> daemon as multiple cannot bind to the same port on the virtual IP.
> >>
> >> Best,
> >> Malte
> >>
> >> On 25.03.25 13:46, Eugen Block wrote:
> >> > Yeah, it seems to work without the "keepalive-only" flag, at least
> >> > from a first test. So keepalive-only is not working properly, it
> >> > seems? Should I create a tracker for that, or am I misunderstanding
> >> > its purpose?
> >> >
> >> > Zitat von Malte Stroem <malte.str...@gmail.com>:
> >> >
> >> >> Hi Eugen,
> >> >>
> >> >> try omitting
> >> >>
> >> >> --ingress-mode keepalive-only
> >> >>
> >> >> like this:
> >> >>
> >> >> ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03" --ingress --virtual_ip "192.168.168.114/24"
> >> >>
> >> >> Best,
> >> >> Malte
> >> >>
> >> >> On 25.03.25 13:25, Eugen Block wrote:
> >> >>> Thanks for your quick response. The specs I pasted are actually the
> >> >>> result of deploying an nfs cluster like this:
> >> >>>
> >> >>> ceph nfs cluster create ebl-nfs-cephfs "1 ceph01 ceph02 ceph03" --ingress --virtual_ip 192.168.168.114 --ingress-mode keepalive-only
> >> >>>
> >> >>> I can try redeploying it via the dashboard, but I don't have a lot
> >> >>> of confidence that it will work differently with a failover.
> >> >>>
> >> >>> Zitat von Malte Stroem <malte.str...@gmail.com>:
> >> >>>
> >> >>>> Hi Eugen,
> >> >>>>
> >> >>>> try deploying the NFS service like this:
> >> >>>>
> >> >>>> https://docs.ceph.com/en/latest/mgr/nfs/
> >> >>>>
> >> >>>> Some had only success deploying it via the dashboard.
> >> >>>>
> >> >>>> Best,
> >> >>>> Malte
> >> >>>>
> >> >>>> On 25.03.25 13:02, Eugen Block wrote:
> >> >>>>> Hi,
> >> >>>>>
> >> >>>>> I'm re-evaluating NFS again, testing on a virtual cluster with
> >> >>>>> 18.2.4. For now I don't need haproxy, so I use "keepalive_only:
> >> >>>>> true" as described in the docs [0]. I first create the ingress
> >> >>>>> service, wait for it to start, then create the nfs cluster. I've
> >> >>>>> added the specs at the bottom.
> >> >>>>>
> >> >>>>> I can mount the export with the virtual IP. Then I just shut down
> >> >>>>> the VM where the nfs service was running; the orchestrator
> >> >>>>> successfully starts an nfs daemon elsewhere, but the keepalive
> >> >>>>> daemon is not failed over. So mounting or accessing the export is
> >> >>>>> impossible, of course. And after I power up the offline host
> >> >>>>> again, nothing is "repaired": keepalive and nfs run on different
> >> >>>>> servers until I intervene manually. This doesn't seem to work as
> >> >>>>> expected. Is this a known issue (I couldn't find anything on the
> >> >>>>> tracker)? I have my doubts, but maybe it works better with
> >> >>>>> haproxy? Or am I missing something in my configuration?
> >> >>>>> I haven't tried with a newer release yet. I'd appreciate any
> >> >>>>> comments.
> >> >>>>>
> >> >>>>> Thanks,
> >> >>>>> Eugen
> >> >>>>>
> >> >>>>> ---snip---
> >> >>>>> service_type: ingress
> >> >>>>> service_id: nfs.ebl-nfs-cephfs
> >> >>>>> service_name: ingress.nfs.ebl-nfs-cephfs
> >> >>>>> placement:
> >> >>>>>   count: 1
> >> >>>>>   hosts:
> >> >>>>>   - ceph01
> >> >>>>>   - ceph02
> >> >>>>>   - ceph03
> >> >>>>> spec:
> >> >>>>>   backend_service: nfs.ebl-nfs-cephfs
> >> >>>>>   first_virtual_router_id: 50
> >> >>>>>   keepalive_only: true
> >> >>>>>   monitor_port: 9049
> >> >>>>>   virtual_ip: 192.168.168.114/24
> >> >>>>>
> >> >>>>> service_type: nfs
> >> >>>>> service_id: ebl-nfs-cephfs
> >> >>>>> service_name: nfs.ebl-nfs-cephfs
> >> >>>>> placement:
> >> >>>>>   count: 1
> >> >>>>>   hosts:
> >> >>>>>   - ceph01
> >> >>>>>   - ceph02
> >> >>>>>   - ceph03
> >> >>>>> spec:
> >> >>>>>   port: 2049
> >> >>>>>   virtual_ip: 192.168.168.114
> >> >>>>> ---snip---
> >> >>>>>
> >> >>>>> [0] https://docs.ceph.com/en/reef/cephadm/services/nfs/#nfs-with-virtual-ip-but-no-haproxy

_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
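
For reference, regarding Adam's question about the "priority" setting: the
keepalived.conf on each host contains a vrrp_instance section roughly along
the lines below. This is a simplified, hand-written sketch, not the exact
file cephadm generates; the interface name (eth0), the password, and the
per-host addresses 192.168.168.111-113 are made up for illustration, while
the virtual_router_id and virtual IP come from the spec quoted above. The
point is that exactly one instance should hold the highest priority, with
the peers at lower priorities in a backup role. If the daemons cannot
exchange VRRP advertisements (wrong peer addresses, blocked traffic, or all
instances ending up with the same state/priority), each node elects itself
master and all of them claim the VIP, which matches the symptom described
above.

---snip---
# sketch of one host's keepalived.conf section; the other two hosts should
# differ in state/priority and in unicast_src_ip/unicast_peer
vrrp_instance VI_0 {
  state MASTER              # BACKUP on the other two hosts
  priority 100              # e.g. 90 and 80 on the other two hosts
  interface eth0            # hypothetical interface name
  virtual_router_id 50      # first_virtual_router_id from the ingress spec
  advert_int 1
  authentication {
    auth_type PASS
    auth_pass examplepass   # cephadm generates its own password
  }
  unicast_src_ip 192.168.168.111   # this host (hypothetical address)
  unicast_peer {
    192.168.168.112                # the other keepalive hosts (hypothetical)
    192.168.168.113
  }
  virtual_ipaddress {
    192.168.168.114/24 dev eth0    # the virtual_ip from the spec
  }
}
---snip---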
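
Regarding Adam's suggestion to let the ingress placement simply match all
the hosts so that a keepalive daemon runs everywhere (with only one of them
holding the VIP at a time): the ingress spec from the original post could be
adjusted roughly as below, leaving the nfs service itself at count: 1. This
is an untested sketch derived from the specs quoted above, applied with
something like "ceph orch apply -i ingress.yaml".

---snip---
service_type: ingress
service_id: nfs.ebl-nfs-cephfs
placement:
  # no count field: a keepalive daemon is deployed on every listed host
  hosts:
  - ceph01
  - ceph02
  - ceph03
spec:
  backend_service: nfs.ebl-nfs-cephfs
  first_virtual_router_id: 50
  keepalive_only: true
  monitor_port: 9049
  virtual_ip: 192.168.168.114/24
---snip---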
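
And the change Eugen proposes would amount to adding the keepalive daemon
type to the constant in src/pybind/mgr/cephadm/utils.py linked above. Sketch
only; the exact daemon-type string and the current contents of the list
should be checked against the source.

---snip---
# src/pybind/mgr/cephadm/utils.py (sketch of the proposed change; per the
# thread, the list currently contains only the nfs and haproxy types)
RESCHEDULE_FROM_OFFLINE_HOSTS_TYPES = ['haproxy', 'nfs', 'keepalived']
---snip---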