Hi all,

I am trying to set up an active-active NFS Ganesha cluster (two Ganesha
daemons (v3.0) running in Docker containers). I managed to get both daemons
running with the rados_cluster recovery backend for an active-active
deployment. The grace db lives in the cephfs metadata pool, in its own
namespace, and keeps track of the node status.
I can now mount the exposed filesystem over NFS (v4.1, v4.2) through both
daemons. So far so good.
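
For reference, I inspect the grace db with the ganesha-rados-grace tool,
roughly like this ("b" is just a stand-in for the second daemon's nodeid;
only node "a" appears in the config below):

------------
# dump the current epoch values and the per-node flags from the grace db
ganesha-rados-grace --pool cephfsmetadata --ns grace --userid ganesha dump

# with both daemons up, the output looks something like:
# cur=1 rec=0
# ======================================================
# a
# b
------------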

Testing high availability resulted in unexpected behavior, and I am not sure
whether it is intentional or a configuration problem.

Problem:
If both daemons are running, no E or N flags are set in the grace db, as I
expect. Once one host goes down (or is taken down), ALL clients can neither
read from nor write to the mounted filesystem, even those that are not
connected to the dead Ganesha. In the db I see that the dead Ganesha is in
state NE and the active one in state E. That matches what I expect from the
Ganesha documentation. Nevertheless, I would have assumed that the clients
connected to the active daemon are not blocked. The state also does not clean
itself up (e.g. after the grace period).
I can unblock the situation by 'lifting' the dead node with a direct db call
(using the ganesha-rados-grace tool, as shown below). But within an
active-active deployment this is not a suitable solution.
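
For completeness, the 'lift' mentioned above looks roughly like this (again
with "b" standing in for the dead node's id):

------------
# clear the dead node's need-grace flag and try to lift the cluster-wide
# grace period
ganesha-rados-grace --pool cephfsmetadata --ns grace --userid ganesha lift b
------------

After that, the remaining clients can read and write again.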

The Ganesha config looks like this:

------------
NFS_CORE_PARAM
{
        Enable_NLM = false;
        Protocols = 4;
}
NFSv4
{
        RecoveryBackend = rados_cluster;
        Minor_Versions =  1,2;
}
RADOS_KV
{
        # grace db location: cephfs metadata pool, own namespace
        pool = "cephfsmetadata";
        nodeid = "a";
        namespace = "grace";
        UserId = "ganesha";
        Ceph_Conf = "/etc/ceph/ceph.conf";
}
MDCACHE
{
        # size the caches down to a minimum; libcephfs does the caching
        Dir_Chunk = 0;
        NParts = 1;
        Cache_Size = 1;
}
EXPORT
{
        Export_ID=101;
        Protocols = 4;
        Transports = TCP;
        Path = PATH;
        Pseudo = PSEUDO_PATH;
        Access_Type = RW;
        Attr_Expiration_Time = 0;
        Squash = no_root_squash;

        FSAL {
                Name = CEPH;
                User_Id = "ganesha";
                Secret_Access_Key = CEPHXKEY;
        }
}
LOG {
        Default_Log_Level = "FULL_DEBUG";
}
------------
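
For completeness: the grace db was populated once before starting the
daemons, along these lines (nodeid "b" again assumed for the second daemon):

------------
# one-time setup: register both nodeids in the grace db so the daemons can
# take part in the cluster-wide grace period
ganesha-rados-grace --pool cephfsmetadata --ns grace --userid ganesha add a b
------------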

Does anyone have similar problems? Or, if this behavior is intentional, can
you explain to me why that is the case?
Thank you in advance for your time and thoughts. 

Kind regards,
Michael
