Hello.

I am trying to set up a highly available active/active NFS server with two Debian boxes and pacemaker, corosync, OCFS2 and DRBD. The pacemaker cluster manages the DRBD dual-primary configuration, the DLM and O2CB daemons, and the mount/umount of the shared filesystems. The NFS server process is started at boot by the Debian init.d scripts with an empty /etc/exports file; the exports are added and removed by the recent exportfs RA. The active/active configuration is obtained by having each server export a different set of filesystems, so the NFS server processes are always active on both nodes.

The cluster correctly migrates the exports from one node to the other, but the NFS state is not kept in sync between the two nodes. This causes any already connected client to hang when the resource fails over. The client correctly resumes operation when the exportfs resource is migrated back to the original node.

Manually synchronizing the /var/lib/nfs/rmtab file on both servers seems to solve this problem: the client now works fine both when the exportfs resource is migrated to the other node and when it is moved back to the original one. It hangs for just a few seconds during the migration.
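To make the manual synchronization step concrete, here is a rough sketch of the kind of merge I mean. It is only an illustration, not what the exportfs RA does: the function name merge_rmtab is hypothetical, and I am assuming that each rmtab line has the form client:path:0xCOUNT with a hexadecimal use count. The result would still have to be written back to /var/lib/nfs/rmtab on both nodes by some other means.

```python
def merge_rmtab(local_text, remote_text):
    """Return the union of two rmtab contents as text.

    Assumption: each line looks like 'client:path:0xCOUNT'.
    When the same client/path pair appears on both nodes,
    the higher use count is kept.
    """
    merged = {}
    for line in (local_text + "\n" + remote_text).splitlines():
        line = line.strip()
        if not line:
            continue
        # Split off only the trailing count, so any ':' inside the
        # client or path part is left untouched.
        parts = line.rsplit(":", 1)
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed format
        key, count_text = parts
        try:
            count = int(count_text, 16)
        except ValueError:
            continue  # skip lines with an unparsable count
        merged[key] = max(merged.get(key, 0), count)

    # Emit entries in a stable order, in the same assumed format.
    lines = [f"{key}:0x{merged[key]:08x}" for key in sorted(merged)]
    return "\n".join(lines) + "\n" if lines else ""


if __name__ == "__main__":
    node_a = "10.0.0.5:/srv/a:0x00000001\n"
    node_b = "10.0.0.6:/srv/b:0x00000001\n"
    print(merge_rmtab(node_a, node_b), end="")
```

Merging rather than simply copying one file over the other is a deliberate choice here, since in an active/active setup each node has its own set of connected clients.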
I have tried putting the /var/lib/nfs directory on an OCFS2 filesystem shared by both nodes, but I had a lot of stability problems with the NFS server processes. In particular, they often seem to hang while starting or even while stopping. I think this could be because the daemons lock some files on the shared filesystem: since the files are kept locked by the daemons on one node, further lock operations from the other node may block indefinitely.

Then I tried to find a way to keep just the rmtab file synchronized on both nodes, but I cannot find a way to have pacemaker do this for me. Is there one?

Also, I have found that the exportfs RA originally had a mechanism to keep rmtab synchronized, but it was removed in this commit:

https://github.com/ClusterLabs/resource-agents/commit/0edb009a87f0d47b310998f2cb3809d2775e2de8

Is there another way to accomplish this active/active setup?

Thanks!
Alessandro
