Apache Artemis 2.40.0: Strange File Locking Behavior On NFSv4

William Crowell Thu, 01 May 2025 06:22:36 -0700

Good morning,

Disclaimer: This is not a bug, but a configuration issue.

We are using Apache Artemis 2.40.0 on Rocky Linux 9. We are configuring a
primary/backup pair on separate hosts with the data directory on an NFSv4
mount, and we are experiencing problems with the locking mechanism. I know
that NFS is not recommended for production use, but that is what we are
limited to.

We are following this documentation:
https://activemq.apache.org/components/artemis/documentation/latest/ha.html#nfs-mount-recommendations

What is happening is that the primary loses the lock and goes down after the
backup node is started. The mount options we are using on both brokers are:

vers=4.1,defaults,lazytime,noatime,nodiratime,rsize=1048576,wsize=1048576,sync,intr,noac
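
To rule out a mismatch between the options requested and the options actually
in effect, the list can be compared with what the kernel reports for the
mount. This is only a minimal sketch: the helper name has_opt is hypothetical,
/data is assumed to be the NFS mount point, and the three option names in the
loop are just examples taken from the list above. The kernel may report an
option in a different form than it was written, so a "missing" result is a
hint to look closer, not proof of a problem.

```shell
#!/bin/sh
# has_opt OPTIONS NAME (hypothetical helper): succeed if NAME appears in the
# comma-separated OPTIONS string, either bare (noac) or as key=value (vers=4.1).
has_opt() {
    case ",$1," in
        *",$2,"*|*",$2="*) return 0 ;;
        *) return 1 ;;
    esac
}

# Effective options as reported by the kernel. Assumes /data is the NFS mount;
# falls back to an empty string if findmnt is unavailable or the path is absent.
opts=$(findmnt -n -o OPTIONS --target /data 2>/dev/null) || opts=""

for o in vers sync noac; do
    if has_opt "$opts" "$o"; then
        echo "present: $o"
    else
        echo "missing: $o"
    fi
done
```

Running it on both brokers and comparing the output shows whether the two
hosts really mount the share the same way.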

We then tried to start up the nodes sequentially. The primary lost the lock
and went down shortly after the backup node was started.

We also tested file locking on brokers 1 and 2:

Broker 1:

$ date; flock -x /data/test.lock -c "sleep 30"; echo $?; date
Thu May 1 12:42:52 PM GMT 2025
0
Thu May 1 12:43:22 PM GMT 2025

Broker 2:

$ date; flock -n /data/test.lock -c "echo lock acquired"; echo $?; date
Thu May 1 12:42:46 PM GMT 2025
1
Thu May 1 12:42:47 PM GMT 2025

This means that broker 2 was unable to acquire the lock because broker 1
already held it, which is not consistent with the behavior we see on the
Apache Artemis brokers. I also tested this on AWS, and failover works as
expected there.
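
The two commands above were started by hand in separate shells on separate
hosts, so their timestamps are hard to line up. A small wrapper, sketched
below on the assumption that util-linux flock is installed on both brokers,
makes each side print the moment it holds or fails to get the lock. The
function names and the script name locktest.sh are hypothetical. It only
exercises flock(1) and makes no assumption about which locking call the broker
itself uses, so a pass here does not by itself prove that the broker's lock
behaves the same way on this mount.

```shell
#!/bin/sh
# Hypothetical helper for a cross-host lock test; not part of Artemis.

# hold_lock FILE SECONDS: take an exclusive lock and keep it for SECONDS.
hold_lock() {
    flock -x "$1" -c "echo \"held since \$(date -u +%T)\"; sleep $2"
}

# probe_lock FILE: try once without blocking and report the outcome.
# Returns 0 if the lock could be taken, 1 if it could not.
probe_lock() {
    if flock -n "$1" -c true; then
        echo "free at $(date -u +%T)"
    else
        echo "busy at $(date -u +%T)"
        return 1
    fi
}

# Dispatch on the first argument; does nothing when called without one.
case "${1:-}" in
    hold)  hold_lock "$2" "${3:-30}" ;;
    probe) probe_lock "$2" ;;
esac
```

On broker 1 this would be run as "./locktest.sh hold /data/test.lock 30", and
on broker 2, once broker 1 has printed its "held since" line, as
"./locktest.sh probe /data/test.lock".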

What am I missing here?

Regards,

William Crowell

