> Is there a way to set the lock check interval of
org.apache.activemq.artemis.core.server.impl.FileLockNodeManager?

No.

> It seems like it checks several times a second according to the debug
logs and the source code.

It's hard-coded to run every 2 seconds. See here [1].

Where do you see in the source code that it will check "several times a
second"?


Justin

[1]
https://github.com/apache/activemq-artemis/blob/bfb408f5905ad9ef5f0375d466894c276b09cca5/artemis-server/src/main/java/org/apache/activemq/artemis/core/server/impl/FileLockNodeManager.java#L60

On Thu, May 1, 2025 at 2:26 PM William Crowell
<wcrow...@perforce.com.invalid> wrote:

> Justin,
>
> AWS is using NFSv4.2 (and also tested 4.1).  The mount options I am using
> on AWS which work fine:
>
> 1.1.1.2:/data on /data type nfs4
> (rw,relatime,sync,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,acregmin=0,acregmax=0,acdirmin=0,acdirmax=0,soft,noac,proto=tcp,timeo=50,retrans=3,sec=sys,clientaddr=1.1.1.3,lookupcache=none,local_lock=none,addr=1.1.1.2)
>
> The NFS mount options on the environment where I am having the problem:
>
>
> vers=4.1,defaults,lazytime,noatime,nodiratime,rsize=1048576,wsize=1048576,sync,intr,noac
>
> Is there a way to set the lock check interval of
> org.apache.activemq.artemis.core.server.impl.FileLockNodeManager?  It seems
> like it checks several times a second according to the debug logs and the
> source code.
>
> This is some configuration issue, a bug with NFS, or a network-related
> issue, as you stated.
>
> Regards,
>
> William Crowell
>
> From: Justin Bertram <jbert...@apache.org>
> Date: Thursday, May 1, 2025 at 2:27 PM
> To: users@activemq.apache.org <users@activemq.apache.org>
> Subject: Re: Apache Artemis 2.40.0: Strange File Locking Behavior On NFSv4
> > Everything works fine on AWS, and I cannot reproduce the issue.
>
> This part is clear as you explained it previously. What isn't clear is
> whether AWS is using NFS and if so what mount options it is using and how
> those mount options differ from what you're using in the environment with
> the problem.
>
> It's also not clear exactly what mount options you've tried in the
> environment with the problem.
>
> > The issue is that the backup is taking away the lock from the primary
> when it is not supposed to.  It’s like the backup host gets impatient and
> just grabs the lock.
>
> I'm not sure this is an accurate way to describe what's happening. Here's
> what's happening behind the scenes with the broker...
>
> The primary and the backup rely on an exclusive file lock from the
> underlying filesystem implementation to prevent running at the same time.
> When the primary activates it acquires this lock and when the backup
> activates it attempts to acquire it as well. Since the primary already has
> it the backup can't acquire it. The backup then pauses activation and polls
> the lock attempting to acquire it. Once it acquires the lock it completes
> its activation process which involves updating the timestamp of the lock
> file. Meanwhile the primary, which originally acquired the lock, monitors
> the integrity of the lock to ensure it hasn't lost it. When the backup
> acquires the lock and updates the timestamp of the lock file, the primary
> notices, realizes that it has lost the lock, and shuts itself down to
> avoid split-brain.
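The handoff described above can be sketched with flock(1) on a local file. This is only an illustration of the locking semantics, not the broker's actual implementation (as the logs show, Artemis uses Java NIO file locks, e.g. sun.nio.ch.FileLockImpl, and additionally checks a state byte and the lock file's timestamp):

```shell
# Simulate the primary/backup handoff with flock(1) on a local file.
LOCKFILE=$(mktemp)

# "Primary" activates: take the exclusive lock and hold it briefly.
flock -x "$LOCKFILE" -c 'sleep 3' &
sleep 1   # give the primary time to acquire the lock

# "Backup" polls: a non-blocking attempt (-n) fails while the primary
# holds the lock (flock exits with status 1).
flock -n "$LOCKFILE" -c 'true' || echo "backup denied"

# A blocking attempt succeeds only once the primary releases the lock.
flock -x "$LOCKFILE" -c 'echo "backup acquired lock"'

wait
rm -f "$LOCKFILE"
```

On a healthy NFSv4 mount the same semantics should hold across two hosts, which is essentially what the manual flock test quoted later in this thread verifies.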
>
> The issue here isn't that the backup got impatient and stole the lock from
> the primary. The issue is that the filesystem allowed the backup to
> acquire the lock, and the reason for that is not clear. Perhaps there was
> some kind of network failure between the primary and the NFS mount that
> caused the lock to fail. Perhaps there's an NFS misconfiguration or even an
> NFS bug that caused it. Perhaps the issue is in a virtualization layer
> (assuming that applies). Perhaps it's because all the relevant clocks
> aren't synchronized.
>
> > Sent you the logs directly.
>
> Thanks for that. I reviewed the logs and the broker appears to be working
> as designed. There's no smoking gun. For some reason the filesystem simply
> allows the backup to acquire the lock after a few minutes of denying it.
>
> At this point I don't see any evidence that the broker could be configured
> any differently to prevent this issue.
>
>
> Justin
>
> On Thu, May 1, 2025 at 12:04 PM William Crowell
> <wcrow...@perforce.com.invalid> wrote:
>
> > Justin,
> >
> > Everything works fine on AWS, and I cannot reproduce the issue.  I am
> > working with someone who has on-premise hosts.
> >
> > The issue is that the backup is taking away the lock from the primary
> > when it is not supposed to.  It’s like the backup host gets impatient
> > and just grabs the lock.  Sent you the logs directly.
> >
> > All options are valid in that link except for intr, which was removed.
> > I am verifying the Linux kernel build version, but it should be very
> > recent.
> >
> > Regards,
> >
> > William Crowell
> >
> > From: Justin Bertram <jbert...@apache.org>
> > Date: Thursday, May 1, 2025 at 12:57 PM
> > To: users@activemq.apache.org <users@activemq.apache.org>
> > Subject: Re: Apache Artemis 2.40.0: Strange File Locking Behavior On
> NFSv4
> > > Those option listed in the link regarding NFS that will work is sync.
> >
> > I'm not following you here. Which exact options are you referring to?
> >
> > > intr was removed and ignored in later versions of the Linux kernel.
> >
> > That's correct. It remains in the recommended settings because we don't
> > know what version of the Linux kernel folks are using.
> >
> > > I did try mounting on the 2 brokers using these options (on
> > AWS)...I was not able to reproduce the issue on AWS.
> >
> > I'm lost here. Can you clarify exactly what is and isn't working on AWS
> > and in your problematic environment?
> >
> >
> > Justin
> >
> > On Thu, May 1, 2025 at 11:36 AM William Crowell
> > <wcrow...@perforce.com.invalid> wrote:
> >
> > > Justin,
> > >
> > > Yes, already ahead of you on this.
> > >
> > > We are also checking if the clocks are in sync between the brokers
> > > and the NFS share.  I know you can have some drift in clock between
> > > the servers, but it cannot be very large.
> > >
> > > Those option listed in the link regarding NFS that will work is sync.
> > > intr was removed and ignored in later versions of the Linux kernel.
> > >
> > > I did try mounting on the 2 brokers using these options (on AWS):
> > >
> > >
> > > sudo mount -o vers=4.1,soft,sync,intr,noac,lookupcache=none,timeo=50,retrans=3 \
> > >   -t nfs 1.1.1.4:/data /data
> > >
> > > I was not able to reproduce the issue on AWS.
> > >
> > > If you want the full logs, then I can upload those to gist or
> > > something.  Working on this.
> > >
> > > This is what I am seeing on the other system having the problem.
> > > Here is the primary:
> > >
> > > …
> > > 2025-05-01 15:07:15,441 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock
> > > appears to be valid; double check by reading status
> > > 2025-05-01 15:07:15,441 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> > getting
> > > state...
> > > 2025-05-01 15:07:15,441 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> trying
> > > to lock position: 0
> > > 2025-05-01 15:07:15,441 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> locked
> > > position: 0
> > > 2025-05-01 15:07:15,441 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> lock:
> > > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]
> > > 2025-05-01 15:07:15,442 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> > state: L
> > > (THIS MEANS ACTIVE)
> > > 2025-05-01 15:07:15,442 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock
> > > appears to be valid; triple check by comparing timestamp
> > > 2025-05-01 15:07:15,444 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lock
> > > file /data/server.lock originally locked at
> 2025-05-01T15:06:16.787+0000
> > > was modified at 2025-05-01T15:07:05.184+0000
> > > 2025-05-01 15:07:15,445 WARN
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager] Lost
> > the
> > > lock according to the monitor, notifying listeners
> > > 2025-05-01 15:07:15,445 ERROR [org.apache.activemq.artemis.core.server]
> > > AMQ222010: Critical IO Error, shutting down the server. file=Lost
> > > NodeManager lock, message=NULL
> > > java.io.IOException: lost lock
> > > …
> > >
> > > On the backup it tries once a second to get the lock and can’t, which
> > > is good, but something finally happens at 3:07:05 PM, and it mistakenly
> > > thinks it can get the lock:
> > >
> > > …
> > > 2025-05-01 15:07:05,183 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> locked
> > > position: 0
> > > 2025-05-01 15:07:05,183 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> lock:
> > > sun.nio.ch.FileLockImpl[0:9223372036854775807 exclusive valid]
> > > 2025-05-01 15:07:05,183 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> > state: L
> > > (THIS MEANS ACTIVE)
> > > 2025-05-01 15:07:05,184 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> > acquired
> > > primary node lock state = L
> > > 2025-05-01 15:07:05,186 DEBUG
> > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> > touched
> > > /data/server.lock; new time: 1746112025184
> > > …
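The epoch-millisecond value in the backup's "touched" line above corresponds to the modification time the primary later reports for the lock file; GNU date can confirm the conversion:

```shell
# Convert the broker's epoch-millisecond timestamp (1746112025184) to an
# ISO-8601 UTC time. The fractional "@" epoch and %N are GNU extensions.
date -u -d @1746112025.184 +'%Y-%m-%dT%H:%M:%S.%3N%z'
# → 2025-05-01T15:07:05.184+0000
```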
> > >
> > > Regards,
> > >
> > > William Crowell
> > >
> > > From: Justin Bertram <jbert...@apache.org>
> > > Date: Thursday, May 1, 2025 at 11:44 AM
> > > To: users@activemq.apache.org <users@activemq.apache.org>
> > > Subject: Re: Apache Artemis 2.40.0: Strange File Locking Behavior On
> > NFSv4
> > > Would you mind turning on DEBUG logging for
> > > org.apache.activemq.artemis.core.server.impl.FileLockNodeManager [1],
> > > reproducing, and uploading the full logs someplace accessible?
> > >
> > > Regarding your NFS mount options...I was perusing the NFS man page [2]
> > and
> > > saw this note about noatime & nodiratime:
> > >
> > >     In particular, the atime/noatime, diratime/nodiratime,
> > > relatime/norelatime, and strictatime/nostrictatime mount options have
> no
> > > effect on NFS mounts.
> > >
> > > I could not find any reference to lazytime, and I also noticed that you
> > > weren't using all the recommendations from the ActiveMQ Artemis
> > > documentation. I wonder if you might try something like this:
> > >
> > >    vers=4.1,soft,sync,intr,noac,lookupcache=none,timeo=50,retrans=3
> > >
> > > Is NFS being used in your AWS use-case? If so, what mount options are
> > being
> > > used?
> > >
> > > To be clear, I'm not an NFS expert by any means. Usually this stuff
> just
> > > works with the recommended settings.
> > >
> > >
> > > Justin
> > >
> > > [1]
> > > https://activemq.apache.org/components/artemis/documentation/latest/logging.html#configuring-a-specific-level-for-a-logger
> > >
> > > [2]
> > > https://www.man7.org/linux/man-pages/man5/nfs.5.html
> > >
> > > On Thu, May 1, 2025 at 9:02 AM William Crowell
> > > <wcrow...@perforce.com.invalid> wrote:
> > >
> > > > Configuration and logs…
> > > >
> > > > The relevant broker.xml configuration from the primary (broker 1) is:
> > > >
> > > > …
> > > >       <connectors>
> > > >          <connector name="broker1">tcp://1.1.1.2:61616</connector>
> > > >          <connector name="broker2">tcp://1.1.1.3:61616</connector>
> > > >       </connectors>
> > > >
> > > >       <cluster-connections>
> > > >          <cluster-connection name="my-cluster">
> > > >             <connector-ref>broker1</connector-ref>
> > > >             <static-connectors>
> > > >                <connector-ref>broker2</connector-ref>
> > > >             </static-connectors>
> > > >          </cluster-connection>
> > > >       </cluster-connections>
> > > >
> > > >       <ha-policy>
> > > >          <shared-store>
> > > >             <primary>
> > > >                <failover-on-shutdown>true</failover-on-shutdown>
> > > >             </primary>
> > > >          </shared-store>
> > > >       </ha-policy>
> > > > …
> > > >
> > > > The relevant broker.xml configuration from the backup (broker 2) is:
> > > >
> > > > …
> > > >       <connectors>
> > > >          <connector name="broker1">tcp://1.1.1.2:61616</connector>
> > > >          <connector name="broker2">tcp://1.1.1.3:61616</connector>
> > > >       </connectors>
> > > >
> > > >       <cluster-connections>
> > > >          <cluster-connection name="my-cluster">
> > > >             <connector-ref>broker2</connector-ref>
> > > >             <static-connectors>
> > > >                <connector-ref>broker1</connector-ref>
> > > >             </static-connectors>
> > > >          </cluster-connection>
> > > >       </cluster-connections>
> > > >
> > > >       <ha-policy>
> > > >          <shared-store>
> > > >             <backup>
> > > >                <allow-failback>false</allow-failback>
> > > >             </backup>
> > > >          </shared-store>
> > > >       </ha-policy>
> > > > …
> > > >
> > > > Startup on the primary:
> > > >
> > > > …
> > > > 2025-05-01 12:51:23,567 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221006: Waiting to obtain primary lock
> > > > …
> > > > 2025-05-01 12:51:23,747 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221034: Waiting indefinitely to obtain primary lock
> > > > 2025-05-01 12:51:23,748 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221035: Primary Server Obtained primary lock
> > > > …
> > > > 2025-05-01 12:51:24,289 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221007: Server is now active
> > > > …
> > > >
> > > > Backup is started and somehow becomes primary:
> > > >
> > > > …
> > > > 2025-05-01 12:51:48,473 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221032: Waiting to become backup node
> > > > 2025-05-01 12:51:48,474 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221033: ** got backup lock
> > > > …
> > > > 2025-05-01 12:51:48,659 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221109: Apache ActiveMQ Artemis Backup Server version 2.40.0
> > > > [339308e1-25f3-11f0-996a-0200ec1b9c8e] started; waiting for primary
> to
> > > fail
> > > > before activating
> > > > 2025-05-01 12:51:48,809 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221031: backup announced
> > > > …
> > > > 2025-05-01 12:52:06,129 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221010: Backup Server is now active
> > > > …
> > > > 2025-05-01 12:52:07,158 INFO
> [org.apache.activemq.artemis.core.client]
> > > > AMQ214036: Connection closure to 1.1.1.2/1.1.1.2:61616 has been
> > > detected:
> > > > AMQ219015: The connection was disconnected because of server shutdown
> > > > [code=DISCONNECTED]
> > > > …
> > > >
> > > > Primary loses the lock unexpectedly and shuts down:
> > > >
> > > > …
> > > > 2025-05-01 12:52:16,352 WARN
> > > > [org.apache.activemq.artemis.core.server.impl.FileLockNodeManager]
> Lost
> > > the
> > > > lock according to the monitor, notifying listeners
> > > > 2025-05-01 12:52:16,353 ERROR
> [org.apache.activemq.artemis.core.server]
> > > > AMQ222010: Critical IO Error, shutting down the server. file=Lost
> > > > NodeManager lock, message=NULL
> > > > java.io.IOException: lost lock
> > > >         at
> > > >
> > >
> >
> org.apache.activemq.artemis.core.server.impl.SharedStorePrimaryActivation.lambda$registerActiveLockListener$0(SharedStorePrimaryActivation.java:124)
> > > > ~[artemis-server-2.40.0.jar:2.40.0]
> > > >         at
> > > >
> > >
> >
> org.apache.activemq.artemis.core.server.NodeManager.lambda$notifyLostLock$0(NodeManager.java:167)
> > > > ~[artemis-server-2.40.0.jar:2.40.0]
> > > > …
> > > > 2025-05-01 12:52:16,528 INFO
> [org.apache.activemq.artemis.core.server]
> > > > AMQ221002: Apache ActiveMQ Artemis Message Broker version 2.40.0
> > > > [339308e1-25f3-11f0-996a-0200ec1b9c8e] stopped, uptime 52.978 seconds
> > > > …
> > > >
> > > > Regards,
> > > >
> > > > William Crowell
> > > >
> > > > From: William Crowell <wcrow...@perforce.com.INVALID>
> > > > Date: Thursday, May 1, 2025 at 9:20 AM
> > > > To: users@activemq.apache.org <users@activemq.apache.org>
> > > > Subject: Apache Artemis 2.40.0: Strange File Locking Behavior On
> NFSv4
> > > > Good morning,
> > > >
> > > > Disclaimer: This is not a bug, but a configuration issue.
> > > >
> > > > We are using Apache Artemis 2.40.0 on Rocky Linux 9.  We are
> > > > configuring a primary/backup pair on separate hosts and putting the
> > > > data directory on an NFSv4 mount, and we are experiencing problems
> > > > with the locking mechanism.  I do know that NFS is not recommended
> > > > for production use, but that is what we are limited to.
> > > >
> > > > We are following this documentation:
> > > >
> > > > https://activemq.apache.org/components/artemis/documentation/latest/ha.html#nfs-mount-recommendations
> > > >
> > > > What is happening is that the primary loses the lock and goes down
> > > > after the backup node was started.  The mount options we are using
> > > > on both brokers are:
> > > >
> > > > vers=4.1,defaults,lazytime,noatime,nodiratime,rsize=1048576,wsize=1048576,sync,intr,noac
> > > >
> > > > We then tried to start up the nodes sequentially.  The primary lost
> > > > the lock and went down shortly after the backup node was started.
> > > >
> > > > We also tested file locking on brokers 1 and 2:
> > > >
> > > > Broker 1:
> > > >
> > > > $ date; flock -x /data/test.lock  -c "sleep 30"; echo $?; date
> > > > Thu May  1 12:42:52 PM GMT 2025
> > > > 0
> > > > Thu May  1 12:43:22 PM GMT 2025
> > > >
> > > > Broker 2:
> > > >
> > > > $ date; flock -n /data/test.lock -c "echo lock acquired"; echo $?; date
> > > > Thu May  1 12:42:46 PM GMT 2025
> > > > 1
> > > > Thu May  1 12:42:47 PM GMT 2025
> > > >
> > > > This means that broker 2 was unable to acquire the lock because
> > > > broker 1 already had it, which is not consistent with the behavior
> > > > on the Apache Artemis brokers.  I also tested this on AWS, and the
> > > > failover works fine as expected.
> > > >
> > > > What am I missing here?
> > > >
> > > > Regards,
> > > >
> > > > William Crowell
> > > >
> > > >
> > > >
> > > > This e-mail may contain information that is privileged or
> > > > confidential.  If you are not the intended recipient, please delete
> > > > the e-mail and any attachments and notify us immediately.
> > > >
> > > >
> > > >
> > >
> > >
> > > CAUTION: This email originated from outside of the organization. Do
> > > not click on links or open attachments unless you recognize the sender
> > > and know the content is safe.
> > >
> > >
> > >
> > >
> >
> >
> >
> >
> >
> >
>
>
>
>
>
>
