After looking at this a bit longer I believe I see what's happening. Section 3.1.4 of the MQTT 5 specification states:

> If the ClientID represents a Client already connected to the Server, the
> Server sends a DISCONNECT packet to the existing Client with Reason Code
> of 0x8E (Session taken over) as described in section 4.13 and MUST close
> the Network Connection of the existing Client.

There's a similar statement in the same section of the 3.1.1 specification.

In a cluster like the one you have configured, this "link stealing" works, and it is relatively straightforward to implement. The node receiving the connection simply sends a notification about the client ID to all the other cluster members, and if any of those nodes has a connection using that client ID then that connection gets closed.

However, you've configured the opposite behavior (i.e. allowLinkStealing=false), which means that instead of kicking off the existing client, the incoming client is denied. This is actually a much harder problem to solve in a cluster because it requires not only a notification about the incoming client but a *response* from all the other nodes in the cluster indicating whether or not a client with that same client ID is already connected. This kind of data exchange between nodes is at odds with the scalability that a cluster is designed to provide, and it can't/won't be solved in the same way.

The way to solve this problem is to use a connection router [1] to ensure that clients using the same ID always get connected to the same node in the cluster. Based on your description, you already see that when the clients connect to the same node, denying the incoming connection works as expected. Using a connection router just ensures that happens all the time.

I'll add a note to the documentation to make this more clear.


Justin

[1] https://activemq.apache.org/components/artemis/documentation/latest/connection-routers.html#connection-routers

On Mon, Dec 18, 2023 at 12:01 PM Justin Bertram <jbert...@apache.org> wrote:

> Can you work up a reproducer that doesn't involve that operator?
>
> For what it's worth, the terminology you used in your description seems
> fundamentally ambiguous. You talk about "HA mode", "replicas", etc. This
> terminology has a specific meaning in ActiveMQ Artemis and apparently a
> different meaning in Kubernetes. For example, in ActiveMQ Artemis HA is
> supported via active/passive broker pairs. However, from what I can tell,
> HA in Kubernetes is just multiple "pods" running the same configuration -
> something that would generally be referred to just as a "cluster" in
> ActiveMQ Artemis. Therefore, when you use these terms in describing your
> use-case it gets confusing about what the actual broker configuration is -
> which is mainly what we (on this list) care about.
>
>
> Justin
>
> On Mon, Dec 18, 2023 at 5:33 AM andrea bisogno <bisoma...@hotmail.it>
> wrote:
>
>> Hi team,
>>
>> I'm facing some unexpected MQTT stealing link issues with Artemis
>> deployed on Kubernetes in High Availability (i.e. with the broker pods
>> number >= 2).
>>
>> I've described the test scenario, and the corresponding unexpected
>> behavior, here:
>> https://github.com/artemiscloud/activemq-artemis-operator/discussions/756
>>
>> Can you help me with this?
>>
>> Many thanks in advance,
>>
>>
>> Andrea Bisogno
>>
>
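
As a footnote to the connection router suggestion in this thread, here is a minimal Python sketch of why routing by client ID makes a purely local duplicate check sufficient. It is only an illustration and not how ActiveMQ Artemis implements connection routers: `Node`, `route`, the broker names, and the hash-modulo scheme are all hypothetical.

```python
import hashlib


class Node:
    """Hypothetical broker node that only knows about its own connections."""

    def __init__(self, name):
        self.name = name
        self.connected = set()

    def connect(self, client_id):
        # Link stealing disabled: deny a client ID already connected locally.
        if client_id in self.connected:
            return False
        self.connected.add(client_id)
        return True


def route(client_id, nodes):
    """Pick a node deterministically from the client ID.

    Assumption: a plain hash-modulo mapping, used here only to show that
    the same ID always maps to the same node.
    """
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


nodes = [Node("broker-0"), Node("broker-1")]

# Without routing: the same client ID reaches two different nodes.
# Each node only checks its own connections, so both attempts succeed.
unrouted = [nodes[0].connect("sensor-1"), nodes[1].connect("sensor-1")]

# With routing: both attempts are sent to the same node, so the local
# check is enough to deny the second one.
for node in nodes:
    node.connected.clear()
routed = [route("sensor-1", nodes).connect("sensor-1") for _ in range(2)]

print(unrouted, routed)
```

In the unrouted case both connections are accepted, because neither node can see the other's clients. In the routed case the second connection is denied without any request/response exchange between nodes.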