Couple of questions:

- Do you have any consumers on the DLQ?
- Are messages being sent to the DLQ by the broker automatically (e.g. based on delivery attempt failures), or is that being done by your application?
- How are you setting the expiry delay (e.g. via a broker.xml address-setting like the one sketched below)?
- Do you have a reproducible test-case?
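For reference, the kind of broker.xml setting I have in mind for the expiry-delay question is sketched below. It is only an illustration (the match and the values are taken from the configuration quoted further down in this thread): a non-negative expiry-delay makes messages that arrive on the matching address without an expiration of their own expire after that many milliseconds, and an empty expiry-address means expired messages are dropped rather than routed to another address.

   <address-setting match="DLQ">
      <!-- 1000 * 60 * 60 -> expire messages after 1 hour -->
      <expiry-delay>3600000</expiry-delay>
      <!-- empty expiry-address: expired messages are dropped, not re-routed -->
      <expiry-address />
   </address-setting>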
Justin

On Fri, Feb 23, 2018 at 4:38 AM, Ilkka Virolainen <ilkka.virolai...@bitwise.fi> wrote:

> I'm still facing an issue with somewhat confusing behavior regarding message expiration in the DLQ, maybe related to the memory issues I've been having. My aim is to have messages routed to the DLQ expire and be dropped after one hour. To achieve this, I've set an empty expiry-address and the appropriate expiry-delay. The problem is that most of the messages routed to the DLQ end up in an in-delivery state - they are not expiring and I cannot remove them via JMX. The message count in the DLQ is slightly higher than the delivering count, and attempting to remove all messages only removes a number of messages equal to the difference between the message count and the delivering count (approximately a few thousand), while the message count itself is in the tens of thousands and increasing as message delivery failures occur.
>
> What could be the reason for this behavior and how could it be avoided?
>
> -----Original Message-----
> From: Ilkka Virolainen [mailto:ilkka.virolai...@bitwise.fi]
> Sent: 22 February 2018 13:38
> To: users@activemq.apache.org
> Subject: RE: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution
>
> To answer my own question in case anyone else is wondering about a similar issue: it turns out the change in addressing is covered by ticket [1], and adding the multicastPrefix and anycastPrefix described in the ticket to my broker acceptors seems to have fixed my problem. If the issue regarding memory leaks persists I will try to provide a reproducible test case.
>
> Thank you for your help, Justin.
>
> Best regards,
> - Ilkka
>
> [1] https://issues.apache.org/jira/browse/ARTEMIS-1644
>
>
> -----Original Message-----
> From: Ilkka Virolainen [mailto:ilkka.virolai...@bitwise.fi]
> Sent: 22 February 2018 12:33
> To: users@activemq.apache.org
> Subject: RE: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution
>
> Having removed the address configuration and switched from 2.4.0 to yesterday's snapshot of 2.5.0, it seems that the redistribution of messages is now working, but there also seems to have been a change in addressing between the versions, causing another problem related to jms.queue / jms.topic prefixing. While the NMS clients listen and the Artemis JMS clients send to the same topics as described in the previous message, Artemis 2.5.0 prefixes the addresses with jms.topic. The messages are being sent to e.g. A.B.f64dd592-a8fb-442e-826d-927834d566f4.C.D, but they are only received if I explicitly prefix the listening address with jms.topic, for example topic://jms.topic.A.B.*.C.D. Can this somehow be avoided in the broker configuration?
>
> Best regards
>
> -----Original Message-----
> From: Justin Bertram [mailto:jbert...@apache.org]
> Sent: 21 February 2018 15:19
> To: users@activemq.apache.org
> Subject: Re: Artemis 2.4.0 - Issues with memory leaks and JMS message redistribution
>
> Your first issue is probably a misconfiguration. Your cluster-connection is using an "address" value of '*', which I assume is supposed to mean "all addresses," but the "address" element doesn't support wildcards like this. Just leave it empty to match all addresses (see the sketch below). See the documentation [1] for more details.
>
> Even after you fix that configuration issue you may run into issues. These may be fixed already via ARTEMIS-1523 and/or ARTEMIS-1680.
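>
> For illustration, a cluster-connection roughly like the one below is what I mean. It is only a sketch based on the configuration you posted; the one change is that the "address" element is left empty so the cluster connection matches all addresses:
>
>    <cluster-connection name="cluster-name">
>       <!-- left empty on purpose: matches all addresses -->
>       <address></address>
>       <connector-ref>netty-connector</connector-ref>
>       <retry-interval>500</retry-interval>
>       <reconnect-attempts>5</reconnect-attempts>
>       <use-duplicate-detection>true</use-duplicate-detection>
>       <message-load-balancing>ON_DEMAND</message-load-balancing>
>       <max-hops>1</max-hops>
>       <static-connectors>
>          <connector-ref>broker-b-connector</connector-ref>
>       </static-connectors>
>    </cluster-connection>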
> If you have a reproducible test-case then you can verify using the head of the master branch.
>
> For the memory issue it would be helpful to have some heap dumps or something similar to see what's actually consuming the memory. Better yet would be a reproducible test-case. Do you have either?
>
>
> Justin
>
> [1] https://activemq.apache.org/artemis/docs/latest/clusters.html
>
>
>
> On Wed, Feb 21, 2018 at 5:39 AM, Ilkka Virolainen <ilkka.virolai...@bitwise.fi> wrote:
>
> > Hello,
> >
> > I am using Artemis 2.4.0 to broker messages through JMS queues/topics between a set of clients. Some are Apache NMS 1.7.2 ActiveMQ clients and others are using the Artemis JMS client 1.5.4 included in Spring Boot 1.5.3. The broker topology is a symmetric cluster of two live nodes with static connectors, both nodes having been set up as replicating colocated backup pairs with scale-down. I have two quite frustrating issues at the moment: message redistribution not working correctly, and a memory leak causing eventual thread death.
> >
> > ISSUE #1 - Message redistribution / load balancing not working:
> >
> > Client 1 (NMS) connects to broker a and starts listening; Artemis creates the following address:
> >
> > (Broker a):
> > A.B.*.C.D
> >   |-queues
> >     |-multicast
> >       |-f64dd592-a8fb-442e-826d-927834d566f4
> >
> > Server 1 (artemis-jms-client) connects to broker b and sends a message to topic A.B.f64dd592-a8fb-442e-826d-927834d566f4.C.D - this should be routed to broker a, since the corresponding queue has no consumers on broker b (the queue does not exist there). This however does not happen and the client receives no messages. Broker b has some other clients connected, which have caused similar (but not identical) queues to be created:
> >
> > (Broker b):
> > A.B.*.C.D
> >   |-queues
> >     |-multicast
> >       |-1eb48079-7fd8-40e9-b822-bcc25695ced0
> >       |-9f295257-c352-4ae6-b74b-d5994f330485
> >
> >
> > ISSUE #2 - Memory leak and eventual thread death:
> >
> > The Artemis broker has 4GB of allocated heap space and global-max-size is set to half of that (the default). The address-full-policy is set to PAGE for all addresses, and some individual addresses have small max-size-bytes values set, e.g. 104857600. As far as I know the paging settings should limit memory usage, but what happens is that at times Artemis uses the whole heap space, encounters an out-of-memory error and dies:
> >
> > 05:39:29,510 WARN [org.eclipse.jetty.util.thread.QueuedThreadPool] : java.lang.OutOfMemoryError: Java heap space
> > 05:39:16,646 WARN [io.netty.channel.ChannelInitializer] Failed to initialize a channel. Closing: [id: ...]: java.lang.OutOfMemoryError: Java heap space
> > 05:41:05,597 WARN [org.eclipse.jetty.util.thread.QueuedThreadPool] Unexpected thread death: org.eclipse.jetty.util.thread.QueuedThreadPool$2@5ffaba31 in qtp20111564{STARTED,8<=8<=200,i=2,q=0}
> >
> > Are these known issues in Artemis or misconfigurations in the brokers?
> >
> > The broker configurations are as follows. Broker b has an identical configuration, except that the cluster connection's connector-ref and the static-connector connector-ref refer to broker b and broker a respectively.
> >
> > Best regards,
> >
> > broker.xml (broker a):
> >
> > <?xml version='1.0'?>
> > <configuration xmlns="urn:activemq" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:activemq /schema/artemis-configuration.xsd">
> >    <core xmlns="urn:activemq:core" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="urn:activemq:core ">
> >       <name>[broker-a-ip]</name>
> >       <persistence-enabled>true</persistence-enabled>
> >
> >       <journal-type>NIO</journal-type>
> >
> >       <paging-directory>...</paging-directory>
> >       <bindings-directory>...</bindings-directory>
> >       <journal-directory>...</journal-directory>
> >       <large-messages-directory>...</large-messages-directory>
> >
> >       <journal-datasync>true</journal-datasync>
> >       <journal-min-files>2</journal-min-files>
> >       <journal-pool-files>-1</journal-pool-files>
> >       <journal-buffer-timeout>788000</journal-buffer-timeout>
> >       <disk-scan-period>5000</disk-scan-period>
> >
> >       <max-disk-usage>97</max-disk-usage>
> >
> >       <critical-analyzer>true</critical-analyzer>
> >       <critical-analyzer-timeout>120000</critical-analyzer-timeout>
> >       <critical-analyzer-check-period>60000</critical-analyzer-check-period>
> >       <critical-analyzer-policy>HALT</critical-analyzer-policy>
> >
> >       <acceptors>
> >          <acceptor name="invm-acceptor">vm://0</acceptor>
> >          <acceptor name="artemis">tcp://0.0.0.0:61616</acceptor>
> >          <acceptor name="ssl">tcp://0.0.0.0:61617?sslEnabled=true;keyStorePath=...;keyStorePassword=...</acceptor>
> >       </acceptors>
> >       <connectors>
> >          <connector name="invm-connector">vm://0</connector>
> >          <connector name="netty-connector">tcp://[broker-a-ip]:61616</connector>
> >          <connector name="broker-b-connector">[broker-b-ip]:61616</connector>
> >       </connectors>
> >
> >       <cluster-connections>
> >          <cluster-connection name="cluster-name">
> >             <address>*</address>
> >             <connector-ref>netty-connector</connector-ref>
> >             <retry-interval>500</retry-interval>
> >             <reconnect-attempts>5</reconnect-attempts>
> >             <use-duplicate-detection>true</use-duplicate-detection>
> >             <message-load-balancing>ON_DEMAND</message-load-balancing>
> >             <max-hops>1</max-hops>
> >             <static-connectors>
> >                <connector-ref>broker-b-connector</connector-ref>
> >             </static-connectors>
> >          </cluster-connection>
> >       </cluster-connections>
> >
> >       <ha-policy>
> >          <replication>
> >             <colocated>
> >                <backup-request-retry-interval>5000</backup-request-retry-interval>
> >                <max-backups>3</max-backups>
> >                <request-backup>true</request-backup>
> >                <backup-port-offset>100</backup-port-offset>
> >                <excludes>
> >                   <connector-ref>invm-connector</connector-ref>
> >                   <connector-ref>netty-connector</connector-ref>
> >                </excludes>
> >                <master>
> >                   <check-for-live-server>true</check-for-live-server>
> >                </master>
> >                <slave>
> >                   <restart-backup>false</restart-backup>
> >                   <scale-down />
> >                </slave>
> >             </colocated>
> >          </replication>
> >       </ha-policy>
> >
> >       <cluster-user>ARTEMIS.CLUSTER.ADMIN.USER</cluster-user>
> >       <cluster-password>[the shared cluster password]</cluster-password>
> >
> >       <security-settings>
> >          <security-setting match="#">
> >             <permission type="createDurableQueue" roles="amq, other-role" />
> >             <permission type="deleteDurableQueue" roles="amq, other-role" />
> >             <permission type="createNonDurableQueue" roles="amq, other-role" />
> >             <permission type="createAddress" roles="amq, other-role" />
> >             <permission type="deleteNonDurableQueue" roles="amq, other-role" />
type="deleteAddress" roles="amq, other-role" > /> > > <permission type="consume" roles="amq, other-role" /> > > <permission type="browse" roles="amq, other-role" /> > > <permission type="send" roles="amq, other-role" /> > > <permission type="manage" roles="amq" /> > > </security-setting> > > <security-setting match="A.some.queue"> > > <permission type="createNonDurableQueue" roles="amq, > > other-role" /> > > <permission type="deleteNonDurableQueue" roles="amq, > > other-role" /> > > <permission type="createDurableQueue" roles="amq, > > other-role" /> > > <permission type="deleteDurableQueue" roles="amq, > > other-role" /> > > <permission type="createAddress" roles="amq, other-role" > /> > > <permission type="deleteAddress" roles="amq, other-role" > /> > > <permission type="consume" roles="amq, other-role" /> > > <permission type="browse" roles="amq, other-role" /> > > <permission type="send" roles="amq, other-role" /> > > </security-setting> > > <security-setting match="A.some.other.queue"> > > <permission type="createNonDurableQueue" roles="amq, > > other-role" /> > > <permission type="deleteNonDurableQueue" roles="amq, > > other-role" /> > > <permission type="createDurableQueue" roles="amq, > > other-role" /> > > <permission type="deleteDurableQueue" roles="amq, > > other-role" /> > > <permission type="createAddress" roles="amq, other-role" > /> > > <permission type="deleteAddress" roles="amq, other-role" > /> > > <permission type="consume" roles="amq, other-role" /> > > <permission type="browse" roles="amq, other-role" /> > > <permission type="send" roles="amq, other-role" /> > > </security-setting> > > ... > > ... etc. > > ... > > </security-settings> > > > > <address-settings> > > <address-setting match="activemq.management#"> > > <dead-letter-address>DLQ</dead-letter-address> > > <expiry-address>ExpiryQueue</expiry-address> > > <redelivery-delay>0</redelivery-delay> > > <max-size-bytes>-1</max-size-bytes> > > > > <message-counter-history-day-limit>10</message-counter- > > history-day-limit> > > <address-full-policy>PAGE</address-full-policy> > > </address-setting> > > <!--default for catch all --> > > <address-setting match="#"> > > <dead-letter-address>DLQ</dead-letter-address> > > <expiry-address>ExpiryQueue</expiry-address> > > <redelivery-delay>0</redelivery-delay> > > <max-size-bytes>-1</max-size-bytes> > > > > <message-counter-history-day-limit>10</message-counter- > > history-day-limit> > > <address-full-policy>PAGE</address-full-policy> > > <redistribution-delay>1000</redistribution-delay> > > </address-setting> > > <address-setting match="DLQ"> > > <!-- 100 * 1024 * 1024 -> 100MB --> > > <max-size-bytes>104857600</max-size-bytes> > > <!-- 1000 * 60 * 60 -> 1h --> > > <expiry-delay>3600000</expiry-delay> > > <expiry-address /> > > </address-setting> > > <address-setting match="A.some.queue"> > > <redelivery-delay-multiplier>1.0</redelivery-delay- > > multiplier> > > <redelivery-delay>0</redelivery-delay> > > <max-redelivery-delay>10</max-redelivery-delay> > > </address-setting> > > <address-setting match="A.some.other.queue"> > > <redelivery-delay-multiplier>1.0</redelivery-delay- > > multiplier> > > <redelivery-delay>0</redelivery-delay> > > <max-redelivery-delay>10</max-redelivery-delay> > > <max-delivery-attempts>1</max-delivery-attempts> > > <max-size-bytes>104857600</max-size-bytes> > > </address-setting> > > ... > > ... etc. > > ... 
> >       </address-settings>
> >
> >       <addresses>
> >          <address name="DLQ">
> >             <anycast>
> >                <queue name="DLQ" />
> >             </anycast>
> >          </address>
> >          <address name="ExpiryQueue">
> >             <anycast>
> >                <queue name="ExpiryQueue" />
> >             </anycast>
> >          </address>
> >          <address name="A.some.queue">
> >             <anycast>
> >                <queue name="A.some.queue">
> >                   <durable>true</durable>
> >                </queue>
> >             </anycast>
> >          </address>
> >          <address name="A.some.other.queue">
> >             <anycast>
> >                <queue name="A.some.other.queue">
> >                   <durable>true</durable>
> >                </queue>
> >             </anycast>
> >          </address>
> >          ...
> >          ... etc.
> >          ...
> >       </addresses>
> >    </core>
> > </configuration>
> >