Slow throughput after several hundred messages
Hi, I am using ActiveMQ 5.4.1. Our application runs on GlassFish application server 2.1.1, however I have reproduced the slow behaviour described below using a J2SE standalone client as well. Our application uses multiple stateless session beans to concurrently produce messages which are stored in persistent queues, and consumed using multiple stateless session beans (with separate connections and queue sessions). We have found this approach with ActiveMQ works fine and is performant up until several hundred messages are read/consumed from a single persistent queue. In this case, consumers/producers seem to be throttled regardless of broker/destination configuration, to the point where, after 24hours, the broker becomes totally unresponsive and requires a restart to begin processing once again. The problem seems to be with a single persistent queue mounting up many reads/writes over time as I have been able to reproduce this with a single producer and a single consumer thread both reading/writing to the same queue. I have been trying various different configurations to get around this: using the default store cursor, vm cursor and file based cursors; removing producer flow control for queue destinations; embedded vs remote brokers. None have yielded any change in behaviour. I also get a large number of "EOFException" logged from the server as shown below: INFO | Transport failed: java.io.EOFException java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:226) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:218) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:201) at java.lang.Thread.run(Thread.java:619) I saw some posts where this is described as a symptom of unexpected loss of connection, made more prevalent with use of the scheduler service. I have tried removing the scheduler service, but this has not helped either. Has anyone encountered this behaviour before or have any idea what I could try to solve it? Thanks in advance for any help you might be able to give. Regards, Rob. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3082431.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I have just downloaded version 5.4.2 and run the same standalone j2se test case I developed to highlight this issue against this version, but I am still experiencing the same behaviour as with 5.4.1. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085333.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I have attached a compressed rar file containing the broker debug trace: http://activemq.2283324.n4.nabble.com/file/n3085354/logs.rar logs.rar Due to attachment file size limits, I will attempt in separate posts to add the broker configuration and crude netbeans project I used to show this behaviour as well as some heap dumps of the jvm before, during and after running the test project. Again the only obvious exception being logged is this EOFException: 2010-12-13 12:03:07,203 | DEBUG | Transport failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///10.113.32.88:2428 java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:269) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:227) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:219) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:202) at java.lang.Thread.run(Thread.java:619) -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085354.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Here is the sample netbeans project that I have used to highlight the problem: http://activemq.2283324.n4.nabble.com/file/n3085358/StandaloneJMSWriteConsume.rar StandaloneJMSWriteConsume.rar It is very crude, simply run the ConcurrentJMSWriteConsumeJ2SETest class and it will continue to write and read (in an infinite loop) from a single queue on a local broker with a single consumer thread and a single producer thread. After a couple of minutes I find the throttling behaviour described in the orriginal post. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085358.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
The activemq heap dump during execution of the sample test case: http://activemq.2283324.n4.nabble.com/file/n3085361/activemq-heapdump-during-test-after-5-mins.rar activemq-heapdump-during-test-after-5-mins.rar -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085361.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Heapdump following killing the test application and performing GC: http://activemq.2283324.n4.nabble.com/file/n3085363/activemq-heapdump-after-test.rar activemq-heapdump-after-test.rar -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085363.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Also, here is the broker configuration I used: http://activemq.2283324.n4.nabble.com/file/n3085451/activemq.xml activemq.xml -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085451.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I have tried each of the different cursor types with 5.4.1, but since downloading the 5.4.2 version, I have not changed the cursor used, so this is the default configuration, where no cursor is defined for queues - this should result in the default store cursor being used. I have just tried with enableJournalDiskSyncs = "false", but this results in similar slow processing after a batch of messages have been processed. Although with this approach the consumer thread seemed to be more hindered than the producer, as messages were being produced onto the test queue around 3-4 times faster than they were being consumed. Once again I get the same EOFException thrown and message production/consumption seems to slow periodically, for greater periods the more messages I process. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3085521.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I have left the test application running over night, it is still consuming and producing messages but is blocking very often now (every 15-20 seconds for around 10 seconds each time). I used visualVM to analyse the garbage collection, which seems to be running around once every two minutes or so and taking around a second to complete. The JVM heap is around 60% of the max heap size. I have attached heap dumps from previous runs in earlier threads. I have taken a heap dump of the broker, still under load after 17hrs uptime, but the compressed file is too large to upload onto this forum. I checked the heapdump for LRUCache, but this has 9 live instances, unlike in the issue described in AMQ-3028. I will now bounce the server and turn on garbage collection logs as well. Regards, Rob. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3086779.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Here is the activemq logs from a test run where I allowed it to begin processing, block once and waited for it to resume processing, where I have turned on verbose garbage collection output: http://activemq.2283324.n4.nabble.com/file/n3086805/logs_with_verbose_gc.rar logs_with_verbose_gc.rar -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3086805.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
In order to try to work out just what is happening during these blocking periods, I have used visual vm to perform cpu profiling, take a heapdump and a thread dump when the broker becomes unresponsive. Here are the results: http://activemq.2283324.n4.nabble.com/file/n3086836/visual_vm_analysis_during_unresponsive_phase.rar visual_vm_analysis_during_unresponsive_phase.rar -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3086836.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Here is the gc.log and activemq logs, once again taken from a run where I allowed it to process, block and then continue processing: http://activemq.2283324.n4.nabble.com/file/n3087092/logs_verbose_gc.rar logs_verbose_gc.rar -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3087092.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I believe the cause of the EOFException and the blocking behaviour and slowdown is caused by creating and closing connections for each read/write. In the standalone j2se application I attached in an earlier post, each thread creates a new connection for each request before closing down the resources once it has written to or consumed from the queue. This produces lots of EOFExceptions and the blocking/slowdown behaviour I outlined earlier. If I modify the class to ensure that a single connection is used for all requests from the producer thread and a single connection is shared for all the requests from the consumer thread, this stops the slowdown and the EOFExceptions are no longer thrown. Unfortunately, this is not a viable approach for us, as our application uses stateless session beans, which may be persisted by the container, therefore a new connection to the broker is created and closed once used for each request. My question now is, what could be causing this delay/problem with opening and closing lots of connections to the broker? Does ActiveMQ maintain a connection pool internally - could it be maintenance of this pool causing the delay? Is there some configuration that can be made to the broker which may impact the behaviour of the connection handling? -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3089010.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Thanks Reynald, that utility could be useful. I think I have already pinned down this problem to be opening and closing of connections to the broker. If I share a single or a couple of connections for all processing, I do not get the broker slowdown issue. Unfortunately, our application runs in GlassFish and without closing down the connections explicitly in the code after each request, the container cannot keep up with connection pool management and the pool is quickly exhausted. I still do not understand exactly what is causing the decline in broker performance over time when opening and closing connections to the broker for multiple concurrent requests. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3090649.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I am using a prefetch of 0. This has been configured in the ra.xml in the activemq resource adapter we use to connect up to glassfish. I configured this by adding "&jms.prefetchPolicy.all=0" to the connection url. Just for good measure I also added the following into the ra.xml as well: The pre-fetch value for queues QueuePrefetch java.lang.Integer 0 -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3090741.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
here is the ra.xml I use: http://activemq.2283324.n4.nabble.com/file/n3090748/ra.xml ra.xml -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3090748.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
I took a look at the XAPooledConnectionFactory, but I am struggling to understand how I could use this when connecting to the broker from a glassfish container via the activemq jca resource adapter. Is this configurable within the resource adapter or jca connection pool somehow? -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p3090968.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.
Re: Slow throughput after several hundred messages
Just to note that I was able to resolve this issue by changing our code to maintain a single connection for the multiple concurrent processing threads. The slow down seems to be caused by continuous opening and closing of queue connections to ActiveMQ. -- View this message in context: http://activemq.2283324.n4.nabble.com/Slow-throughput-after-several-hundred-messages-tp3082431p364.html Sent from the ActiveMQ - User mailing list archive at Nabble.com.