Hi all,

I have done some more tests, and it now seems to me that the problem has nothing to do with the journal files themselves, but with the journal-backed cursor storage.

My previous thoughts were wrong, as messages are committed into the database every 5 minutes, or as soon as the cursor memory reaches 70% of the broker's max memory, so the journal files can be reused (at least as long as the total amount of messages accumulated before they are written does not exceed the total journal file size).

So, if the exception is not thrown because of the journal files, it has to come from somewhere else. I have played with the memory configuration, and these are my conclusions at this moment:

First, my config. I am using one queue with 1 producer (a servlet inside Tomcat with an ActiveMQ PooledConnectionFactory of 5 connections, so more than one thread can be producing) and 10 consumers (Spring DefaultMessageListenerContainer), roughly as in the sketch below. The consumers are slower than the producer, so the queue size increases during peak periods. Reliability is really important for us and we don't want to lose a single message. We also don't want to activate producer flow control, in case the resulting slowness exhausts the Tomcat thread pool and messages are lost.
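A minimal sketch of that wiring in Spring (not my exact config: the bean ids, broker URL, queue name and listener bean are placeholders):

   <!-- pooled factory used by the producer servlet (5 connections) -->
   <bean id="jmsFactory" class="org.apache.activemq.pool.PooledConnectionFactory">
       <property name="connectionFactory">
           <bean class="org.apache.activemq.ActiveMQConnectionFactory">
               <property name="brokerURL" value="tcp://localhost:61616"/>
           </bean>
       </property>
       <property name="maxConnections" value="5"/>
   </bean>

   <!-- 10 concurrent consumers on the queue -->
   <bean class="org.springframework.jms.listener.DefaultMessageListenerContainer">
       <property name="connectionFactory" ref="jmsFactory"/>
       <property name="destinationName" value="MY.QUEUE"/>
       <property name="concurrentConsumers" value="10"/>
       <property name="messageListener" ref="myListener"/>
   </bean>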

Then the test begins: JMeter posting massively to the producer (30 threads, running indefinitely). What I expected to happen: every time the cursor memory reaches 70% of the total memory (5 MB for test purposes, so the result shows up in a few seconds; this is not the production config), messages would be persisted to the database, releasing the memory they were consuming so that new messages can take their place.
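The memory cap for the test looks like this in the broker config (a sketch; the 5 MB limit is only there to make the threshold easy to hit, and as far as I understand the 70% figure is the cursor's default high-water mark):

   <systemUsage>
       <systemUsage>
           <memoryUsage>
               <!-- tiny limit so the 70% cursor high-water mark is reached in seconds -->
               <memoryUsage limit="5 mb"/>
           </memoryUsage>
       </systemUsage>
   </systemUsage>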

The real result: when the cursor memory reaches 70% of memory, ActiveMQ throws the mentioned exception:

27 abr 2009 12:53:07,734 ERROR [ActiveMQ Journal Checkpoint Worker] org.apache.activemq.store.journal.JournalPersistenceAdapter - Failed to mark the Journal: org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
    at org.apache.activeio.journal.active.JournalImpl.setMark(JournalImpl.java:340)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter.doCheckpoint(JournalPersistenceAdapter.java:416)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter$1.iterate(JournalPersistenceAdapter.java:121)
    at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98)
    at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36)
27 abr 2009 12:53:07,734 DEBUG [ActiveMQ Journal Checkpoint Worker] org.apache.activemq.store.journal.JournalPersistenceAdapter - Checkpoint done.

Then the memory stays between 70% and 80%. Despite the exception, no messages are lost on their way to the consumers, but as the checkpoint fails, no memory is released, and the consumers' performance falls from 45 messages/second to 2-3 messages/second.

Even more: if there were already messages in the queue when I started the consumer, the memory immediately reached 70% of the maximum and the exception was thrown. This gave me a clue: the consumer prefetch size. ActiveMQ tries to prefetch messages from the queue in the database and immediately fills the memory, as the default prefetch value (1000) is more than the memory can hold (70% of 5 MB).
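A minimal sketch of how the prefetch can be lowered, assuming it is set through the connection URL (host and port are placeholders):

   <bean id="amqFactory" class="org.apache.activemq.ActiveMQConnectionFactory">
       <!-- queuePrefetch=100 instead of the default 1000 -->
       <property name="brokerURL"
                 value="tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=100"/>
   </bean>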

Then I set the prefetch size to 100 messages. Now the memory usage, even with a lot of messages already in the database at startup, stays below 70%, so everything is OK at startup. When I start JMeter again, the cursor memory starts to increase, and when it reaches 70% the same exception is thrown.

Reading the docs about message cursors, they say that "The default message cursor type in ActiveMQ 5.0 is Store based" and that in "ActiveMQ 5.0.0, there is a new memory model that allows messages to be paged in from storage when space is available (using Store cursors for persistent messages)". Now the exception makes sense, as there is a problem storing the cursors' messages. I suppose they are stored in the same journal store that the journaledJDBC persistence adapter uses for messages and transactions. The final test was changing the message cursor type to a file queue cursor:

   <pendingQueuePolicy>
       <fileQueueCursor/>
   </pendingQueuePolicy>

Now, when the cursor memory reaches 70%, data is written to "data\localhost\tmp_storage" and the cursor memory is released, so performance is not affected.

My questions now are:

Are my suppositions correct?
Is this a bug? Should I open a JIRA issue?

My options now are to activate producer flow control for the queue, so the cursor memory limits are never reached, or to use the file queue cursor; either one would go in the broker's destinationPolicy (see the sketch below). I prefer not to use producer flow control. The file queue cursor uses Kaha storage; is this ready for a production environment where reliability is the main requirement? If you need any more info about my config, feel free to ask.
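A sketch of where either option would sit (the catch-all queue=">" entry is my assumption; pick one of the two):

   <destinationPolicy>
       <policyMap>
           <policyEntry queue=">">
               <!-- option 2: spool pending messages to disk instead of RAM -->
               <pendingQueuePolicy>
                   <fileQueueCursor/>
               </pendingQueuePolicy>
           </policyEntry>
           <!-- option 1 would instead set producerFlowControl="true" on the entry -->
       </policyMap>
   </destinationPolicy>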

Thank you very much for your help.
Diego


Diego Rodríguez Martín wrote:
Hi again,

I'm really stuck on this, as I need an answer to settle my production config.

Is there any extra data I can give you to narrow down the problem?

Many thanks

Diego

Diego Rodríguez Martín wrote:
Hi all,

I am planning to use ActiveMQ 5.2.0 in a project, and I have run some tests on a Windows box to understand how the journal files work.

I am defining journaled JDBC persistence with PostgreSQL, this way:

       <persistenceAdapter>
           <journaledJDBC journalLogFiles="5"
                          dataDirectory="${activemq.base}/data"
                          dataSource="#postgres-ds"/>
       </persistenceAdapter>
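As far as I understand, both the number of log files and their size can be set on the adapter; a sketch with illustrative values (the "20 mb" size format is my assumption):

       <persistenceAdapter>
           <!-- more/larger log files give checkpoints more room before files are reused -->
           <journaledJDBC journalLogFiles="10" journalLogFileSize="20 mb"
                          dataDirectory="${activemq.base}/data"
                          dataSource="#postgres-ds"/>
       </persistenceAdapter>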

Reading the docs, it seems that the journal files are used as a transaction log, and you can define how many files you want and their size (as in the sketch above). ActiveMQ does not create new files; it reuses the old ones. This is OK, but I was wondering what happens if the queues (I am going to use only 2 queues) grow long enough to use up all the files in the journal and need more space. It is not a weird scenario, as we are decoupling consumer and producer, and the consumer can sometimes be offline due to maintenance routines. The result is that the journal files get overwritten and many transactions are lost. When I start the consumer, it begins to consume messages, but the JournalPersistenceAdapter fails every time it has to execute the checkpoint code:

2009-04-16 18:48:03,140 [eckpoint Worker] ERROR JournalPersistenceAdapter - Failed to mark the Journal: org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
    at org.apache.activeio.journal.active.JournalImpl.setMark(JournalImpl.java:340)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter.doCheckpoint(JournalPersistenceAdapter.java:416)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter$1.iterate(JournalPersistenceAdapter.java:121)
    at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98)
    at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36)


I think the files are corrupted; can you confirm this?
Is the exception related to the problem described?
Is this expected behaviour? Should the maximum number of messages in a queue be considered when configuring the journal?

Thank you very much for your help.



--
-------------------------------------------------------------
Diego Rodríguez Martín (drodrig...@altiria.com)
ALTIRIA TIC - Servicios SMS - Desarrollo Web
http://www.altiria.com
-------------------------------------------------------------
