Hi all,
I have done some more tests, and it now seems to me that the problem
has nothing to do with journal file reuse itself, but with the cursor
storage. My previous thoughts were wrong: messages are committed into
the database every 5 minutes, or as soon as the cursor memory reaches
70% of the broker's max memory, so the journal files can be reused (at
least as long as the total amount of messages pending between
checkpoints doesn't exceed the journal files' size).
So, if the exception is not thrown because of the journal files, it
has to come from somewhere else. I have played with the memory
configuration, and these are my conclusions at this moment:
First, my config. I am using one queue with 1 producer (a Servlet
inside Tomcat with an ActiveMQ PooledConnectionFactory holding 5
connections, so there can be more than one thread producing) and 10
consumers (Spring DefaultMessageListenerContainer). The consumers are
slower than the producer, so the queue size increases during peak
periods. Reliability is really important for us and we don't want to
lose a single message. We also don't want to activate producer flow
control, because if producers block, the slowness could exhaust
Tomcat's thread pool and messages could be lost.
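For reference, the two sides are wired roughly like this (a minimal
sketch, not my literal config; the bean ids, broker URL, queue name
and listener bean are placeholders):

<bean id="amqConnectionFactory"
      class="org.apache.activemq.ActiveMQConnectionFactory">
    <property name="brokerURL" value="tcp://localhost:61616"/>
</bean>

<!-- Producer side (inside Tomcat): pool capped at 5 connections -->
<bean id="pooledConnectionFactory"
      class="org.apache.activemq.pool.PooledConnectionFactory">
    <property name="connectionFactory" ref="amqConnectionFactory"/>
    <property name="maxConnections" value="5"/>
</bean>

<!-- Consumer side: 10 concurrent consumers on the queue -->
<bean class="org.springframework.jms.listener.DefaultMessageListenerContainer">
    <property name="connectionFactory" ref="amqConnectionFactory"/>
    <property name="destinationName" value="MY.QUEUE"/>
    <property name="messageListener" ref="myListener"/>
    <property name="concurrentConsumers" value="10"/>
</bean>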
Then the test begins with a massive JMeter POST to the producer (30
threads, indefinitely). What I expected to happen is that every time
the cursor memory reaches 70% of total memory (5 MB for test purposes,
to get the result in a few seconds; not the production config),
messages would be persisted in the database, releasing the memory they
were consuming so that new messages could take their place.
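The relevant part of my test broker config looks more or less like
this (a sketch; the 5 MB limit is the test value, and the 70%
threshold is, as far as I can tell, the default
cursorMemoryHighWaterMark of the destination policy):

<systemUsage>
    <systemUsage>
        <memoryUsage>
            <!-- deliberately tiny so the 70% cursor threshold is hit in seconds -->
            <memoryUsage limit="5 mb"/>
        </memoryUsage>
    </systemUsage>
</systemUsage>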
The real result: when the cursor memory reaches 70%, ActiveMQ throws
the exception mentioned before:
27 abr 2009 12:53:07,734 ERROR [ActiveMQ Journal Checkpoint Worker] org.apache.activemq.store.journal.JournalPersistenceAdapter - Failed to mark the Journal:
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
    at org.apache.activeio.journal.active.JournalImpl.setMark(JournalImpl.java:340)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter.doCheckpoint(JournalPersistenceAdapter.java:416)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter$1.iterate(JournalPersistenceAdapter.java:121)
    at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98)
    at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36)
27 abr 2009 12:53:07,734 DEBUG [ActiveMQ Journal Checkpoint Worker] org.apache.activemq.store.journal.JournalPersistenceAdapter - Checkpoint done.
Then the memory stays between 70% and 80%. Despite the exception, no
messages are lost on their way to the consumers, but as the checkpoint
fails, no memory is released, and the consumers' throughput falls from
45 messages/second to 2-3 messages/second.
What is more, if there were already messages in the queue when I
started the consumer, the memory immediately reached 70% of max
memory, throwing the exception. This gave me a clue: the consumer
prefetch size. ActiveMQ tries to prefetch messages from the queue in
the database and immediately fills the memory, as the default prefetch
value (1000 messages) is more than the memory can hold (70% of 5 MB).
Then I set the prefetch size to 100 messages. The memory usage, even
with a lot of messages already in the database at startup, stays below
70%, so everything is OK at startup. When I start JMeter again, the
cursor memory starts to grow again, and when it reaches 70% the same
exception is thrown.
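In case someone wants to reproduce this: one way of lowering the
prefetch is on the broker URL of the connection factory (a sketch;
host and port are placeholders):

<bean id="amqConnectionFactory"
      class="org.apache.activemq.ActiveMQConnectionFactory">
    <!-- jms.prefetchPolicy.queuePrefetch lowers the queue prefetch from the default 1000 -->
    <property name="brokerURL"
              value="tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=100"/>
</bean>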
Reading the docs about message cursors, they say that "The default
message cursor type in ActiveMQ 5.0 is Store based" and that in
"ActiveMQ 5.0.0, there is a new memory model that allows messages to
be paged in from storage when space is available (using Store cursors
for persistent messages)".
Now the exception makes sense, as there is a problem storing the
memory cursors. I suppose they are stored in the same Journal Store
that is used for storing messages and transactions with the
journaledJDBC persistence adapter.
The final test was changing the message cursor type to the file queue
cursor:

<pendingQueuePolicy>
    <fileQueueCursor/>
</pendingQueuePolicy>
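For completeness, that snippet lives inside a policyEntry in the
broker config, something like this (queue=">" is just the wildcard
matching every queue):

<destinationPolicy>
    <policyMap>
        <policyEntries>
            <policyEntry queue=">">
                <pendingQueuePolicy>
                    <fileQueueCursor/>
                </pendingQueuePolicy>
            </policyEntry>
        </policyEntries>
    </policyMap>
</destinationPolicy>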
Now when the cursor memory reaches 70%, data is written to
"data\localhost\tmp_storage", the cursor memory is released, and
performance is not affected.
My questions now are:
Are my suppositions correct?
Is this a bug? Should I open a JIRA issue?
My options now are to activate producer flow control for the queue, so
that the cursor memory limits are never reached, or to use the file
queue cursor. I prefer not to use producer flow control. The file
queue cursor uses Kaha storage; is this ready for a production
environment where reliability is the main requirement?
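(For reference, if I did enable flow control, I understand it would be
a per-destination setting, something like this sketch, where the 1 MB
limit is just an example value:

<policyEntry queue=">" producerFlowControl="true" memoryLimit="1 mb"/>
)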
If you need any more info about my config, feel free to ask.
Thank you very much for your help
Diego
Diego Rodríguez Martín wrote:
Hi again,
I'm really stuck on this, as I need an answer to settle my production
config.
Is there any extra data I can give you to narrow down the problem?
Many thanks
Diego
Diego Rodríguez Martín wrote:
Hi all,
I am planning to use ActiveMQ 5.2.0 in a project, and I have run some
tests on a Windows box to understand how the journal files work.
I am configuring journaled JDBC with Postgres for persistence this
way:
<persistenceAdapter>
    <journaledJDBC journalLogFiles="5"
        dataDirectory="${activemq.base}/data" dataSource="#postgres-ds"/>
</persistenceAdapter>
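(If the journal needs to be larger, I understand both the number and
the size of the log files can be set; a sketch, where the 32 MB value
is just an example:

<persistenceAdapter>
    <journaledJDBC journalLogFiles="10" journalLogFileSize="32mb"
        dataDirectory="${activemq.base}/data" dataSource="#postgres-ds"/>
</persistenceAdapter>
)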
Reading the docs, it seems that the journal files are used as a
transaction log, and you can define how many files you want and their
size. ActiveMQ does not create new files; it reuses the old ones. This
is OK, but I was wondering what happens if the queues (I am going to
use only 2 queues) grow long enough to use all the files in the
journal and need more space. It is not a weird scenario, as we are
decoupling consumer and producer, and the consumer can sometimes be
offline due to maintenance routines. The result is that the journal
files get overwritten and many transactions are lost. When I start the
consumer, it begins to consume messages, but the
JournalPersistenceAdapter fails every time it executes the checkpoint
code:
2009-04-16 18:48:03,140 [eckpoint Worker] ERROR JournalPersistenceAdapter - Failed to mark the Journal:
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
org.apache.activeio.journal.InvalidRecordLocationException: The location is less than the last mark.
    at org.apache.activeio.journal.active.JournalImpl.setMark(JournalImpl.java:340)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter.doCheckpoint(JournalPersistenceAdapter.java:416)
    at org.apache.activemq.store.journal.JournalPersistenceAdapter$1.iterate(JournalPersistenceAdapter.java:121)
    at org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98)
    at org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36)
I think the files are corrupted; can you confirm this?
Is the exception related to the problem described?
Is this expected behaviour? Should the maximum number of messages in
a queue be considered when configuring the journal?
Thank you very much for your help.
--
-------------------------------------------------------------
Diego Rodríguez Martín (drodrig...@altiria.com)
ALTIRIA TIC - Servicios SMS - Desarrollo Web
http://www.altiria.com
-------------------------------------------------------------