Responses inline.

On Fri, Oct 20, 2017 at 5:46 AM, Lionel van den Berg <lion...@gmail.com>
wrote:

> Hi, thanks for the response.
>
> Some questions on these points from the troubleshooting.
>
>
>    1. *It contains a pending message for a destination or durable topic
>    subscription*
>
> This seems a little flawed, if a consumer who I have little control of is
> mis-behaving then my ActiveMQ can end up shutting down and unrecoverable.
> Is there some way of timing this out or similar?
>

There are multiple ways of discarding messages that are not being consumed,
which are detailed at http://activemq.apache.org/slow-consumer-handling.html
(several of which it sounds like you're already using). Keep in mind that
unconsumed DLQ messages are unconsumed messages, so you'll want to make
sure you address those messages as well;
http://activemq.apache.org/message-redelivery-and-dlq-handling.html
contains additional information about handling messages in the context of
the DLQ. And no, I wouldn't say it's flawed; it just means you have to do
some configuration work that you haven't yet done.


> *2. It contains an ack for a message which is in an in-use data file - the
> ack cannot be removed as a recovery would then mark the message for
> redelivery*
>
> Same comment as 1.
>

Same response as for #1. There's one additional wrinkle (KahaDB keeps an
entire data file alive because of a single live message, which in turn
means you have to keep the acks for the later messages which are in later
data files), but that's been partially mitigated by the addition of the
ability to compact acks by replaying them into the current data file, which
should allow any data file that contains no live non-ack messages to be
GC'ed. So there's a small portion of this that's purely the result of
KahaDB's design as a non-compacting data store, but it's a problem only
when there's an old unacknowledged message, which takes us back to #1.
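
To make the interaction between #1 and #2 concrete, here is a toy model in
Java. It is a sketch of the two rules as worded on the troubleshooting page,
not KahaDB's actual code: the class and method names are made up for
illustration, and it deliberately ignores ack compaction.

```java
import java.util.*;

final class JournalGcSketch {
    /** Hypothetical stand-in for one data file: messages stored in it, and message ids its acks refer to. */
    static final class DataFile {
        final int id;
        final Set<String> messages;
        final Set<String> acks;

        DataFile(int id, Set<String> messages, Set<String> acks) {
            this.id = id;
            this.messages = messages;
            this.acks = acks;
        }
    }

    /** Returns the ids of the files that cannot be deleted under rules #1 and #2. */
    static Set<Integer> retainedFiles(List<DataFile> files) {
        Set<String> acked = new HashSet<>();
        Map<String, Integer> homeFile = new HashMap<>();
        for (DataFile f : files) {
            acked.addAll(f.acks);
            for (String m : f.messages) {
                homeFile.put(m, f.id);
            }
        }

        Set<Integer> retained = new TreeSet<>();
        // Rule #1: a file holding a message that was never acked is in use.
        for (DataFile f : files) {
            for (String m : f.messages) {
                if (!acked.contains(m)) {
                    retained.add(f.id);
                }
            }
        }
        // Rule #2: a file holding an ack for a message in an in-use file is
        // also in use. Repeat until no more files get pulled in.
        boolean changed = true;
        while (changed) {
            changed = false;
            for (DataFile f : files) {
                if (retained.contains(f.id)) {
                    continue;
                }
                for (String ackedId : f.acks) {
                    Integer home = homeFile.get(ackedId);
                    if (home != null && retained.contains(home)) {
                        retained.add(f.id);
                        changed = true;
                        break;
                    }
                }
            }
        }
        return retained;
    }

    public static void main(String[] args) {
        // m1 is never acked; every other message is acked in the next file.
        List<DataFile> files = List.of(
                new DataFile(1, Set.of("m1", "m2"), Set.of()),
                new DataFile(2, Set.of("m3"), Set.of("m2")),
                new DataFile(3, Set.of("m4"), Set.of("m3")),
                new DataFile(4, Set.of(), Set.of("m4")));
        System.out.println(retainedFiles(files)); // prints [1, 2, 3, 4]
    }
}
```

In this model the single unconsumed message m1 pins all four files, and once
m1 is acked or discarded nothing is retained.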


> *3. The journal references a pending transaction*
>
> I'm not using transactions, but are there transactions under the hood?
>

No, this applies only if you're using transactions directly, so it isn't
relevant to you.


> *4. It is a journal file, and there may be a pending write to it*
>
> Why would this be the case?
>

Because writes are buffered and then flushed, a data file can still have a
write pending if we haven't finished flushing it. This is infrequent and
should affect only a small number of data files, so if you're having a
problem with the number of files kept, this isn't the cause. It's included
in the list only for completeness.

> I'll see if I can change the logging settings, since the first occurrence
> the number of log files does not seem to have been an issue. I have it
> configured to keep messages for 7 days so regardless of the above
> conditions I would have thought that at that expiry the log would be
> cleaned up so we don't end up in such a situation where the system stops
> and cannot restart.
>

If you are configured as you describe, I would expect log cleanup to happen
the way you anticipate, which means that either there's an undiscovered bug
in our code or you're not configured the way you think you are.

The page I linked to originally has instructions for how to determine which
destinations have messages that are preventing the KahaDB data files from
being deleted, which might let you investigate further (for example, by
looking at the messages and their attributes to see if timestamps are being
set correctly).
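
As a rough illustration of what "set correctly" could mean, the sketch below
assumes the 7-day retention is implemented as a message expiration and uses
the standard JMS header semantics: JMSExpiration is the send timestamp plus
the time-to-live, and a value of 0 means the message never expires. The class
and method names are hypothetical helpers, not part of ActiveMQ.

```java
import java.time.Duration;

final class ExpiryCheckSketch {
    /** Expiration a message should carry for a given send timestamp and TTL (0 = never expires). */
    static long expectedExpiration(long jmsTimestampMillis, Duration timeToLive) {
        return timeToLive.isZero() ? 0L : jmsTimestampMillis + timeToLive.toMillis();
    }

    /** True if a message with this JMSExpiration value has expired at the given time. */
    static boolean isExpired(long jmsExpirationMillis, long nowMillis) {
        // An expiration of 0 means "never expires", so such a message is kept forever.
        return jmsExpirationMillis != 0L && nowMillis > jmsExpirationMillis;
    }

    public static void main(String[] args) {
        long sentAt = System.currentTimeMillis() - Duration.ofDays(8).toMillis();
        long withTtl = expectedExpiration(sentAt, Duration.ofDays(7));
        long withoutTtl = expectedExpiration(sentAt, Duration.ZERO);

        long now = System.currentTimeMillis();
        System.out.println("7-day TTL, sent 8 days ago, expired: " + isExpired(withTtl, now));
        System.out.println("no TTL, sent 8 days ago, expired: " + isExpired(withoutTtl, now));
    }
}
```

If the messages holding your data files turn out to have an expiration of 0,
or a timestamp far from the time they were actually sent, that would point to
the configuration rather than to a cleanup bug.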

Tim
