Option 3 is the best fit for my use case. It's a system under active development, so when something fails I need someone to fix the code, or another system, first, and then I need to reprocess those failed messages to verify the fix. I considered storing those messages in a database earlier, but the database write could itself be one of the original causes of failure, so that's still not a reliable option. For now, if there is no better choice, I think I will just process those messages in a much more timely manner.
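
For the reprocessing step, what I have in mind is roughly the sketch below: drain the failed-message queue in a transacted session and run each message through the (now fixed) handler, so anything that fails again simply stays on the queue. This is only a sketch — the broker URL, the APP.FAILED queue name, and the handle() call are placeholders for my actual setup.

    import javax.jms.*;
    import org.apache.activemq.ActiveMQConnectionFactory;

    public class ReprocessFailedMessages {
        public static void main(String[] args) throws JMSException {
            // Placeholders: broker URL and queue name depend on the real deployment.
            ConnectionFactory factory = new ActiveMQConnectionFactory("tcp://localhost:61616");
            Connection connection = factory.createConnection();
            connection.start();

            // Transacted session: a message leaves the failed-message queue only
            // when its reprocessing commits; otherwise it stays for another try.
            Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
            MessageConsumer consumer = session.createConsumer(session.createQueue("APP.FAILED"));

            Message message;
            // receive(timeout) returns null once the queue has been drained.
            while ((message = consumer.receive(2000)) != null) {
                try {
                    handle(message);     // the (now fixed) processing logic
                    session.commit();    // success: message is removed from the queue
                } catch (Exception e) {
                    session.rollback();  // failure: the broker redelivers it later,
                                         // subject to the configured redelivery policy
                }
            }
            connection.close();
        }

        private static void handle(Message message) throws Exception {
            // Placeholder for the real business logic.
            if (message instanceof TextMessage) {
                System.out.println(((TextMessage) message).getText());
            }
        }
    }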
2015-03-02 0:33 GMT+08:00 Tim Bain <tb...@alumni.duke.edu>:

> So what do you plan to do with those failed messages? I see three possibilities, but have no idea which of them is the one (or ones) that apply to your situation.
>
> 1. Are you going to reprocess them as-is, because whatever failed is assumed to be some transient failure that won't happen the second time you try? If so, why not just do that automatically a few minutes later so you can delete the message when you're done? Why wait days?
> 2. Are you going to ignore the message, but debug what it was about that message that caused the failure so you can fix your code so it doesn't error out the next time? If so, log whatever you need (possibly the full message text, if that's what you need) to a file that you can go back to whenever you get around to it, and delete the message immediately.
> 3. Are you going to reprocess the message, but only after a person manually fixes something that caused the failure the first time around? This is the hardest one, where you actually care about the message itself but you can't do anything about it until a person (who's only working certain hours on certain days) does something first. Here you really need to put the message into a database (and ActiveMQ is not a database) to be held until it's ready to be acted upon (e.g. by making a message that's equivalent to the one that failed and resending it). You could say that storing your original messages in a queue that's backed by a different KahaDB meets this stored-in-a-database requirement, but databases allow random access (so you could reprocess any message first, then any other one second, whenever someone has fixed whatever was wrong with each one, in any order) and queues do not, so I don't think it's a very good solution. Better to store the relevant content of the message in a real database, fix the problem, and then generate a new message from it that you inject into the real queue for processing as usual.
>
> Your goal in all cases should be for messages to get processed quickly (whether that means seconds, minutes, or maybe hours), whether that means "acted upon as intended", "discarded to the DLQ because I don't care about them if they fail", or "stored somewhere to be dealt with later because I actually care about the ones that fail".
>
> Tim
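
For reference, this is how I read the option 3 pattern: on failure, write the relevant content to a real database and let the message go; once someone has fixed the underlying problem and flagged the row, generate an equivalent message and inject it into the real queue. The sketch below is only an illustration of that idea — the FAILED_MESSAGES table, its columns, and the method names are made up, not taken from my actual code.

    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import javax.jms.JMSException;
    import javax.jms.MessageProducer;
    import javax.jms.Queue;
    import javax.jms.Session;

    public class FailedMessageStore {

        // On processing failure: keep the relevant content in a real database
        // (hypothetical FAILED_MESSAGES table) and let the JMS message be discarded.
        public static void store(Connection db, String body) throws SQLException {
            try (PreparedStatement ps = db.prepareStatement(
                    "INSERT INTO FAILED_MESSAGES (BODY, FIXED) VALUES (?, FALSE)")) {
                ps.setString(1, body);
                ps.executeUpdate();
            }
        }

        // After a person has fixed the underlying problem and set FIXED = TRUE:
        // build an equivalent message and inject it into the real queue as usual.
        public static void resubmitFixed(Connection db, Session session, Queue realQueue)
                throws SQLException, JMSException {
            MessageProducer producer = session.createProducer(realQueue);
            try (PreparedStatement sel = db.prepareStatement(
                     "SELECT ID, BODY FROM FAILED_MESSAGES WHERE FIXED = TRUE");
                 PreparedStatement del = db.prepareStatement(
                     "DELETE FROM FAILED_MESSAGES WHERE ID = ?");
                 ResultSet rs = sel.executeQuery()) {
                while (rs.next()) {
                    producer.send(session.createTextMessage(rs.getString("BODY")));
                    del.setLong(1, rs.getLong("ID"));
                    del.executeUpdate();
                }
            }
        }
    }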
> On Sun, Mar 1, 2015 at 8:23 AM, Rural Hunter <ruralhun...@gmail.com> wrote:
>
> > I know it is best to consume the messages from this queue in a timely manner. It's just that this is the queue whose failed messages I care about, so that I can process them promptly. What about the other queues, where I don't care much about the failed messages and they go to the real dead letter queues in the same way?
> >
> > 2015-03-01 0:37 GMT+08:00 Tim Bain <tb...@alumni.duke.edu>:
> >
> > > KahaDB can only delete a data file when every message in it can be deleted. If a single message is still needed, the whole file must be kept, and it doesn't perform any sort of compaction. And if the last message in a file that must be kept (because of some other message) has an ack in the next file, that next file must be kept itself. This can theoretically repeat forever if the messages happen to lay out just right in the data files, so a single old unprocessed message can theoretically prevent KahaDB from deleting any of its data files.
> > >
> > > There was a recently-fixed bug where the file with the ack was being improperly deleted, resulting in redelivery of the acked messages on broker restart; see https://issues.apache.org/jira/browse/AMQ-5542, which is fixed in 5.11.1 and 5.12.0. So the version you're running won't recognize the chain of files (if any) that need to be kept; with that fix, I'd expect you to hit your limit even faster.
> > >
> > > So your DLQ-ish messages are in fact keeping alive any data files in which they exist. If they all came in as a batch, that would be just one file, but since they're spread out over time, that's probably a decent number of files.
> > >
> > > So you could do as Tim suggested and make a separate KahaDB store for these long-lived messages; that would solve this problem, but it's ultimately a workaround. Shrinking the size of each data file would help right now, but once you upgrade to 5.11.1 or 5.12.0 it wouldn't be able to guarantee that you didn't have to keep all the files; I'd focus on other options.
> > >
> > > So the real question is, why are you keeping your DLQ-like messages for 5 days? (This is probably the point where Art will chime in with "ActiveMQ is not a database.") You should be doing something with those messages quickly, not keeping them around for ages. If the messages get consumed immediately, the KahaDB files won't stick around long, and your problem is solved. So figure out how to change your application logic so you don't rely on messages staying on the broker for days; anything else is just a workaround for this flaw in your application logic.
> > >
> > > Tim
> > >
> > > One more question: will the same thing happen if I switch to LevelDB?
> > >
> > > 2015-02-28 22:53 GMT+08:00 Rural Hunter <ruralhun...@gmail.com>:
> > >
> > > > I'm sorry, I made a mistake. My storage is KahaDB. We switched from LevelDB to KahaDB a while ago and I forgot that. Thanks for the links. Now I understand what happened!
> > > >
> > > > 2015-02-28 19:03 GMT+08:00 Tim Robbins <tim.robb...@outlook.com>:
> > > >
> > > > > Hi,
> > > > >
> > > > > Two suggestions for you:
> > > > >
> > > > > 1. Try decreasing the logSize parameter for LevelDB. You'll have a greater number of smaller log files, and a greater chance of each log file being garbage-collected.
> > > > > 2. With KahaDB, it's possible to configure multiple KahaDB stores, and to put your dead-letter-type messages into a different store than everything else to reduce overhead: http://blog.garytully.com/2011/11/activemq-multiple-kahadb-instances.html
> > > > > Unfortunately it doesn't appear that this applies to LevelDB yet!
> > > > >
> > > > > Regards,
> > > > >
> > > > > Tim
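
Regarding the multiple-KahaDB-stores suggestion above: my reading of the linked blog post is a broker configuration roughly like the sketch below, so the long-lived failure queue gets its own journal files and stops pinning everyone else's. This is only a sketch of the idea — the FAILED.> queue pattern, directory, and journal size are placeholders, and the exact attributes should be checked against the blog post and the KahaDB documentation.

    <persistenceAdapter>
      <mKahaDB directory="${activemq.data}/kahadb">
        <filteredPersistenceAdapters>
          <!-- placeholder: failure/DLQ-style queues go to their own store -->
          <filteredKahaDB queue="FAILED.>">
            <persistenceAdapter>
              <kahaDB journalMaxFileLength="16mb"/>
            </persistenceAdapter>
          </filteredKahaDB>
          <!-- everything else stays in the default store -->
          <filteredKahaDB>
            <persistenceAdapter>
              <kahaDB/>
            </persistenceAdapter>
          </filteredKahaDB>
        </filteredPersistenceAdapters>
      </mKahaDB>
    </persistenceAdapter>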
> > > > > On 28 Feb 2015, at 7:27 pm, Rural Hunter <ruralhun...@gmail.com> wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > ActiveMQ version 5.10.2, storage: LevelDB.
> > > > > >
> > > > > > I have a queue which serves a similar function to a dead letter queue. My application processes messages from another queue, and if the processing fails, it puts the message into this queue. The messages are persistent and average several KB in size. My application processes many messages, but the failed-message count is very small, less than 100 a day. I noticed that after the application had been running for several days, my ActiveMQ storage became almost full. I configured the storage limit to 30 GB. I checked the normal queues and topics and there is no queue with a large number of messages. Most of them are empty and some have only a few messages. Only the failure queue I mentioned above has a few hundred messages (about 500), accumulated over several days.
> > > > > >
> > > > > > I have no idea what takes up so much storage. I checked the storage files and found many db-xxxx.log files with timestamps spread across almost the whole several days. They are not consecutive, though; some of the db-xxx.log files are missing. So the file list looks like this:
> > > > > > db-1000.log
> > > > > > db-1001.log
> > > > > > db-1003.log
> > > > > > db-1004.log
> > > > > > db-1005.log
> > > > > > db-1008.log
> > > > > > ...
> > > > > > I suspected the failed messages were in those db-xxx.log files, so I tried clearing the failed-message queue. Right after that, those old db-xxx.log files disappeared and the storage usage went back to 2%. So it seems clear that the roughly 500 failed messages took up around 30 GB of storage. But how can that be? Those messages are very small, and the total size of the messages should be no more than a few MB.