Hi Tim,

Thank you for your input and for sharing your experience and knowledge.


tbain98 wrote
> 1.  In my limited experience with slow consumer abort strategies (using
> the SlowConsumerAbortStrategy, not the SlowAckConsumerAbortStrategy), I've
> observed that a client will continue processing the current message even
> when aborted; the abort seems to allow the broker to get on with its life
> but doesn't seem to stop the client from finishing what it's doing.  If
> that's what you mean by "AbortSlowAckConsumerStrategy couldn't abort the
> consumer", then that's in line with what I've observed.  Maybe someone who
> knows the ActiveMQ client code more intimately will know of a way to
> interrupt the processing that the client is doing, but if not, you might
> need to build a max processing time into your client's message-handling
> logic, to allow your client to stop if it takes too long.


First I need to clarify some things:
- We have defined a transactionTimeout (30 minutes) on a database, so if a
listener can’t consume the message in less than 30 minutes, an exception is
thrown and the listener applies the redelivery policy rules. (In this case
the message is sent back to the broker, which schedules one more redelivery
after 100 seconds; if the message can’t be processed the second time either,
the broker sends it to the DLQ.) The whole processing can take about 30-35
minutes max.

- About the logs: I posted the logs somewhat selectively - I wanted to
show that the same idle consumer couldn’t be aborted over a span of 18 hours.
Lines like this one:


2014-10-25 00:00:11,455 [host] Scheduler] INFO  AbortSlowConsumerStrategy     
- aborting slow consumer:
ID:min-p-app02.osl.basefarm.net-36433-1414153506788-1:1:17:7 for
destination:queue://generateReportQueue 

we get every 6 minutes (or so), always with the same consumer id. I
cut most of the logs to keep it short. There were no messages to
consume in the queue at that point. The real issue is that
AbortSlowAckConsumerStrategy couldn’t abort a consumer which was idle. My
experience with the abort strategy (when it’s working correctly) is the same
as yours: it doesn’t abort the consumer immediately but politely asks it to
abort once it has finished processing the current message. But in this case
the consumer didn’t have anything to process (maybe it only had some acks to
send back, as I was using JMX to move messages to a DLQ myself).
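
To make the redelivery rule from the first point above concrete, here is a
small model of it in plain Java. It is only a sketch of the behaviour as I
described it, not ActiveMQ code: the class RedeliveryModel and its methods
are names made up for illustration, and the worst-case figure assumes that
every attempt runs into the full transaction timeout.

```java
import java.time.Duration;

// Plain-Java model of the redelivery rule described above.
// Hypothetical names; this does not use or mirror the ActiveMQ API.
final class RedeliveryModel {
    // Values taken from the description: one redelivery, 100 seconds later.
    static final int MAX_REDELIVERIES = 1;
    static final Duration REDELIVERY_DELAY = Duration.ofSeconds(100);

    enum Outcome { REDELIVER_AFTER_DELAY, SEND_TO_DLQ }

    // failedAttempts counts failed deliveries so far (1 = first delivery failed).
    static Outcome onFailure(int failedAttempts) {
        if (failedAttempts < 1) {
            throw new IllegalArgumentException("failedAttempts must be >= 1");
        }
        int redeliveriesUsed = failedAttempts - 1;
        return redeliveriesUsed < MAX_REDELIVERIES
                ? Outcome.REDELIVER_AFTER_DELAY
                : Outcome.SEND_TO_DLQ;
    }

    // Assumption: each attempt fails only when the transaction timeout fires.
    static Duration worstCaseBeforeDlq(Duration transactionTimeout) {
        int attempts = MAX_REDELIVERIES + 1;
        return transactionTimeout.multipliedBy(attempts)
                .plus(REDELIVERY_DELAY.multipliedBy(MAX_REDELIVERIES));
    }
}
```

Under that assumption, worstCaseBeforeDlq(Duration.ofMinutes(30)) comes to
1 hour, 1 minute and 40 seconds before a message that keeps failing reaches
the DLQ.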




tbain98 wrote
> 2.  Your config seems reasonable for your use-case, though slow consumer
> abort strategies are generally intended for when a consumer unexpectedly
> takes a long time, whereas your use case seems like your consumers
> expectedly but unpredictably take a long time.  But certainly you're using
> the more appropriate of the two strategies if you're going to use one. 

As I mentioned earlier, the max processing time of one message is about 30
minutes. I agree with you that with so few messages in the queue it’s
probably better to check acks and not rely on the prefetch buffer.


tbain98 wrote
> 3.  How does queue processing "stop"?  Do you just mean that once both
> consumers start working on large messages, they're not available to work
> on small messages? 

I mean that when a consumer “stops processing”, none of the messages in the
queue are consumed at all (neither small nor bigger ones) - they stay in the
queue indefinitely (until the whole application is restarted). It also
happens for the consumers on both nodes (1 consumer per node).


tbain98 wrote
> 4.  I'm concerned that by allowing one redelivery of each message, you're
> setting up a situation where you could tie up both of your consumers (one
> processing the first delivery, one processing the second for the same
> message); is message re-delivery something you have to have? 

That could be the case. I can try to verify whether I really need redelivery
in this case, but from what I remember, with two attempts the reports are
generated in most cases; with only one attempt the percentage is smaller,
which requires more manual attention ...



tbain98 wrote
> 6.  One thing you might consider is having your client spin off the work
> of processing a message into a separate thread, and then returning
> (successfully) after either the thread finishes or some timeout elapses,
> whichever happens first.  Then when a large message comes in, it will run
> in the background till it finishes, but it won't prevent the consumer from
> continuing on without it and it won't cause the broker to redeliver the
> message to the other consumer and tie up processing.  Obviously your
> processing algorithm will need to be thread-safe for this to work, but it
> might give you options without even needing to worry about the
> SlowConsumerAbortStrategy...  Also, if you've got an algorithm that
> usually takes under 10 minutes and sometimes takes 18 hours (based on your
> logs from before you restarted Tomcat), you might want to improve your
> algorithm, to either speed up the work you're currently doing or find a
> way to get your answer with less processing (e.g. by only sampling some of
> your data).  This is obviously very specific to whatever domain you're
> working in and might not be easy to do, but 18 hours to process a message
> definitely makes my Spidey senses tingle... 

If I understand you correctly (do I?), I think we can’t use this approach.
The whole point of employing JMS for us was to have async processing with
guarantees. In our system we can have many bugfix releases throughout the
day, and if that happened and the report hadn’t been generated before the
application restart, we would lose the message (the listener would already
have returned successfully). I am trying to find a config which works
automatically for us most of the time and requires manual developer
attention only for certain problems.
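
A variant that might fit better with the guarantees is the max processing
time you mentioned in point 1: bound the work inside the listener and fail
on timeout instead of returning successfully. Below is a rough sketch in
plain Java, without any JMS or ActiveMQ classes. BoundedProcessing and
runWithTimeout are names made up for illustration, and I am assuming that an
exception thrown from the listener would trigger redelivery the same way the
transaction timeout does today.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: runs the work on a worker thread and waits a bounded time.
final class BoundedProcessing {
    // Returns the result if the work finishes in time. On timeout the worker is
    // interrupted and TimeoutException is rethrown, so the caller (for example a
    // message listener) fails instead of acknowledging the message.
    static <T> T runWithTimeout(Callable<T> work, long timeoutMillis) throws Exception {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(work);
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            // Interrupts the worker; the work itself must respond to interruption.
            future.cancel(true);
            throw e;
        } catch (ExecutionException e) {
            // Surface the original failure from the worker where possible.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

The interrupt only helps if the report generation checks for interruption or
blocks in interruptible calls; otherwise the worker keeps running in the
background even though the listener has already failed.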

Once again, thank you for your input.

Regards
Marek



