Your problem sounds a bit more complex, but I just wanted to mention that one 
can set usePrefetchExtension="false".
From the docs:

The default behavior of a broker is to use delivery acknowledgements to 
determine the state of a consumer's prefetch buffer. For example, if a 
consumer's prefetch limit is configured as 1 the broker will dispatch 1 message 
to the consumer and when the consumer acknowledges receiving the message, the 
broker will dispatch a second message. If the initial message takes a long time 
to process, the message sitting in the prefetch buffer cannot be processed by a 
faster consumer.

If the behavior is causing issues, it can be changed such that the broker will 
wait for the consumer to acknowledge that the message is processed before 
refilling the prefetch buffer. This is accomplished by setting a destination 
policy on the broker to disable the prefetch extension for specific 
destinations.
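
In activemq.xml this is the usePrefetchExtension attribute on a policyEntry. 
If you run an embedded broker, the Java equivalent looks roughly like the 
sketch below (the destination name and port are only placeholders):

import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.broker.region.policy.PolicyEntry;
import org.apache.activemq.broker.region.policy.PolicyMap;
import org.apache.activemq.command.ActiveMQQueue;

public class DisablePrefetchExtension {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();

        // Don't extend the prefetch window on delivery acks; wait until the
        // consumer acknowledges that the message has been processed.
        PolicyEntry policy = new PolicyEntry();
        policy.setUsePrefetchExtension(false);
        policy.setQueuePrefetch(1);   // optional: keep the prefetch buffer at 1

        PolicyMap policyMap = new PolicyMap();
        policyMap.put(new ActiveMQQueue("tasks.>"), policy);  // placeholder destination
        broker.setDestinationPolicy(policyMap);

        broker.addConnector("tcp://0.0.0.0:61616");
        broker.start();
        broker.waitUntilStopped();
    }
}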

- Martin


On 20.10.2015 04:15, Kevin Burton wrote:
We have a problem whereby we have a LARGE number of workers.  Right now
about 50k worker threads on about 45 bare metal boxes.

We have about 10 ActiveMQ servers / daemons which service these workers.

The problem is that my current design has a session per queue server per
thread.   So this means I have about 500k sessions each trying to prefetch
1 message at a time.

Since my tasks take about 30 seconds on average to execute, and each thread
holds up to 10 prefetched messages (one per server session), the last message
in a thread's buffer can wait around 10 x 30s = 5 minutes before it is
processed.

That's a BIG problem in that I want to keep my latencies low!

And the BIG downside here is that a lot of my workers get their prefetch
buffer filled first, starving out other workers that do nothing...

This leads to massive starvation where some of my boxes are at 100% CPU and
others are at 10-20% starved for work.

So I'm working on a new design whereby I use a listener, allow it to prefetch,
and use a countdown latch from within the message listener to wait for the
worker thread to process the message.  Then I commit the message.

This solves the over-prefetch problem because we don't attempt to prefetch
another message until the current one has been processed.
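
Roughly what I have in mind, as a sketch (the broker URL, queue name and
process() are placeholders, and I'm using a transacted session so commit()
acknowledges the delivery):

import javax.jms.Connection;
import javax.jms.ConnectionFactory;
import javax.jms.JMSException;
import javax.jms.Message;
import javax.jms.MessageConsumer;
import javax.jms.Session;
import java.util.concurrent.CountDownLatch;

import org.apache.activemq.ActiveMQConnectionFactory;

public class BlockingListenerConsumer {

    public static void main(String[] args) throws Exception {
        // prefetch=1 per consumer, set on the connection URI (placeholder address)
        ConnectionFactory factory = new ActiveMQConnectionFactory(
                "tcp://localhost:61616?jms.prefetchPolicy.queuePrefetch=1");
        Connection connection = factory.createConnection();
        connection.start();

        // Transacted session: commit() acknowledges the delivered message.
        Session session = connection.createSession(true, Session.SESSION_TRANSACTED);
        MessageConsumer consumer =
                session.createConsumer(session.createQueue("tasks")); // placeholder queue

        consumer.setMessageListener((Message message) -> {
            CountDownLatch done = new CountDownLatch(1);

            // Hand the message to a worker thread; the listener blocks until the
            // work is finished, so we never ask the broker for the next message
            // before this one has actually been processed.
            new Thread(() -> {
                try {
                    process(message); // the ~30 second task
                } finally {
                    done.countDown();
                }
            }).start();

            try {
                done.await();
                session.commit();  // safe now: nothing else is in flight on this session
            } catch (Exception e) {
                try { session.rollback(); } catch (JMSException ignored) { }
            }
        });
    }

    static void process(Message m) {
        // real work goes here
    }
}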

Since I can't commit each JMS message one at a time, I'm only left with
options that commit the whole session.  This forces me to set prefetch=1,
otherwise I could call commit() and end up committing a message that is
actually still being processed.

This leaves me with a situation where I need to be clever about how I fetch
from the queue servers.

If I prefetch on ALL queue servers I'm kind of back to where I was to begin
with.

I was thinking of implementing the following solution, which should work and
minimizes the downsides.  I wanted feedback on it.

If I have, say, 1000 worker threads, what I do is allow up to 10% of the
number of worker threads to be prefetched and stored in a local queue
(ArrayBlockingQueue).

In this example this would be 100 messages.

The problem now is how we read in parallel from each server.

I think in this situation we then allow 10% of the buffer to be filled from
each queue server.

So in this case 10 from each.

So now we end up in a situation where we're allowed to prefetch 10 messages
from each queue server, and the local buffer can grow to hold 100 messages.
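
In code the bookkeeping would look something like this (numbers hard-coded
from the example above; the per-server receive/commit loop is only sketched
in the comments):

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.Semaphore;

import javax.jms.Message;

public class PrefetchBudget {

    public static void main(String[] args) {
        int workerThreads = 1000;  // numbers from the example above
        int queueServers  = 10;

        // Global buffer: at most 10% of the worker threads' worth of messages.
        int bufferCapacity = workerThreads / 10;             // 100
        // Each queue server may fill at most its share of that buffer
        // (with 10 servers this is the 10% per server described above).
        int perServerShare = bufferCapacity / queueServers;  // 10

        BlockingQueue<Message> localBuffer = new ArrayBlockingQueue<>(bufferCapacity);

        // One permit set per server caps how many of its messages are held
        // locally; a permit is released only when a worker commits the message.
        Semaphore[] perServerPermits = new Semaphore[queueServers];
        for (int i = 0; i < queueServers; i++) {
            perServerPermits[i] = new Semaphore(perServerShare);
        }

        System.out.printf("buffer=%d messages, per-server share=%d%n",
                bufferCapacity, perServerShare);

        // Receiver thread for server i (sketch):
        //   perServerPermits[i].acquire();
        //   localBuffer.put(consumerForServer(i).receive());  // placeholder consumer
        //
        // Worker thread (sketch):
        //   Message m = localBuffer.take();
        //   process(m); commit the session; perServerPermits[i].release();
    }
}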

The latency for a buffered message would be roughly the average task time
divided across the worker threads draining the buffer, which I think will
keep the latencies low.

Also, I think over-prefetch is probably a common anti-pattern, and this could
serve as a general solution to it.

If you agree, I'm willing to document the problem.

Additionally, I think this comes close to the ideal multi-headed solution from
queuing theory, using multiple worker heads.  It just becomes more interesting
because we have imperfect information from the queue servers, so we have to
make educated guesses about their behavior.


