Hi,

the idea from Mike and Nathan does not apply to your problem because in
your case the different execution times do not depend on the tuples but
on the executors. Thus, on the producer side you cannot separate "slow"
tuples from "fast" tuples.

If you can identify the "slow" executors, you can implement a
CustomGrouping.java strategy (or use .directGrouping() instead of
.shuffle()) to send less tuple to the slow executors.

-Matthias



On 06/13/2015 06:38 PM, Banias H wrote:
> I have a topology in which one of the bolts read from HBase. That bolt
> is setup to have one task per executor, and it got tuples from shuffle
> grouping so every executor of the bolt will have the same number of tuples.
> 
> The problem is that some executors will take longer than others (because
> of hot-spotting in HBase region servers). For example, 5% of the
> executors have latency of 100ms while the rest 95% have around 25ms. Now
> with the guarantee of equal number of tuples per executors, my
> understanding is that 95% of the fast executors will have to wait. Thus
> it brings down the throughput. Please correct me if I mis-understand.
> 
> Ideally, I would love to have a load-balance-biased shuffle grouping so
> that the 95% of the fast executors would get more tuples.
> 
> Is this something I can leverage other existing groupings or patterns to
> implement? In an earlier post entitled "long running bolts", Mike
> Thomsen and Nathan Leung discussed a nice idea of taking long running
> tuples elsewhere (paraphrase below):
> 
> "/create tasks for the bolt using the same class but different name in
> the topology... route long running bolts (without acks) to the separate
> instances and they will not affect your normal processing/"
> 
> Since my long running executors are not taking that long (just around
> 100ms), this may not be worth the effort to take the tuples elsewhere.
> 
> I would appreciate any comment and suggestions. Many thanks.
> 
> BH

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to