Thanks Jungtaek and Matthias.

On Sun, Jun 14, 2015 at 8:15 AM, Matthias J. Sax <[email protected]> wrote:
> Hi,
>
> the idea from Mike and Nathan does not apply to your problem because in
> your case the different execution times do not depend on the tuples but
> on the executors. Thus, on the producer side you cannot separate "slow"
> tuples from "fast" tuples.
>
> If you can identify the "slow" executors, you can implement a
> CustomGrouping.java strategy (or use .directGrouping() instead of
> .shuffle()) to send fewer tuples to the slow executors.
>
> -Matthias
>
> On 06/13/2015 06:38 PM, Banias H wrote:
> > I have a topology in which one of the bolts reads from HBase. That bolt
> > is set up to have one task per executor, and it gets its tuples via
> > shuffle grouping, so every executor of the bolt receives the same number
> > of tuples.
> >
> > The problem is that some executors take longer than others (because
> > of hot-spotting in HBase region servers). For example, 5% of the
> > executors have a latency of 100ms while the remaining 95% have around
> > 25ms. Now, with the guarantee of an equal number of tuples per executor,
> > my understanding is that the 95% of executors that are fast will have to
> > wait, which brings down the throughput. Please correct me if I
> > misunderstand.
> >
> > Ideally, I would love to have a load-balance-biased shuffle grouping so
> > that the 95% of the fast executors would get more tuples.
> >
> > Is this something I can implement by leveraging other existing groupings
> > or patterns? In an earlier post entitled "long running bolts", Mike
> > Thomsen and Nathan Leung discussed a nice idea of taking long running
> > tuples elsewhere (paraphrased below):
> >
> > "/create tasks for the bolt using the same class but different name in
> > the topology... route long running bolts (without acks) to the separate
> > instances and they will not affect your normal processing/"
> >
> > Since my long running executors are not taking that long (just around
> > 100ms), this may not be worth the effort to take the tuples elsewhere.
> >
> > I would appreciate any comments and suggestions. Many thanks.
> >
> > BH
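
A minimal sketch of the biased selection Matthias suggests, assuming the per-task latencies are already known (how to measure them is not covered in this thread). The class name WeightedTaskChooser and its methods are hypothetical and not part of Storm. Each task is weighted by the inverse of its latency, so a 25ms task should receive roughly four times as many tuples as a 100ms task.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/**
 * Hypothetical helper (not a Storm class): picks a task id at random,
 * with probability proportional to 1 / latency of that task.
 */
final class WeightedTaskChooser {
    private final List<Integer> taskIds;
    private final double[] cumulative; // running sum of the weights
    private final Random random;

    WeightedTaskChooser(List<Integer> taskIds, double[] latencyMs, Random random) {
        if (taskIds.isEmpty() || taskIds.size() != latencyMs.length) {
            throw new IllegalArgumentException("need one latency per task");
        }
        this.taskIds = new ArrayList<>(taskIds);
        this.cumulative = new double[latencyMs.length];
        this.random = random;
        double sum = 0.0;
        for (int i = 0; i < latencyMs.length; i++) {
            if (latencyMs[i] <= 0.0) {
                throw new IllegalArgumentException("latency must be positive");
            }
            sum += 1.0 / latencyMs[i]; // faster task => larger weight
            cumulative[i] = sum;
        }
    }

    /** Returns one task id, biased towards the low-latency tasks. */
    int choose() {
        double r = random.nextDouble() * cumulative[cumulative.length - 1];
        for (int i = 0; i < cumulative.length; i++) {
            if (r < cumulative[i]) {
                return taskIds.get(i);
            }
        }
        // Guard against floating-point rounding at the upper edge.
        return taskIds.get(taskIds.size() - 1);
    }
}
```

In Storm this logic would presumably sit inside the chooseTasks() method of a CustomStreamGrouping implementation, with the task ids taken from the targetTasks list passed to prepare(), and be wired in with customGrouping() in place of shuffleGrouping(). The latencies are fixed at construction here; a real grouping would likely need to refresh them as the HBase hot spots move.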
