Do you have any calls to external data sources that might be increasing latency and causing tuple timeouts?
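
As a rough sanity check on why latency matters here, Little's law relates the number of pending tuples to how long each one waits for its ack. The sketch below is only a back-of-the-envelope estimate, not a measurement of your topology. It assumes topology.max.spout.pending applies per spout task, that topology.message.timeout.secs is at its default of 30 seconds, and that the spout keeps the pending window full; the acked-per-second rates are hypothetical values chosen for illustration.

```python
def avg_pending_seconds(max_spout_pending, acked_per_second):
    """Little's law: average time in flight = tuples in flight / ack throughput."""
    if acked_per_second <= 0:
        return float("inf")  # nothing is being acked, so tuples wait forever
    return max_spout_pending / acked_per_second


def min_safe_throughput(max_spout_pending, message_timeout_secs):
    """Ack rate below which the average in-flight time exceeds the timeout."""
    return max_spout_pending / message_timeout_secs


MAX_SPOUT_PENDING = 20000    # value reported in this thread
MESSAGE_TIMEOUT_SECS = 30    # assumed: Storm's default message timeout

if __name__ == "__main__":
    # Hypothetical per-spout-task ack rates (tuples per second).
    for rate in (5000, 1000, 500):
        wait = avg_pending_seconds(MAX_SPOUT_PENDING, rate)
        status = "ok" if wait < MESSAGE_TIMEOUT_SECS else "timeouts likely"
        print(f"{rate} acks/s: ~{wait:.1f}s in flight ({status})")
    floor = min_safe_throughput(MAX_SPOUT_PENDING, MESSAGE_TIMEOUT_SECS)
    print(f"ack rate needed to stay under the timeout: ~{floor:.0f} tuples/s")
```

If a slow external call pushes the per-task ack rate below that floor, a full pending window alone could be enough to trigger timeouts, so lowering max spout pending or raising the message timeout may be worth testing.
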
On Sun, Nov 1, 2015, at 04:49 AM, Renjie Liu wrote:
> Yes, I've set it to 20000
>
> On Sun, Nov 1, 2015 at 6:40 PM, Santosh Pingale <pingalesant...@gmail.com> wrote:
>> Have you set 'topology.max.spout.pending'?
>>
>> On Sun, Nov 1, 2015 at 2:26 PM, Renjie Liu <liurenjie2...@gmail.com> wrote:
>>> Hi, Storm community:
>>>
>>> We have a Storm cluster deployed with 15 workers, and recently we
>>> have often experienced failures due to ack timeouts. Our input source
>>> is Kafka and we use Ganglia to monitor the cluster. Lately the
>>> failures occur every 12 hours, and these are my observations from
>>> some monitoring tools when the problem happens:
>>>
>>> 1. The topology page shows that no worker went down, since the
>>>    uptime of each task is nearly equal to the topology uptime.
>>> 2. I've checked Ganglia; the CPU and memory reports do not give
>>>    any clue about the problem. But the network report shows
>>>    something unusual: the inbound speed decreases a little while
>>>    the outbound speed drops to nearly zero on some workers.
>>> 3. I've logged in to one of the machines mentioned above and found
>>>    that one of the survivor areas always remains 100% full.
>>> 4. dstat shows that csw jumps to 4k+ every few seconds, while it
>>>    stays around 400 under normal conditions.
>>>
>>> Can anyone give us some hint about this problem?
>>
>
> --
> Renjie Liu
> Department of Computer Science & Engineering
> Shanghai JiaoTong University