Hey,
We are running a Kafka Streams based app in production where the input,
intermediate and global topics have 36 partitions.
We have 17 sub-topologies (2 of them are for global stores, so they won't
generate tasks).
More tech details:
6 machines with 16 CPUs and 30 threads each, so: 6 * 30 = 180 stream threads
(17 - 2) * 36 = 15 * 36 = 540 tasks
540 / 180 = 3 tasks per thread
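
To double-check the numbers above, here is a small Python sketch of the same arithmetic. The variable names are mine, and it assumes tasks are spread perfectly evenly across stream threads, which the real assignment may not guarantee.

```python
# Figures taken from the setup described above.
machines = 6
cpus_per_machine = 16
threads_per_machine = 30
partitions = 36
sub_topologies = 17
global_store_sub_topologies = 2  # these do not generate tasks

stream_threads = machines * threads_per_machine
task_sub_topologies = sub_topologies - global_store_sub_topologies
tasks = task_sub_topologies * partitions

# Assumption: a perfectly even spread of tasks over threads.
tasks_per_thread = tasks / stream_threads
threads_per_cpu = threads_per_machine / cpus_per_machine

print(stream_threads, tasks, tasks_per_thread, threads_per_cpu)
```

With these inputs it gives 180 stream threads, 540 tasks, 3 tasks per thread, and about 1.9 threads per CPU on each machine.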

Every once in a while, during our rush hours, some of the internal topics
start to lag on specific partitions. The lag usually keeps increasing
until I restart the application, after which it disappears very quickly.

It seems like there is some problem in the work allocation, since the
machines are not loaded at all and have enough threads (30 per machine,
almost double the 16 CPUs).

Any idea what's going on there?

-- 

Nitay Kufert
Backend Team Leader