Re: Urgent : Help with latency / backlog / topic lag

Prateek Maheshwari Fri, 08 Jun 2018 15:00:36 -0700

Just to clarify, when you say you tried single threaded mode, do you
mean that you set job.container.thread.pool.size = 1, or that you set
job.container.single.thread.mode = true?
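
For reference, these are two separate keys in the job's properties file. The fragment below is only a sketch using the key names mentioned in this thread; the values are illustrative, and the comments reflect the configuration reference quoted further down rather than any tested setup.

```
# Option A: leave the multithreaded run loop enabled (the default),
# but give the container a thread pool of size one.
job.container.thread.pool.size=1

# Option B: fall back to the legacy single-threaded event loop.
job.container.single.thread.mode=true
```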


On Fri, Jun 8, 2018 at 2:53 PM, Thunder Stumpges <tstump...@ntent.com> wrote:
> Thanks for the quick reply. That sounds very much like what I'm seeing. I'm 
> merging in 0.14.1 to our branch now. I did try single threaded mode and 
> unfortunately that didn't seem to make a significant difference. Perhaps I do 
> need some multithreading? I'm seeing a task latency of 0.2 ms per message but 
> still only achieving ~700/sec.
>
>
> -----Original Message-----
> From: Prateek Maheshwari [mailto:prateek...@gmail.com]
> Sent: Friday, June 8, 2018 13:54
> To: dev@samza.apache.org
> Subject: Re: Urgent : Help with latency / backlog / topic lag
>
> Hi Thunder,
>
>> What we believe may be happening is that most of the topics have no
>> backlog, but one topic has all the backlog (this is because one of the topics
>> accounts for ~60% of the total message rate).  Could there be something
>> inducing extra latency on processing the one topic with a backlog just having
>> a bunch of other topics with NO backlog?
>
> This seems very similar to this issue:
> https://issues.apache.org/jira/browse/SAMZA-1599
> This was fixed in https://github.com/apache/samza/pull/436, and the fix 
> should be available in the 0.14.1 version.
> Would it be possible to try upgrading to 0.14.1? It should be backwards 
> compatible with 0.14.0.
>
> For something you can try without upgrading: try setting 
> "job.container.single.thread.mode" to true. From the configuration reference
> <https://samza.apache.org/learn/documentation/latest/jobs/configuration-table.html>:
> "If set to true, samza will fallback to legacy single-threaded event loop.
> Default is false, which enables the multithreading execution."
>
> Let us know if this doesn't help.
>
> Thanks,
> Prateek
>
> On Fri, Jun 8, 2018 at 1:35 PM, Thunder Stumpges <tstump...@ntent.com>
> wrote:
>
>> We have a new samza job which we just put into production. This job
>> processes many topics (~30) but the total rate is not that high
>> (~1200/sec in aggregate). I am unable to get above ~700/sec and have a 
>> growing backlog.
>>
>> We are running samza 0.12 (I have an update to 0.14 that is not tested
>> or pushed yet).  When we load tested with a single topic, we could
>> easily do several thousand per second. The latency of a single message
>> is about 0.5ms as recorded by our timer metric on our 'process' call.
>>
>> What we believe may be happening is that most of the topics have no
>> backlog, but one topic has all the backlog (this is because one of the
>> topics accounts for ~60% of the total message rate).  Could there be
>> something inducing extra latency on processing the one topic with a
>> backlog just having a bunch of other topics with NO backlog?
>>
>> Some things I have tried:
>>
>>
>>   1.  Increasing thread pool (10->20->30), no change
>>   2.  Going from 1 container to 2, no help (the two containers run at
>> half the speed and total is the same)
>>   3.  Increasing task.max.concurrency from 1 -> 2 -> 3  (this had some
>> minor help going from 1 to 2, but not enough)
>>   4.  Increasing fetch.threshold.bytes (currently at 100,000 and we
>> have pretty small messages)
>>
>> Some observed metrics:
>>
>>
>>   *   "Pending Messages" are > 0  (15+ on some partitions)
>>   *   "Messages in flight" is almost always 0
>>   *   Polls rate is ~50/sec
>>   *   Message chooser "Choose Obj" is ~680-700/sec like our processing rate
>>   *   Message chooser "choose null" is ~50/sec
>>
>> I'm somewhat at a loss because based on the actual processing latency
>> we should easily be able to do 2000+ with just a small handful of threads.
>>
>> Thanks in advance; this is in production and I really need a solution.
>> Thunder
>>
>>

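The throughput expectation in the original message (about 0.5 ms per 'process' call, so 2000+ messages/sec) can be checked with a back-of-envelope calculation. The sketch below is illustrative only: the function names are hypothetical, the figures are the ones reported in the thread, and it assumes a thread does nothing but run 'process', which ignores polling, message choosing, and any other run-loop overhead.

```python
OBSERVED_RATE = 700.0  # ~700 messages/sec, as reported in the thread


def max_throughput(latency_ms, threads=1):
    """Upper bound on messages/sec if each thread only runs process()."""
    return threads * 1000.0 / latency_ms


def busy_fraction(latency_ms, observed_rate=OBSERVED_RATE, threads=1):
    """Share of available thread time actually spent inside process()."""
    return observed_rate / max_throughput(latency_ms, threads)


if __name__ == "__main__":
    # 0.5 ms is the latency from the original report, 0.2 ms from the follow-up.
    for latency_ms in (0.5, 0.2):
        ceiling = max_throughput(latency_ms)
        busy = busy_fraction(latency_ms)
        print(f"{latency_ms} ms/msg: ceiling {ceiling:.0f}/sec, "
              f"busy {busy:.0%} at {OBSERVED_RATE:.0f}/sec")
```

Under these assumptions a single thread at ~700/sec spends well under half of its time inside 'process' at either reported latency, which is consistent with the suspicion in the thread that the limit lies outside the processing call itself.
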