If you haven't looked at the offset ranges in the logs for the time period
in question, I'd start there.
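
A minimal sketch of the comparison suggested here, assuming the per-batch offset ranges have already been copied out of the driver logs. The `OffsetRange` tuple, the helper names, and the sample numbers are all hypothetical and chosen only for illustration; in a real job the ranges would come from the direct stream itself (its RDDs expose them via `HasOffsetRanges`).

```python
from collections import defaultdict, namedtuple

# Hypothetical stand-in for one logged offset range (not the Spark class itself).
OffsetRange = namedtuple(
    "OffsetRange", ["topic", "partition", "from_offset", "until_offset"]
)


def messages_per_topic(ranges):
    """Sum (until_offset - from_offset) per topic for one batch."""
    counts = defaultdict(int)
    for r in ranges:
        counts[r.topic] += r.until_offset - r.from_offset
    return dict(counts)


def input_rate(ranges, batch_seconds):
    """Messages per second implied by one batch's offset ranges."""
    return sum(messages_per_topic(ranges).values()) / batch_seconds


# Made-up ranges for a single 10-second batch.
batch = [
    OffsetRange("topic_a", 0, 1000, 4000),
    OffsetRange("topic_a", 1, 2000, 4500),
    OffsetRange("topic_b", 0, 500, 2000),
]

print(messages_per_topic(batch))
print(input_rate(batch, 10))
```

If one topic's count drops to zero or stays flat across batches while the producer rate is unchanged, that points at specific topics or partitions rather than at the job as a whole.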
On Jan 24, 2017 2:51 PM, "Hakan İlter" wrote:
Sorry for the misunderstanding. When I said that, I meant there is no lag in
the consumers. Kafka Manager shows each consumer's coverage and lag status.
On Tue, Jan 24, 2017 at 10:45 PM, Cody Koeninger wrote:
When you said "I check the offset ranges from Kafka Manager and don't
see any significant deltas.", what were you comparing it against? The
offset ranges printed in spark logs?
On Tue, Jan 24, 2017 at 2:11 PM, Hakan İlter wrote:
First of all, I can see both the "Input Rate" on the Spark job's statistics
page and the Kafka producer messages/sec in Kafka Manager. The numbers
differ when I have the problem. Normally these numbers are very close.
Besides, the job is an ETL job; it writes the results to Elasticsearch. An
ano
I'm confused, if you don't see any difference between the offsets the
job is processing and the offsets available in kafka, then how do you
know it's processing less than all of the data?
On Tue, Jan 24, 2017 at 12:35 AM, Hakan İlter wrote:
I'm using DirectStream as one stream for all topics. I check the offset
ranges from Kafka Manager and don't see any significant deltas.
On Tue, Jan 24, 2017 at 4:42 AM, Cody Koeninger wrote:
Are you using receiver-based or direct stream?
Are you doing 1 stream per topic, or 1 stream for all topics?
If you're using the direct stream, the actual topics and offset ranges
should be visible in the logs, so you should be able to see more
detail about what's happening (e.g. all topics are s
Hi everyone,
I have a Spark (1.6.0-cdh5.7.1) streaming job which receives data from
multiple Kafka topics. After starting the job, everything works fine at first
(around 700 req/sec), but after a while (a couple of days or a week) it starts
processing only part of the data (around 350 req/sec). When I