Re: Debugging Kafka Streams Windowing

2017-08-07 Thread Guozhang Wang
Thanks for sharing, Sahil. Just FYI, there is a KIP proposal that considers always turning on "log.cleaner.enable" here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-184%3A+Rename+LogCleaner+and+related+classes+to+LogCompactor Guozhang On Thu, Aug 3, 2017 at 5:58 AM, sahil aggarwal

Re: Debugging Kafka Streams Windowing

2017-08-03 Thread sahil aggarwal
Faced a similar issue in Kafka 0.10.0.1. Going through the Kafka code, I figured out that when the coordinator goes down, the other ISR scans the whole log file of the __consumer_offsets partition for my consumer group to update the offsets cache. In my case its size was around ~600G, which took around ~40 mins
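One way to see why that file grows so large: with "log.cleaner.enable" off, __consumer_offsets is never compacted, so old offset commits pile up. A minimal sketch for checking the setting, assuming Kafka's AdminClient (which shipped in 0.11.0, so it would not exist on the 0.10.0.1 cluster above) and a broker id of "0":

    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.Config;
    import org.apache.kafka.common.config.ConfigResource;

    public class CheckLogCleaner {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "broker-01:9092"); // hypothetical address
            try (AdminClient admin = AdminClient.create(props)) {
                ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
                Config config = admin.describeConfigs(Collections.singleton(broker))
                                     .all().get().get(broker);
                // If this reports false, __consumer_offsets is never compacted
                // and can grow to the ~600G size reported above.
                System.out.println(config.get("log.cleaner.enable"));
            }
        }
    }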

Re: Debugging Kafka Streams Windowing

2017-06-08 Thread Mahendra Kariya
Yes. To some extent. But the rebalancing is now taking a lot of time. There are situations where we have to manually restart the Streams app because rebalancing is kind of "stuck" for several minutes. On 7 June 2017 at 06:28, Garrett Barton wrote: > Mahendra, > > Did increasing those two proper

Re: Debugging Kafka Streams Windowing

2017-06-07 Thread Garrett Barton
Mahendra, Did increasing those two properties do the trick? I am running into this exact issue testing Streams out on a single Kafka instance. Yet I can manually start a consumer and read the topics fine while it's busy doing this dead stuff. On Tue, May 23, 2017 at 12:30 AM, Mahendra Kariya <

Re: Debugging Kafka Streams Windowing

2017-05-22 Thread Mahendra Kariya
On 22 May 2017 at 16:09, Guozhang Wang wrote: > For > that issue I'd suspect that there is a network issue, or maybe the network > is just saturated already and the heartbeat request / response were not > exchanged in time between the consumer and the broker, or the sockets being > dropped becaus
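For context, these are the client-side knobs behind the heartbeat exchange Guozhang mentions. A hedged sketch, not a recommendation; the actual values in play are not shown in the thread, and the numbers here are placeholders:

    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.streams.StreamsConfig;

    public class HeartbeatTuning {
        public static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "grp_id"); // group id from the logs
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-05:6667");
            // The broker marks a member dead if no heartbeat arrives within
            // session.timeout.ms; widening it buys slack on a saturated network.
            props.put(ConsumerConfig.SESSION_TIMEOUT_MS_CONFIG, 30000);
            // Heartbeats should fire several times per session timeout.
            props.put(ConsumerConfig.HEARTBEAT_INTERVAL_MS_CONFIG, 3000);
            return props;
        }
    }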

Re: Debugging Kafka Streams Windowing

2017-05-22 Thread Guozhang Wang
Hi Mahendra, Sorry for the late reply. Just to clarify, my previous reply was only for your question about: "There is also another issue where a particular broker is marked as dead for a group id and Streams process never recovers from this exception." And I thought your attached logs are ass

Re: Debugging Kafka Streams Windowing

2017-05-16 Thread Mahendra Kariya
I am confused. If what you have mentioned is the case, then - Why would restarting the stream processes resolve the issue? - Why do we get these infinite stream of exceptions only on some boxes in the cluster and not all? - We have tens of other consumers running just fine. We see this

Re: Debugging Kafka Streams Windowing

2017-05-16 Thread Guozhang Wang
Sorry, I misread your email and confused it with another thread. As for the issue you observed, it seems "broker-05:6667", which is the group coordinator for this Streams app with app id (i.e. group id) "grp_id", is in an unstable state. Since the streams app cannot commit offsets anymore due to

Re: Debugging Kafka Streams Windowing

2017-05-15 Thread Mahendra Kariya
Thanks for the reply Guozhang! But I think we are talking about 2 different issues here. KAFKA-5167 is for LockException. We face this issue intermittently, but not a lot. There is also another issue where a particular broker is marked as dead for a group id and the Streams process never recovers from th

Re: Debugging Kafka Streams Windowing

2017-05-15 Thread Guozhang Wang
I'm wondering if it is possibly due to KAFKA-5167? In that case, the "other thread" will keep retrying on grabbing the lock. Guozhang On Sat, May 13, 2017 at 7:30 PM, Mahendra Kariya wrote: > Hi, > > There is no missing data. But the INFO level logs are infinite and the > streams practically s

Re: Debugging Kafka Streams Windowing

2017-05-13 Thread Mahendra Kariya
Hi, There is no missing data. But the INFO-level logs are endless and the streams practically stop. For the messages that I posted, we got these INFO logs for around 20 mins. After which we got an alert about no data being produced in the sink topic, and we had to restart the streams processes.

Re: Debugging Kafka Streams Windowing

2017-05-13 Thread Matthias J. Sax
Hi, I just dug a little bit. The messages are logged at INFO level and thus should not be a problem if they go away by themselves after some time. Compare: https://groups.google.com/forum/#!topic/confluent-platform/A14dkPlDlv4 Do you still see missing data? -Matthias On 5/11/17 2:39 AM, Mahen

Re: Debugging Kafka Streams Windowing

2017-05-11 Thread Mahendra Kariya
Hi Matthias, We faced the issue again. The logs are below.

16:13:16.527 [StreamThread-7] INFO o.a.k.c.c.i.AbstractCoordinator - Marking the coordinator broker-05:6667 (id: 2147483642 rack: null) dead for group grp_id
16:13:16.543 [StreamThread-3] INFO o.a.k.c.c.i.AbstractCoordinator - Discovered

Re: Debugging Kafka Streams Windowing

2017-05-08 Thread Matthias J. Sax
Great! Glad 0.10.2.1 fixes it for you! -Matthias On 5/7/17 8:57 PM, Mahendra Kariya wrote: > Upgrading to 0.10.2.1 seems to have fixed the issue. > > Until now, we were looking at random 1 hour data to analyse the issue. Over > the weekend, we have written a simple test that will continuously ch

Re: Debugging Kafka Streams Windowing

2017-05-07 Thread Mahendra Kariya
Upgrading to 0.10.2.1 seems to have fixed the issue. Until now, we were looking at random one-hour slices of data to analyse the issue. Over the weekend, we have written a simple test that will continuously check for inconsistencies in real time and report if there is any issue. No issues have been reported

Re: Debugging Kafka Streams Windowing

2017-05-04 Thread Matthias J. Sax
About

> 07:44:08.493 [StreamThread-10] INFO o.a.k.c.c.i.AbstractCoordinator - Discovered coordinator broker-05:6667 for group group-2.

Please upgrade to Streams 0.10.2.1 -- we fixed a couple of bugs and I would assume this issue is fixed, too. If not, please report back.

> Another question that I

Re: Debugging Kafka Streams Windowing

2017-05-04 Thread Mahendra Kariya
Hi Matthias, Please find the answers below.

> I would recommend to double check the following:
> - can you confirm that the filter does not remove all data for those time periods?

Filter does not remove all data. There is a lot of data coming in even after the filter stage.

> - I would a

Re: Debugging Kafka Streams Windowing

2017-05-03 Thread Mahendra Kariya
Another question that I have is, is there a way for us to detect how many messages have come out of order? And if possible, what is the delay? On Thu, May 4, 2017 at 6:17 AM, Mahendra Kariya wrote: > Hi Matthias, > > Sure we will look into this. In the meantime, we have run into another > issue. We
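There is no built-in counter for this in the 0.10.x clients as far as the thread establishes, but a hedged sketch of one way to measure both: track the largest event timestamp seen so far and flag anything older. Plain Java, no Kafka dependency; all names here are made up:

    // Feed each record's event timestamp in arrival order.
    public class OutOfOrderTracker {
        private long maxTimestampSeen = Long.MIN_VALUE;
        private long outOfOrderCount = 0;
        private long maxDelayMs = 0;

        public void observe(long eventTimestampMs) {
            if (eventTimestampMs < maxTimestampSeen) {
                // Older than something already seen: out of order. The gap is
                // a lower bound on how late the record arrived.
                outOfOrderCount++;
                maxDelayMs = Math.max(maxDelayMs, maxTimestampSeen - eventTimestampMs);
            } else {
                maxTimestampSeen = eventTimestampMs;
            }
        }

        public long getOutOfOrderCount() { return outOfOrderCount; }
        public long getMaxDelayMs() { return maxDelayMs; }
    }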

Re: Debugging Kafka Streams Windowing

2017-05-03 Thread Mahendra Kariya
Hi Matthias, Sure, we will look into this. In the meantime, we have run into another issue. We have started getting this error rather frequently, and the Streams app is unable to recover from it.

07:44:08.493 [StreamThread-10] INFO o.a.k.c.c.i.AbstractCoordinator - Discovered coordinat

Re: Debugging Kafka Streams Windowing

2017-05-03 Thread Matthias J. Sax
I would recommend to double-check the following:
- can you confirm that the filter does not remove all data for those time periods?
- I would also check the input for your AggregatorFunction() -- does it receive everything?
- same for .mapValues()
This would help to understand in what part of the

Re: Debugging Kafka Streams Windowing

2017-05-02 Thread Mahendra Kariya
Hi Garrett, Thanks for these insights. But we are not consuming old data. We want the Streams app to run in near real time. And that is how it is actually running. The lag never increases beyond a certain limit. So I don't think that's an issue. The values of the configs that you are mentioning a

Re: Debugging Kafka Streams Windowing

2017-05-02 Thread Garrett Barton
Mahendra, One possible cause I have seen for this same behavior of missing windows of data is the configuration of the retention policies on the topics (internal and your own). I was loading data that was fairly old (weeks) and using event time semantics as the record timestamp (custom timesta
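The interaction Garrett describes: if a record's event time is older than the topic's retention (or the window's maintain period), its window may already be gone by the time the record arrives. A hedged sketch of the Streams-side knob, using the 0.10.1+ API; the numbers are placeholders, not advice:

    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.streams.kstream.TimeWindows;

    public class RetentionExample {
        public static void main(String[] args) {
            // Keep 1-minute windows around for 30 days so that replayed
            // historical records still find their window instead of being dropped.
            TimeWindows windows = TimeWindows.of(TimeUnit.MINUTES.toMillis(1))
                                             .until(TimeUnit.DAYS.toMillis(30));
            System.out.println("window maintain ms: " + windows.maintainMs());
        }
    }

The source topic's own retention.ms must also cover the age of the data being replayed.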

Re: Debugging Kafka Streams Windowing

2017-05-02 Thread Mahendra Kariya
Hi Matthias, What we did was read the data from the sink topic and print it to the console. And here's the raw data from that topic (the counts are randomized). As we can see, the data is certainly missing for some time windows. For instance, after 1493693760, the next timestamp for which the data is pres

Re: Debugging Kafka Streams Windowing

2017-04-29 Thread Mahendra Kariya
Thanks for the update Matthias! And sorry for the delayed response. The reason we use .aggregate() is that we want to count the number of unique values for a particular field in the message. So, we just add that particular field's value to the HashSet and then take the size of the HashSet. On
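A hedged sketch of just that aggregation step against the 0.10.2 kstream API; this is a fragment, not a complete app, and Message with its getField() stands in for the application's own type:

    import java.util.HashSet;
    import org.apache.kafka.streams.kstream.Aggregator;
    import org.apache.kafka.streams.kstream.Initializer;

    // Duplicates collapse in the HashSet, so its size is the unique count.
    Initializer<HashSet<String>> init = HashSet::new;
    Aggregator<String, Message, HashSet<String>> agg = (key, message, set) -> {
        set.add(message.getField());
        return set;
    };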

Re: Debugging Kafka Streams Windowing

2017-04-29 Thread Matthias J. Sax
Just a follow up (we identified a bug in the "skipped records" metric). The reported value is not correct. On 4/28/17 9:12 PM, Matthias J. Sax wrote: > Ok. That makes sense. > > Question: why do you use .aggregate() instead of .count() ? > > Also, can you share the code of you AggregatorFunctio

Re: Debugging Kafka Streams Windowing

2017-04-28 Thread Matthias J. Sax
Ok. That makes sense. Question: why do you use .aggregate() instead of .count() ? Also, can you share the code of your AggregatorFunction()? Did you change any default setting of StreamsConfig? I have still no idea what could go wrong. Maybe you can run with log level TRACE? Maybe we can get some

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Mahendra Kariya
Oh good point! The reason why there is only one row corresponding to each time window is because it only contains the latest value for the time window. So what we did was we just dumped the data present in the sink topic to a db using an upsert query. The primary key of the table was time window.
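For illustration, a hedged sketch of such an upsert via JDBC, assuming Postgres syntax; the table, columns, and surrounding variables are made up, and connection is assumed to be an open java.sql.Connection:

    // Latest value per window wins: insert the row, or overwrite the count
    // when that window_start (the primary key) already exists.
    String sql = "INSERT INTO window_counts (window_start, cnt) VALUES (?, ?) "
               + "ON CONFLICT (window_start) DO UPDATE SET cnt = EXCLUDED.cnt";
    try (java.sql.PreparedStatement ps = connection.prepareStatement(sql)) {
        ps.setLong(1, windowStart); // window start epoch, e.g. 1493693760
        ps.setLong(2, count);
        ps.executeUpdate();
    }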

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Matthias J. Sax
Thanks for the details (sorry that I forgot that you did share the output already). Might be a dumb question, but what is the count for missing windows in your second implementation? If there is no data for a window, it should not emit a window with count zero, but nothing. Thus, looking at you

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Mahendra Kariya
> Can you somehow verify your output?

Do you mean the Kafka Streams output? In the Kafka Streams output, we do see some missing values. I have attached the Kafka Streams output (for a few hours) in the very first email of this thread for reference. Let me also summarise what we have done so far.

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Matthias J. Sax
Thanks for reporting back! As Eno mentioned, we do have a JIRA for the same report as yours: https://issues.apache.org/jira/browse/KAFKA-5055 We are investigating... Can you somehow verify your output? Do you see missing values (not sure how easy it is to verify -- we are not sure yet if the repo

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Mahendra Kariya
Hi Matthias, We changed our timestamp extractor code to this.

public long extract(ConsumerRecord record, long previousTimestamp) {
    Message message = (Message) record.value();
    long timeInMillis = Timestamps.toMillis(message.getEventTimestamp());
    if (timeInMillis < 0) {
        LOGGER.
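The excerpt cuts off inside the negative-timestamp branch, so what follows is a hedged reconstruction of the overall shape, not the author's actual code; the fallback to the previous timestamp is one plausible choice, and Message/Timestamps are the app's own types as in the excerpt:

    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.streams.processor.TimestampExtractor;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class MessageTimestampExtractor implements TimestampExtractor {
        private static final Logger LOGGER =
            LoggerFactory.getLogger(MessageTimestampExtractor.class);

        @Override
        public long extract(ConsumerRecord<Object, Object> record, long previousTimestamp) {
            Message message = (Message) record.value(); // app-specific type, as above
            long timeInMillis = Timestamps.toMillis(message.getEventTimestamp());
            if (timeInMillis < 0) {
                // Streams drops records whose extractor returns a negative
                // timestamp (they show up in the skipped-records metric), so
                // log and fall back instead of returning -1.
                LOGGER.warn("Negative event timestamp for record {}", record);
                return Math.max(previousTimestamp, 0L);
            }
            return timeInMillis;
        }
    }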

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Matthias J. Sax
Streams skips records with timestamp -1. The metric you mentioned reports the number of skipped records. Are you sure that `getEventTimestamp()` never returns -1? -Matthias On 4/27/17 10:33 AM, Mahendra Kariya wrote: > Hey Eno, > > We are using a custom TimestampExtractor class. All messages

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Mahendra Kariya
Hey Eno, We are using a custom TimestampExtractor class. All messages that we have in Kafka have a timestamp field. That is what we are using. The code looks like this.

public long extract(ConsumerRecord record, long previousTimestamp) {
    Message message = (Message) record.value();
    return T

Re: Debugging Kafka Streams Windowing

2017-04-27 Thread Eno Thereska
Hi Mahendra, We are currently looking at the skipped-records-rate metric as part of https://issues.apache.org/jira/browse/KAFKA-5055. Could you let us know if you use any special TimestampExtractor class, or if it is the default? Thanks Eno

Debugging Kafka Streams Windowing

2017-04-27 Thread Mahendra Kariya
Hey All, We have a Kafka Streams application which ingests from a topic to which more than 15K messages are generated per second. The app filters out a few of them, counts the number of unique filtered messages (based on one particular field) within a 1-min time window, and dumps it back to Kafka. Th
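For readers skimming the thread, a hedged sketch of roughly this topology against the 0.10.x DSL. The topic names, the Message type with isRelevant()/getField(), and the serdes hashSetSerde and windowedSerde are all placeholders, not the author's code:

    import java.util.HashSet;
    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.TimeWindows;

    // filter -> unique count of one field per 1-minute window -> sink topic
    KStreamBuilder builder = new KStreamBuilder();
    KStream<String, Message> stream = builder.stream("input-topic");

    stream.filter((key, msg) -> msg.isRelevant())
          .groupByKey()
          .aggregate(HashSet::new,
                     (key, msg, set) -> { set.add(msg.getField()); return set; },
                     TimeWindows.of(60_000L),           // 1-minute windows
                     hashSetSerde,                      // Serde<HashSet<String>>
                     "unique-counts-store")
          .mapValues(HashSet::size)                     // the unique count
          .to(windowedSerde, Serdes.Integer(), "sink-topic");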