Re: KafkaStreams pause specific topic partition consumption

2017-05-08 Thread Timur Yusupov
Got it, thanks, Matthias! On Tue, May 9, 2017 at 2:07 AM, Matthias J. Sax wrote: > Yes. That is something you would need to do external too. > > There is a KIP for a tool > (https://cwiki.apache.org/confluence/display/KAFKA/KIP- > 122%3A+Add+Reset+Consumer+Group+Offsets+tooling) > -- but you can

Re: KafkaStreams pause specific topic partition consumption

2017-05-08 Thread Matthias J. Sax
Yes. That is something you would need to do external too. There is a KIP for a tool (https://cwiki.apache.org/confluence/display/KAFKA/KIP-122%3A+Add+Reset+Consumer+Group+Offsets+tooling) -- but you can also do this using a single `KafkaConsumer` with `group.id == application.id` that gets all par

Re: KafkaStreams pause specific topic partition consumption

2017-05-08 Thread Timur Yusupov
That means in order to process filtered out records in a next batch, we have to seek KafkaStreams back, right? On Tue, May 9, 2017 at 1:19 AM, Matthias J. Sax wrote: > I see. > > I you do the step of storing the end offsets in your database before > starting up Streams this would work. > > What

Re: KafkaStreams pause specific topic partition consumption

2017-05-08 Thread Matthias J. Sax
I see. I you do the step of storing the end offsets in your database before starting up Streams this would work. What you could do as a work around (even if it might not be a nice solution), is to apply a `transform()` as your first operator. Within `transfrom()` you get access to there current r

Re: KafkaStreams pause specific topic partition consumption

2017-05-08 Thread Timur Yusupov
Matthias, Thanks for your answers. >> So we are considering to just pause specific >> topic partitions as soon as we arrive to stop offsets for them. >I am just wondering how you would do this in a fault-tolerant way (if you would have pause API)? On start of batch cycle we have to store somewher

Re: KafkaStreams pause specific topic partition consumption

2017-04-27 Thread Matthias J. Sax
Timur, there is not API to pause/resume partitions in Streams, because Streams handles/manages its internal consumer by itself. The "batch processing KIP" is currently delayed -- but I am sure we will pick it up again. Hopefully after 0.11 got released. > So we are considering to just pause spec