Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-17 Thread Matthias J. Sax
You cannot add a `Processor`. You can only use `aggregate() / reduce() / count()` (which of course will add a pre-defined processor). `groupByKey()` is really just a "meta operation" that checks if the key was changes upstream, and to insert a repartition/shuffle step if necessary. Thus, if y

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-13 Thread Igor Maznitsa
Thanks a lot for explanation but could you provide a bit more details about KGroupedStream? It is just interface and not extends KStream so how I can add processor in the case below? /   KStream someStream = / /  someStream / /     .groupByKey() / */how to add processor for resulted grouped

Re: [Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Matthias J. Sax
`KGroupedStream` is just an "intermediate representation" to get a better flow in the DSL. It's not a "top level" abstraction like KStream/KTable. For `KTable` there is `transformValue()` -- there is no `transform()` because keying must be preserved -- if you want to change the keying you ne

[Kafka Streams], Processor API for KTable and KGroupedStream

2024-01-12 Thread Igor Maznitsa
Hello Is there any way in Kafka Streams API to define processors for KTable and KGroupedStream like KStream#transform? How to provide a custom processor for KTable or KGroupedStream which could for instance provide way to not downstream selected events? -- Igor Maznitsa email: rrg4...@gmail

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-30 Thread Guozhang Wang
Desai​ | Senior Software Developer | *ude...@itrsgroup.com* > > *www.itrsgroup.com* <https://www.itrsgroup.com/> > <https://www.itrsgroup.com/> > > *From: *Guozhang Wang > *Date: *Tuesday, March 30, 2021 at 2:10 PM > *To: *Users > *Cc: *Bart Lilje > *Subject: *Re: Ka

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-30 Thread Upesh Desai
From: *Guozhang Wang > *Date: *Tuesday, March 30, 2021 at 1:00 PM > *To: *Users > *Cc: *Bart Lilje > *Subject: *Re: Kafka Streams Processor API state stores not restored via > changelog topics > > Hello Upesh, > > These are super helpful logs, and I think I'm very

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-30 Thread Guozhang Wang
; > > Best, > > Upesh > > > Upesh Desai​ | Senior Software Developer | *ude...@itrsgroup.com* > > *www.itrsgroup.com* <https://www.itrsgroup.com/> > <https://www.itrsgroup.com/> > > *From: *Guozhang Wang > *Date: *Tuesday, March 30, 2021 at 1:0

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-30 Thread Upesh Desai
Desai | Senior Software Developer | ude...@itrsgroup.com www.itrsgroup.com From: Guozhang Wang Date: Tuesday, March 30, 2021 at 1:00 PM To: Users Cc: Bart Lilje Subject: Re: Kafka Streams Processor API state stores not restored via changelog topics Hello Upesh, These are super helpful logs

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-30 Thread Guozhang Wang
or give us more > insight into where this issue could originate from? > > > > Thanks, > Upesh > > > Upesh Desai​ | Senior Software Developer | *ude...@itrsgroup.com* > > *www.itrsgroup.com* <https://www.itrsgroup.com/> > <https://www.itrsgroup.com/> >

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-29 Thread Upesh Desai
Date: Thursday, March 25, 2021 at 6:46 PM To: users@kafka.apache.org Cc: Bart Lilje Subject: Re: Kafka Streams Processor API state stores not restored via changelog topics We have not tried running 2.6 brokers and 2.7 client, I will try and get back to you. We are not enabling EOS on the streams

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-25 Thread Upesh Desai
instantly. Best, Upesh Upesh Desai | Senior Software Developer | ude...@itrsgroup.com www.itrsgroup.com From: Guozhang Wang Date: Thursday, March 25, 2021 at 6:31 PM To: Users Subject: Re: Kafka Streams Processor API state stores not restored via changelog topics BTW, yes that indicates the

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-25 Thread Guozhang Wang
ppen when we run everything on Kafka version 2.6. >> >> >> >> Thanks, >> >> Upesh >> >> >> Upesh Desai​ | Senior Software Developer | *ude...@itrsgroup.com* >> >> *www.itrsgroup.com* <https://www.itrsgroup.com/> >> <https:

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-25 Thread Guozhang Wang
ww.itrsgroup.com/> > <https://www.itrsgroup.com/> > > *From: *Guozhang Wang > *Date: *Thursday, March 25, 2021 at 4:01 PM > *To: *Users > *Cc: *Bart Lilje > *Subject: *Re: Kafka Streams Processor API state stores not restored via > changelog topics > >

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-25 Thread Upesh Desai
Subject: Re: Kafka Streams Processor API state stores not restored via changelog topics Hello Upesh, Could you confirm a few more things for me: 1. After you stopped the application, and wiped out the state dir; check if the corresponding changelog topic has one record indeed at offset 0 --- this

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-25 Thread Guozhang Wang
gt; Upesh Desai​ | Senior Software Developer | *ude...@itrsgroup.com* > > *www.itrsgroup.com* <https://www.itrsgroup.com/> > <https://www.itrsgroup.com/> > > *From: *Guozhang Wang > *Date: *Wednesday, March 24, 2021 at 1:37 PM > *To: *Users > *Cc: *Bart Lilje > *S

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-24 Thread Upesh Desai
...@itrsgroup.com www.itrsgroup.com From: Guozhang Wang Date: Wednesday, March 24, 2021 at 1:37 PM To: Users Cc: Bart Lilje Subject: Re: Kafka Streams Processor API state stores not restored via changelog topics Hello Upesh, Thanks for the detailed report. I looked through the code and tried to

Re: Kafka Streams Processor API state stores not restored via changelog topics

2021-03-24 Thread Guozhang Wang
rote: > Hi all, > > > > Our team think we discovered a bug over the weekend withing the Kafka > Streams / Processor API. We are running 2.7.0. > > > > When configuring a state store backed by a changelog topic with the > cleanup policy configuration set to “compact,del

Kafka Streams Processor API state stores not restored via changelog topics

2021-03-23 Thread Upesh Desai
Hi all, Our team think we discovered a bug over the weekend withing the Kafka Streams / Processor API. We are running 2.7.0. When configuring a state store backed by a changelog topic with the cleanup policy configuration set to “compact,delete”: final StoreBuilder> store = Sto

Re: Kafka Streams & Processor API

2019-08-05 Thread Boyang Chen
Hey Navneeth, thank you for your interest in trying out Kafka Streams! Normally we will redirect new folks to the stream FAQ first in case you haven't checked it out. For details to your question: 1. Joining 2 topics using processor API (we cal

Kafka Streams & Processor API

2019-08-03 Thread Navneeth Krishnan
Hi All, I'm new to kafka streams and I'm trying to model my use case that is currently written on flink to kafka streams. I have couple of questions. - How can I join two kafka topics using the processor API? I have data stream and change stream which needs to be joined based on a key. - I read

Re: Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-17 Thread Guozhang Wang
Thanks Sam! Please feel free to assign the ticket to yourself and I will review your PR if you created one: https://cwiki.apache.org/confluence/display/KAFKA/Contributing+Code+Changes#ContributingCodeChanges-PullRequest On Tue, Jul 17, 2018 at 6:29 PM, Sam Lendle wrote: > https://issues.apache.

Re: Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-17 Thread Sam Lendle
https://issues.apache.org/jira/browse/KAFKA-7176 If I have a change I will give trunk a try. On 7/16/18, 2:14 PM, "Guozhang Wang" wrote: Hmm.. this seems new to me. Checked on the source code it seems right to me. Could you try out the latest trunk (build from source code) and see

Re: Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-16 Thread Guozhang Wang
Hmm.. this seems new to me. Checked on the source code it seems right to me. Could you try out the latest trunk (build from source code) and see if it is the same issue for you? > In addition to that, though, I also see state store metrics for tasks that have been migrated to another instance, an

Re: Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-12 Thread Sam Lendle
Ah great, thanks Gouzhang. I also noticed a similar issue with state store metrics, where rate metrics for each thread/task appear to be the total rate across all threads/tasks on that instance. In addition to that, though, I also see state store metrics for tasks that have been migrated to an

Re: Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-11 Thread Guozhang Wang
Hello Sam, It is a known issue that should have been fixed in 2.0, the correlated fix has also been cherry-picked to the 1.1.1 bug fix release as well: https://github.com/apache/kafka/pull/5277 Guozhang On Wed, Jul 11, 2018 at 11:42 AM, Sam Lendle wrote: > Hello! > > Using kafka-streams 1.1.

Kafka Streams processor node metrics process rate with multiple stream threads

2018-07-11 Thread Sam Lendle
Hello! Using kafka-streams 1.1.0, I noticed when I sum the process rate metric for a given processor node, the rate is many times higher than the number of incoming messages. Digging further, it looks like the rate metric associated with each thread in a given application instance is always the

Re: Kafka streams Processor life cycle behavior of close()

2016-10-08 Thread Srikanth
Tnx! Looks like fix is already in for 0.10.1.0 On Tue, Oct 4, 2016 at 6:18 PM, Guozhang Wang wrote: > Created https://issues.apache.org/jira/browse/KAFKA-4253 for this issue. > > > Guozhang > > On Tue, Oct 4, 2016 at 3:08 PM, Guozhang Wang wrote: > > > Hello Srikanth, > > > > We close the under

Re: Kafka streams Processor life cycle behavior of close()

2016-10-04 Thread Guozhang Wang
Created https://issues.apache.org/jira/browse/KAFKA-4253 for this issue. Guozhang On Tue, Oct 4, 2016 at 3:08 PM, Guozhang Wang wrote: > Hello Srikanth, > > We close the underlying clients before closing the state manager (hence > the states) because for example we need to make sure producer's

Re: Kafka streams Processor life cycle behavior of close()

2016-10-04 Thread Guozhang Wang
Hello Srikanth, We close the underlying clients before closing the state manager (hence the states) because for example we need to make sure producer's sent records have all been acked before the state manager records the changelog sent offsets as end offsets. This is kind of chicken-and-egg probl

Kafka streams Processor life cycle behavior of close()

2016-10-01 Thread Srikanth
Hello, I'm testing out a WriteToSinkProcessor() that batches records before writing it to a sink. The actual commit to sink happens in punctuate(). I also wanted to commit in close(). Idea here is, during a regular shutdown, we'll commit all records and ideally stop with an empty state. My commit(

Re: Kafka Streams / Processor

2016-05-30 Thread Matthias J. Sax
2) I am not sure if I understand correctly * punctuation is independent from committing (ie, you cannot use it to flush) * if you need to align writes with commits you can either use a KStream/KTable or need to register a state (see StateStore.java) 5) The application goes down -- neither pr

Re: Kafka Streams / Processor

2016-05-26 Thread Tobias Adamson
Thank you. Some more follow-up questions 1) great, will do some tests 2) if auto commit is used how do we prevent a commit happening when an error happens in processing. Basically our scenario is that we build up aggregation contexts for specific keys (these are a bit special so most probably

Re: Kafka Streams / Processor

2016-05-26 Thread Matthias J. Sax
Hi Toby, 1) I am not an expert for RocksDB, but I don't see a problem with larger objects. 2) I assume, by "guaranteed" you mean that the commit is performed when the call return. In this case, no. It only sets a flag to commit at the next earliest point in time possible. Ie, you can trigger comm

Kafka Streams / Processor

2016-05-26 Thread Tobias Adamson
Hi We have a scenario where we could benefit from the new API’s instead of our in house ones. However we have a couple of questions 1. Is it feasible to save 2-3MB size values in the RocksDBStorage? 2. When is the offset committed as processed when using a custom Processor, is it when you call c