Re: Kafka Streams can't run normally after restart/redeployment

2019-09-30 Thread Alex Brekken
You could try increasing retries and see if that helps as well as adjusting the producer batch size to a lower value. (I think the retries default is Integer.MAX when you're on kafka streams version 2.1 or higher so you can definitely increase it beyond 5). Additionally you could look at the " de

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-26 Thread Xiyuan Hu
Thanks Alex! Some updates: I tried to restart service with staging pool, which has far less traffic as production environment. And after restart, the application works fine without issues. I assume I can't restart the service in production, is caused by the huge lag in production? The lag is mostl

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-26 Thread Alex Brekken
1. Yeah I'm not sure why restarting is causing you problems. You shouldn't be changing your application ID just to get data flowing ,so something is wrong there I'm just not sure what. 2. Lag on the source topic? I guess that depends on how long your application is down and how quickly it can

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-26 Thread Xiyuan Hu
Thanks for the reply! I have two questions: 1) No output issue only happens when I restart/redeployment the application with the same application Id. But when I run the application first time, it works fine. Thus, I assume suppress() is working fine, at least fine for the first run. The thing I ca

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-26 Thread Alex Brekken
So I'm not exactly sure why supress() isn't working for you, because it should send out a final message once the window closes - assuming you're still getting new messages flowing through the topology. Have you tried using the count function in KGroupedTable? It should handle duplicates correctly.

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Xiyuan Hu
Hi, The first selectKey/groupBy/windowedBy/reduce is to group messages by key and drop duplicated messages based on the new key, so that for each 1hr time window, each key will only populate 1 message. I use suppress() is to make sure only the latest message per time window will be sent. The seco

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Alex Brekken
You might want to try temporarily commenting the suppress() call just to see if that's the cause of the issue. That said, what is the goal of this topology? It looks like you're trying to produce a count at the end for a key. Is the windowedBy() and suppress() there just to eliminate duplicates, o

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Xiyuan Hu
Hi Alex, Thanks for the reply! Yes. After deploy with same application ID, source topic has new messages and the application is consuming them but no output at the end. suppress call is: .suppress(untilWindowCloses(Suppressed.BufferConfig.unbounded())) Topology is like below: final KStream sourc

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Alex Brekken
Hi Xiyuan, just to clarify: after you restart the application (using the same application ID as previously) there ARE new messages in the source topic and your application IS consuming them, but you're not seeing any output at the end? How are you configuring your suppress() call? Is it possible

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Xiyuan Hu
Hi, If I change application id, it will start process new messages I assume? The old data will be dropped. But this solution will not work during production deployment, since we can't change application id for each release. My code looks like below: builder.stream(topicName) .mapValues() stream.

Re: Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Boyang Chen
Hey Xiyuan, I would assume it's easier for us to help you by reading your application with a full paste of code (a prototype). Changing application id would work suggests that re-process all the data again shall work, do I understand that correctly? Boyang On Wed, Sep 25, 2019 at 8:16 AM Xiyuan

Kafka Streams can't run normally after restart/redeployment

2019-09-25 Thread Xiyuan Hu
Hi, I'm running a Kafka streams app(v2.1.0) with windowed function(reduce and suppress). One thing I noticed is, every time when I redeployment or restart the application, I have to change the application ID to a new one, otherwise, only the reduce-repartition internal topic has input traffic(and