kafka-streams batch restore state

2018-11-28 Thread meigong.wang
When I restart my kafka-streams application, sometimes I can see following log: [Consumer clientId=market-kline-stream-wmg-ticker-d891eeaf-2932-45b8-b7e2-9720904a69a5-StreamThread-3-restore-consumer, groupId=] Resetting offset for partition market-kline-stream-wmg-ticker-KSTREAM-REDUCE-STATE-S

High CPU on kafka nodes

2018-11-28 Thread Alexander Filipchik
Hello! I inherited Kafka cluster which runs on AWS (I3.4xl instances). Each node ingests around 8k messages per second. Instance stats: Network in is around: 13 MB/s Network out is around: 50 MB/s Kafka stats show slightly different picture: counter-bytes-in: 2.7 MB/s counter-bytes-out: 2.9 MB/s

Re: Kafka streams consumer/producer throttling

2018-11-28 Thread Guozhang Wang
Hi Andrey, You're right that Streams does not allow users to directly set client ids for the embedded producer / consumer in order to maintain uniqueness. If you take a look at `StreamsConfig#getProducerConfigs` for example you'll see at the end we always override the client id configs. Right now

Re: Kafka Streams + Serverless

2018-11-28 Thread Guozhang Wang
Hello Artur, One common practice is to customize the state store suppliers to use external storage engines so that the Streams application instances themselves are still "stateless" under serverless while talking to a remote store for stateful operations. Guozhang On Fri, Nov 23, 2018 at 3:12 P

Re: Please explain Rest API

2018-11-28 Thread Ryanne Dolan
I think you might be looking for cURL? On Fri, Nov 23, 2018 at 10:28 AM Satendra Pratap Singh wrote: > Hi Ryan, > > Thanks. Since I am new to Kafka don’t understand how to configures rest > api and how to reconfigure connectors. In general I don’t know how to run > GET /connector command and whe

Re: Data loss - Compacted topic behind reduce function

2018-11-28 Thread Nitay Kufert
It just happened again and I have noticed it happen exactly when a new instance went up (another one crushed). Is it possible that it relates to the "rebalancing" of the instance? it losses the aggregated data for several seconds and in those seconds it gets traffic? Should I develop some logic on