Usage of deprecated method in kafka_2.12-1.0.0

2018-02-21 Thread pravin kumar
I tried the wikifeed example with kafka_2.12-1.0.0. The count method is now deprecated; previously, in kafka_2.11-0.10.2.1, I passed count(localStateStoreName). How do I specify the state store name in kafka_2.12-1.0.0? I have attached the code below: package kafka.examples.wikifeed; import org.ap

RE: difference between key.serializer & default.key.serde

2018-02-21 Thread adrien ruffie
Hello Matthias, many thanks for your response. I knew the difference between deserializer --> consumer and serializer --> producer, but I didn't know the differences between the low-level and high-level APIs 😊 I'm taking notes. Thanks again & best regards. Adrien From: Matthias J.

Doubts about multiple instance in kafka

2018-02-21 Thread pravin kumar
I have the Confluent Kafka documentation, but I can't understand the following line: "It is important to understand that Kafka Streams is not a resource manager, but a library that “runs” anywhere its stream processing application runs. Multiple instances of the application are executed either on the s

[VOTE] 1.0.1 RC2

2018-02-21 Thread Ewen Cheslack-Postava
Hello Kafka users, developers and client-developers, This is the third candidate for release of Apache Kafka 1.0.1. This is a bugfix release for the 1.0 branch that was first released with 1.0.0 about 3 months ago. We've fixed 49 issues since that release. Most of these are non-critical, but in a

Re: difference between key.serializer & default.key.serde

2018-02-21 Thread Matthias J. Sax
These are different abstractions used in different APIs. Consumer API: Only reads data (with a single type) and thus uses a deserializer and config `key.deserializer`. Producer API: Only writes data (with a single type) and thus uses a serializer and config `key.serializer`. Streams API: Reads a

Re: Error handling

2018-02-21 Thread Guozhang Wang
Hello Maria, You are welcome to read the FAQ section in the AK web docs: https://cwiki.apache.org/confluence/display/KAFKA/FAQ And there are also corresponding sections in the Confluent docs: https://docs.confluent.io/current/streams/faq.html#failure-and-exception-handling Guozhang On Thu, Feb 15,

difference between key.serializer & default.key.serde

2018-02-21 Thread adrien ruffie
Hello all, I read the documentation but I don't really understand the difference between default.key.serde and key.serializer + key.deserializer, and between default.value.serde and value.serializer + value.deserializer. I don't understand the different usages ... Can you enlighten me a little more p

FINAL REMINDER: CFP for Apache EU Roadshow Closes 25th February

2018-02-21 Thread Sharan F
Hello Apache Supporters and Enthusiasts This is your FINAL reminder that the Call for Papers (CFP) for the Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on Cloud, IoT, Apache Tomcat, Apache Http and will run from 13-14 June 2018 in Berlin. Note that the CFP deadline has

Re: committing consumed offsets synchronously (every message)

2018-02-21 Thread Sönke Liebau
Kafka Streams would enable exactly-once processing, yes. But this only holds true as long as your data stays in Kafka topics; as soon as you want to write data to an external system, the exactly-once guarantees no longer hold and you end up with the same issues - so I suspect that this woul

Building a bugfix branch and using it with existing distributions

2018-02-21 Thread Niek Peeters
Hi all, Recently, I experienced a bug in Connect (dist 1.0.0) only to find out it was registered and fixed already (KAFKA-6277). The fix got backported into the 1.0.0 branch but obviously the 1.0.0 release (and distribution) didn't change. Th

RE: committing consumed offsets synchronously (every message)

2018-02-21 Thread Marasoiu, Nicu
Thank you very much. Do you think that Kafka Streams with the exactly_once flag enabled would perform better than the Kafka client with an individual commit per message, as timed below? Perhaps the implementation of exactly-once read-process-write uses other methods and its performance is better. Ind

Re: Doubts in KStreams

2018-02-21 Thread Bill Bejeck
Hi Pravin, 1. Fault tolerance means that state stores are backed by changelog topics, which store the contents of the state store. For example, in a worst-case scenario where your machine crashes and destroys all your local state, on starting your Kafka Streams application back up the state stores would

Re: committing consumed offsets synchronously (every message)

2018-02-21 Thread Sönke Liebau
Hi Nicu, committing after every message, and thus retrieving them with a batch size of 1, will definitely make a huge difference in performance! I've rigged a quick (and totally non-academic) test which came up with the following numbers: Batching consumer - Consumed 1000490 records in 5 seconds No

committing consumed offsets synchronously (every message)

2018-02-21 Thread Marasoiu, Nicu
Hi, In order to obtain exactly-once semantics, we are thinking of doing at-least-once processing and then having a compensation mechanism that fixes the results within a few minutes by subtracting the effects of the duplicates. However, in order to do that, it seems that at least thi

Doubts in KStreams

2018-02-21 Thread pravin kumar
I have studied Kafka Streams, but have not clearly understood it. 1. Can someone explain fault tolerance? 2. I have topicA and topicB with 4 partitions, so it created four tasks; I have created it in a single JVM. But I need to know how it works in multiple JVMs and, if one JVM goes down, how it another jvm take

RE: broker properties explanations

2018-02-21 Thread adrien ruffie
That really helps me, thanks Thomas! Just one more small question: I don't fully understand this property: leader.imbalance.per.broker.percentage The ratio of leader imbalance allowed per broker. The controller would trigger a leader balance if it goes above this value per broker. The

Re: Who is assigned to which partitions

2018-02-21 Thread Per Steffensen
On 21/02/18 08:42, Per Steffensen wrote: ..., when I will happy to do share? ..., THEN I will BE happy to share

Re: broker properties explanations

2018-02-21 Thread Thomas Aley
Hi Adrien, log.dirs exists to facilitate multiple data directories which allows more than one disk to be used without the need for RAID. This increases throughput but beware of naive load balancing that may fill up one disk way before another. When log.flush.interval.ms is null the log.flush.i

Re: Consumer group intermittently can not read any records from a cluster with 3 nodes that has one node down

2018-02-21 Thread Sandor Murakozi
Hi Behrang, I recommend checking out some docs that explain how partitions and replication work (e.g. https://sookocheff.com/post/kafka/kafka-in-a-nutshell/). What I'd highlight is that the partition leader and the controller are two different concepts. Each partition has its own leader and It'