Hmm, then is it feasible to assign different, non-overlapping topics to each thread when implementing the Kafka consumer with multiple threads?
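To make the question concrete, here is a rough sketch of the layout I have in mind: one worker thread per topic, each thread creating and using its own consumer, and a thread-safe map for combining results. `TopicConsumer` and `fakeConsumerFor` are hypothetical stand-ins I made up so the sketch runs without a broker. In the real application I assume each thread would instead create its own `KafkaConsumer`, call `subscribe(Collections.singletonList(topic))`, and loop on `poll(...)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class TopicPerThreadDemo {

    /** Hypothetical stand-in for the small part of KafkaConsumer used here. */
    interface TopicConsumer extends AutoCloseable {
        List<String> poll();      // returns an empty list once nothing is left
        @Override void close();   // narrowed so callers need no catch block
    }

    /** In-memory fake (an assumption for illustration) holding three messages. */
    static TopicConsumer fakeConsumerFor(String topic) {
        List<String> pending = new ArrayList<>(
                Arrays.asList(topic + "-msg-1", topic + "-msg-2", topic + "-msg-3"));
        return new TopicConsumer() {
            @Override public List<String> poll() {
                List<String> batch = new ArrayList<>(pending);
                pending.clear();
                return batch;
            }
            @Override public void close() { }
        };
    }

    /** Starts one thread per topic and returns the message count per topic. */
    public static Map<String, Integer> run(List<String> topics) throws InterruptedException {
        // Thread-safe object shared by all workers to combine their results.
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, topics.size()));
        for (String topic : topics) {
            pool.execute(() -> {
                // The consumer is created, used, and closed inside this thread only,
                // so it is never shared between threads.
                try (TopicConsumer consumer = fakeConsumerFor(topic)) {
                    List<String> batch;
                    while (!(batch = consumer.poll()).isEmpty()) {
                        counts.merge(topic, batch.size(), Integer::sum);
                    }
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return counts;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(Arrays.asList("nagios", "collectd")));
    }
}
```

The point of this layout is that no consumer instance ever crosses a thread boundary; only the `ConcurrentHashMap` is shared. A real consumer would keep polling instead of stopping on the first empty batch.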

2016-06-01 22:14 GMT+09:00 Christian Posta <christian.po...@gmail.com>:

> Gerard is correct.
>
> The unit of parallelization in Kafka is the topic and topic partition. A
> single thread/consumer consumes each partition in a topic (even if multiple
> topics). KafkaConsumer is NOT thread safe and should not be shared between
> threads.
>
> On Wed, Jun 1, 2016 at 12:11 AM, Gerard Klijs <gerard.kl...@dizzit.com>
> wrote:
>
> > If I understand it correctly, each consumer should have its 'own' thread
> > and should not be accessible from other threads. But you could
> > (dynamically) create enough threads to cover all the partitions, so each
> > consumer only reads from one partition. You could also let all those
> > consumers access some thread-safe object if you need to combine the
> > result. In your linked example the consumers each just do their part,
> > which solves the multi-threading issue, but when you want to combine data
> > from different consumer threads it becomes more tricky.
> >
> > On Wed, Jun 1, 2016 at 2:57 AM BYEONG-GI KIM <bg...@bluedigm.com> wrote:
> >
> > > Hello.
> > >
> > > I've implemented a Kafka consumer application which consumes a large
> > > amount of monitoring data from a Kafka broker and analyzes that data
> > > accordingly.
> > >
> > > I referred to a guide,
> > > http://www.confluent.io/blog/tutorial-getting-started-with-the-new-apache-kafka-0.9-consumer-client
> > > , since I thought the app needs to implement multi-threading with one
> > > Kafka consumer per topic. Actually, a topic is assigned to each
> > > open-source monitoring tool, e.g., Nagios, Collectd, etc., in order to
> > > distinguish them, because each of these uses its own message format,
> > > such as JSON, String, and so on.
> > >
> > > There was, however, an exception even though my source code for the
> > > Kafka consumer is mostly copied and pasted from the guide:
> > > *java.util.ConcurrentModificationException:
> > > KafkaConsumer is not safe for multi-threaded access*
> > >
> > > First question: could the implementation in the guide really prevent
> > > the exception?
> > >
> > > Second question: could the KafkaConsumer handle such a huge amount of
> > > data with one thread? The KafkaConsumer seems not to be thread-safe, and
> > > it can subscribe to multiple topics at once. Do I need to change the
> > > implementation from multi-threaded to a single thread subscribing to
> > > multiple topics? I'm just wondering whether a KafkaConsumer is able to
> > > handle this volume of data without performance degradation.
> > >
> > > Thanks in advance!
> > >
> > > Best regards
> > >
> > > KIM
>
> --
> *Christian Posta*
> twitter: @christianposta
> http://www.christianposta.com/blog
> http://fabric8.io