Hi Jay,

Thanks for taking the time to go through the details! Appreciate that.

Just clarifying a couple of things along the same lines.

1) The new producer and consumer are being designed to take care of
auto-balancing between partitions, right?

2) With the currently available producer and consumer, is my current
setup (pls see attached file) a good design in terms of scalability?

Thanks,
Krishna

On Wednesday, March 19, 2014, Jay Kreps <jay.kr...@gmail.com> wrote:

> Hey Krishna,
>
> Let me clarify the current state of things a little:
> 1. Kafka offers a single producer interface as well as two consumer
> interfaces: the low-level "simple consumer" which just directly makes
> network requests, and the higher level interface which handles
> fault-tolerance, partition assignment, etc. These have been in all releases
> and not too much has changed with them.
> 2. Partitioning in the producer is controlled by the key specified with
> the message. This key is used to assign the message to a partition. This is
> the normal mechanism for balancing load. If no key is specified the
> producer will connect to a single broker at random and send its traffic
> there (to minimize the number of tcp connections). If you have many
> producers this will also balance traffic, but if you have just one it will
> not and you will want to specify some partitioning key (you can even just
> use a random number if you like). This behavior has really really confused
> people and seems to have been a mistake on our part.
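The keyed vs. unkeyed behavior Jay describes can be sketched roughly as follows. The hash-mod rule here is an illustrative stand-in for the producer's default partitioner, not Kafka's exact implementation: a given key always maps to the same partition, so a well-spread key space spreads load, while a null key pins all traffic to one randomly chosen destination.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class PartitioningSketch {
    // Illustrative stand-in for the producer's partition choice (assumption,
    // not Kafka's actual code). With a key: deterministic hash-mod, so
    // identical keys co-locate and varied keys balance load. Without a key:
    // one randomly picked choice is reused, i.e. the "all traffic to a
    // single broker" behavior described above.
    static int partitionFor(String key, int numPartitions, int stickyChoice) {
        if (key == null) {
            return stickyChoice; // picked once at random, then reused
        }
        return (key.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        int numPartitions = 4;
        int sticky = new Random().nextInt(numPartitions);

        Map<Integer, Integer> keyed = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            keyed.merge(partitionFor("event-" + i, numPartitions, sticky), 1, Integer::sum);
        }
        System.out.println("keyed spread:   " + keyed);   // load lands on many partitions

        Map<Integer, Integer> unkeyed = new HashMap<>();
        for (int i = 0; i < 1000; i++) {
            unkeyed.merge(partitionFor(null, numPartitions, sticky), 1, Integer::sum);
        }
        System.out.println("unkeyed spread: " + unkeyed); // all 1000 on a single partition
    }
}
```

Even a random value used as the key (as Jay suggests) would spread load here, since distinct keys hash to distinct partitions.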
>
> In an effort to simplify these interfaces as well as improve a lot of
> other things, we are working on a future replacement producer and consumer.
> The intention is that these will replace the existing clients (both the
> current producer, as well as the simple and high-level consumer). This is
> the KafkaProducer and KafkaConsumer discussion you are referring to. These
> are not yet available; the code is being written right now. The producer is
> available in beta form on trunk if you want to try it out but the consumer
> does not yet exist so you definitely can't use that. :-)
>
> Hope that helps!
>
> Cheers,
>
> -Jay
>
> On Wed, Mar 19, 2014 at 2:33 AM, Krishna Raj <reach.krishna...@gmail.com>
> wrote:
>
>> Hello Experts & Kafka Team,
>>
>> It's exciting to learn and work with Kafka. I have been going through a
>> lot of pages and Q&A.
>>
>> We are building an infra & topology using Kafka for event processing in
>> our application.
>>
>> We need some advice about designing the Producer and Consumer.
>>
>> *Please find the attached file / picture below* of the current setup we
>> are thinking of.
>>
>>
>> [image: Inline image 1]
>>
>> *1) Producer:*
>>
>> I understand that from 0.8.1, message balancing is done in such a way
>> that the broker will choose a partition after every meta refresh (the
>> default for which is 10 mins).
>>
>> Questions are:
>>
>> a. *Is there any mechanism other than changing the meta refresh?* (I
>> understand that implementing the logic via a custom class is no longer
>> supported in 0.8.1)
>>
>> b. We ultimately want the messages to be evenly distributed across
>> partitions so that the consumers' load is also evenly distributed, paving
>> the way for scalability and reduced lag; this will help us scale easily,
>> as we can just add a partition with a corresponding consumer node
>> attached to it. Is this advised? And to achieve this, *what is the
>> optimal meta refresh time without affecting performance?*
>>
>> *2) Consumer*
>>
>> a. I was under the impression that SimpleConsumer has more flexibility
>> and features. But after reading Neha's JavaDoc below, I am liking the
>> KafkaConsumer features and the reduced need to handle things at a
>> granular level. *What is the advised consumer, SimpleConsumer or
>> KafkaConsumer?*
>>
>> Neha's KafkaConsumer JavaDoc:
>> http://people.apache.org/~nehanarkhede/kafka-0.9-consumer-javadoc/doc/kafka/clients/consumer/KafkaConsumer.html
>>
>> b. For keeping track of the offset at each consumer node, I am thinking
>> of manually controlling the offset commit (to ensure that processing a
>> message is neither missed nor duplicated). On failure or exception, I
>> would also log the current offset to a file or something before exiting,
>> so that when I start my consumer again I can resume from the offset
>> where I left off. *Is this a good design?*
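A minimal sketch of the file-based checkpoint idea (the file layout and helper names are hypothetical, not a Kafka API). One caveat worth noting: committing the offset only after processing gives at-least-once delivery, since a crash between processing and the write can replay a message; avoiding duplicates entirely requires storing the offset atomically with the processing result.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class OffsetCheckpointSketch {
    private final Path checkpointFile; // hypothetical per-consumer checkpoint file

    OffsetCheckpointSketch(Path checkpointFile) {
        this.checkpointFile = checkpointFile;
    }

    // On startup: resume from the last committed offset, or 0 if none was saved.
    long loadOffset() throws IOException {
        if (!Files.exists(checkpointFile)) {
            return 0L;
        }
        String s = new String(Files.readAllBytes(checkpointFile), StandardCharsets.UTF_8).trim();
        return s.isEmpty() ? 0L : Long.parseLong(s);
    }

    // After processing a message: persist the next offset to read.
    void commitOffset(long nextOffset) throws IOException {
        Files.write(checkpointFile, Long.toString(nextOffset).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("consumer-offset", ".txt");
        OffsetCheckpointSketch cp = new OffsetCheckpointSketch(file);

        List<String> partitionLog = List.of("m0", "m1", "m2", "m3"); // stand-in for a partition
        long offset = cp.loadOffset();
        while (offset < partitionLog.size()) {
            String msg = partitionLog.get((int) offset); // "process" the message here
            cp.commitOffset(offset + 1);                 // commit only after processing
            offset++;
        }
        System.out.println("resumed offset: " + cp.loadOffset()); // prints "resumed offset: 4"
    }
}
```

On restart, `loadOffset()` returns the last committed position, so the consumer resumes where it left off rather than from the beginning of the partition.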
>>
>>
>> Thanks for the time, and really appreciate the effort in making Kafka
>> amazing :)
>>
>> Thanks,
>> KR
>>
>>
>>
>
