Hi Joe,

Many thanks for such a detailed response.


So you would have a topic called "TypeA" and then set up a consumer group,
> and those consumers (if you really only need one consumer, set the
> partitions to 1) would get everything from the "TypeA" topic.  If you had
> more event types, then just set up more topics and then a consumer group
> for the consumers of those topics.
>
>
Got it, so we will have multiple topics, each logically associated with a
set of consumers (a consumer group). I was thinking this approach would be
more efficient, since each individual topic is tied to its own consumer or
consumer group.
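For my own understanding, I sketched what a consumer in such a group might
look like against the 0.8 high-level consumer API (only a sketch; the
ZooKeeper address and group name below are placeholders of mine):

    import java.util.Properties
    import kafka.consumer.{Consumer, ConsumerConfig}

    // Minimal sketch: one consumer in the "typeA-group" consumer group,
    // reading everything from the "TypeA" topic.
    val props = new Properties()
    props.put("zookeeper.connect", "localhost:2181") // placeholder
    props.put("group.id", "typeA-group")             // placeholder
    val connector = Consumer.create(new ConsumerConfig(props))

    // One stream for "TypeA"; with a single partition, as you suggest,
    // one consumer in the group receives every event of this type.
    val streams = connector.createMessageStreams(Map("TypeA" -> 1))
    for (stream <- streams("TypeA"); msg <- stream) // blocks, waiting for events
      println("TypeA event: " + new String(msg.message))

If I read this right, adding another event type would just mean another
topic and another consumer group along the same lines.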



> Now depending on what you need to do, you may need topics to not be by
> type but perhaps by another value, and in that case you can still "pin"
> data to a consumer; in that case use Semantic Partitioning
> http://kafka.apache.org/design.html
>
> *Semantic partitioning*
>
> "Consider an application that would like to maintain an aggregation of the
> number of profile visitors for each member. It would like to send all
> profile visit events for a member to a particular partition and, hence,
> have all updates for a member to appear in the same stream for the same
> consumer thread. The producer has the capability to be able to semantically
> map messages to the available kafka nodes and partitions. This allows
> partitioning the stream of messages with some semantic partition function
> based on some key in the message to spread them over broker machines. The
> partitioning function can be customized by providing an implementation of
> the kafka.producer.Partitioner interface, default being the random
> partitioner. For the example above, the key would be member_id and the
> partitioning function would be hash(member_id)%num_partitions."
>
If I am getting you correctly, the responsibility of mapping events will
rest on the shoulders of the producers, right? What if we want a function
at the Kafka broker nodes which actually performs the mapping? I mean, from
the producer's side, we want it to be transparent which event will go to
which consumer. In our scenario, we will have producers at the client end,
and brokers and consumers at our end.
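Just to check my reading of the producer-side option, the custom
partitioner from the design doc would be something like this (a minimal
sketch against the 0.8 Scala producer API; the class name is my own, and
newer Kafka versions change the Partitioner trait):

    import kafka.producer.Partitioner
    import kafka.utils.VerifiableProperties

    // Sketch of the design doc's hash(member_id) % num_partitions example.
    // Kafka instantiates the class reflectively, hence the
    // VerifiableProperties constructor argument.
    class MemberIdPartitioner(props: VerifiableProperties = null)
        extends Partitioner[String] {
      def partition(memberId: String, numPartitions: Int): Int =
        math.abs(memberId.hashCode) % numPartitions
    }

This is exactly the piece of knowledge we were hoping to move from the
producers to the brokers.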

Would that broker-side approach be feasible? I am also wondering whether we
could include some sort of load balancing at the Kafka broker nodes, that
is, depending on the load of the consumers, the brokers would write the
events to the respective topic set up for each consumer.



Thanks a lot for your time.


> Take a look at KeyedMessage.scala; the ConsoleProducer.scala example uses
> the overloaded constructor, but instead of the constructor which makes the
> partitions random you can use the one that passes in the key (the type),
> and also set your key-serializer to kafka.serializer.StringEncoder or make
> your own.
>
> In this case you might need to have a partition for each topic, unless
> you can have a consumer read different event types; again, it all depends
> on the implementation of your event processing system.
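That clears it up, thanks. To confirm I follow, the keyed send would look
roughly like this (again only a sketch; the broker address, topic name,
and partitioner class below are placeholders of mine):

    import java.util.Properties
    import kafka.producer.{KeyedMessage, Producer, ProducerConfig}

    val props = new Properties()
    props.put("metadata.broker.list", "broker1:9092") // placeholder broker
    props.put("serializer.class", "kafka.serializer.StringEncoder")
    props.put("key.serializer.class", "kafka.serializer.StringEncoder")
    props.put("partitioner.class", "com.example.MemberIdPartitioner") // hypothetical

    val producer = new Producer[String, String](new ProducerConfig(props))
    // The key ("TypeA") is what the partitioner hashes; the sending code
    // itself never chooses a partition or a consumer.
    producer.send(new KeyedMessage[String, String]("events", "TypeA", "some payload"))
    producer.close()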
>
> Hope this helps getting you started some, thanks!
>
> /*******************************************
>  Joe Stein
>  Founder, Principal Consultant
>  Big Data Open Source Security LLC
>  http://www.stealth.ly
>  Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
> ********************************************/
>
>
> On Mon, Aug 5, 2013 at 5:44 AM, masoom alam
> <masoom.a...@imsciences.edu.pk>wrote:
>
> > Hi every one,
> >
> > I am new to Kafka and designing an event processing system. Is it
> > possible that the Kafka broker can do some event dependency handling so
> > that, for example, events of type A go only to Consumer1?
> >
> > I hope I was able to explain my problem.
> >
> > Thanks.
> >
>
