Hi, everyone.

> If you don't specify the partition, and do have a key, then the default
> behaviour is to use a hash on the key to determine the partition. This is to
> make sure the messages with the same key end up on the same partition. This
> helps to ensure ordering relative to the key/partition. Also, when using
> compaction instead of delete as cleanup policy, the newest message for each
> key is kept. This is also used for the internal __consumer_offsets topic.
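To make the two points above concrete, here is a minimal Python sketch. It is only an illustration, not Kafka's code: `partition_for` and `compact` are hypothetical helper names, and CRC32 is a stand-in hash chosen because it ships with Python (as far as I know, the Java producer's default partitioner uses a murmur2 hash, not CRC32).

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition by hashing it.

    CRC32 is a stand-in hash for illustration only; it is not
    the hash function Kafka itself uses.
    """
    # zlib.crc32 returns an unsigned 32-bit int in Python 3,
    # so the modulo result is always in [0, num_partitions).
    return zlib.crc32(key) % num_partitions


def compact(log):
    """Keep only the newest value per key, which is what a
    compacted topic converges to over time."""
    latest = {}
    for key, value in log:
        latest[key] = value  # later records overwrite earlier ones
    return latest


if __name__ == "__main__":
    keys = [b"origin:foo;target:boo", b"origin:bar;target:baz",
            b"origin:foo;target:boo"]
    for key in keys:
        # The first and third key are equal, so they map to the same partition.
        print(key.decode(), "->", partition_for(key, 6))

    print(compact([(b"k1", "v1"), (b"k2", "v1"), (b"k1", "v2")]))
```

Because the partition depends only on the key bytes and the partition count, equal keys always land together, which is what gives per-key ordering.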
I didn't know exactly whether messages with the same key are placed in the
same partition. That helps me a lot. Thanks. Do you know if the hash function
is MD5? Just curious.

> > I'll have to look at the documentation but I'm not entirely sure if the
> > consumers have access to this key.

Actually, they have access to the key[1].

[1] http://grokbase.com/t/kafka/users/135944gyke/key-used-by-producer

> > The producer does. You can override the default partitioner class and
> > write one that understands and interprets your definition of the key to
> > place data in a specific partition. By default, I believe data is
> > distributed using a round robin partitioner.
> >
> > On Thu, Mar 31, 2016 at 2:58 AM, Marcelo Oikawa <
> > marcelo.oik...@webradar.com> wrote:
> >
> > > Hi, list.
> > >
> > > We're working on a project that uses Kafka and we noticed that for every
> > > message we have a key (or null). I searched for more info about the key
> > > itself and the documentation says that it is only used to decide the
> > > partition where the message is placed.
> > >
> > > Is there a problem if we use keys with the application semantics
> > > (metadata)? For instance, we can use the key "origin:foo;target:boo" and
> > > the consumers may use the key info to make decisions. But, a lot of
> > > messages may use the same key and it may produce unbalanced partitions,
> > > is that right?
> > >
> > > Does anyone know more about the key and its role inside Kafka?
> > >
> > > []s
> >
> > --
> > Sharninder
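On the unbalanced-partitions worry in the original message, a rough Python sketch of the effect. Everything here is an assumption for illustration: the 90/10 key mix is made up, `partition_for` and `partition_load` are hypothetical helpers, and CRC32 stands in for whatever hash the real partitioner applies.

```python
import zlib
from collections import Counter


def partition_for(key: bytes, num_partitions: int) -> int:
    # Stand-in hash partitioner (CRC32), not Kafka's actual hash function.
    return zlib.crc32(key) % num_partitions


def partition_load(keys, num_partitions):
    """Count how many messages each partition would receive."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# Made-up workload: 90 messages share one key, 10 have distinct keys.
HOT_KEY = b"origin:foo;target:boo"
keys = [HOT_KEY] * 90 + [
    f"origin:src{i};target:dst{i}".encode() for i in range(10)
]

load = partition_load(keys, 4)
hot_partition = partition_for(HOT_KEY, 4)

if __name__ == "__main__":
    for partition in range(4):
        print("partition", partition, "->", load[partition], "messages")
    print("hot key lives on partition", hot_partition)
```

With any hash-based partitioner, all messages for the dominant key go to one partition, so that partition receives at least 90 of the 100 messages here. If the application metadata has few distinct values, a custom partitioner (as suggested in the thread) or a higher-cardinality key may spread the load better.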