In my case, there is a load balancer between the producers and the brokers, so I want the behavior described for the Java client (null key specifies "any partition").

If the Key field of each individual message specifies the partition to send it to, then I don't understand the purpose of the 32-bit partition identifier that precedes each message set in a produce request: what if a produce request specifies "partition N" for a given message set, and then each individual message in the set specifies a different partition in its Key field? Also, the above-mentioned partition identifier is a 32-bit integer and the Key field of each individual message can contain data of arbitrary length, which seems inconsistent. Is a partition identifier a 32-bit integer, or can it be of arbitrary length?
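To make sure I am reading Colin's description correctly, here is a rough sketch of what I take the client-side behavior to be. It is not taken from any real client: the helper names `choose_partition` and `group_for_produce_request` are made up for illustration, and CRC32 is only a stand-in for whatever hash a real client applies to the key.

```python
import random
import zlib
from collections import defaultdict


def choose_partition(key, num_partitions, rng=random):
    """Pick a partition index on the client (hypothetical helper)."""
    if key is None:
        # Null key: any partition will do, so pick one at random.
        return rng.randrange(num_partitions)
    # Assumption: CRC32 stands in for the client's real key hash.
    return zlib.crc32(key) % num_partitions


def group_for_produce_request(messages, partition_counts):
    """Group (topic, key, value) tuples by topic, then by partition index.

    The nested result mirrors the layout described in the protocol guide:
    one message set per (topic, partition) pair.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for topic, key, value in messages:
        partition = choose_partition(key, partition_counts[topic])
        # The key travels with the message unchanged; it is not a partition id.
        grouped[topic][partition].append((key, value))
    return grouped


if __name__ == "__main__":
    msgs = [
        ("events", b"user-1", b"login"),
        ("events", b"user-1", b"logout"),
        ("events", None, b"heartbeat"),
    ]
    for topic, partitions in group_for_produce_request(msgs, {"events": 4}).items():
        for partition, message_set in partitions.items():
            print(topic, partition, message_set)
```

If this is right, the Key is only an input to the client's choice and is stored as opaque bytes, while the 32-bit partition value is the result of that choice, so the two could never disagree within a message set. Please correct me if I have this wrong.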
Thanks,
Dave

On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> Dave,
>
> Colin described the producer behavior of picking the partition for a
> message before it is sent to Kafka broker correctly. However, I'm
> interested in knowing your use case a little before to see why you would
> rather have the broker decide the partition?
>
> Thanks,
> Neha
>
>
> On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblo...@barracuda.com> wrote:
>
>> The key is used by the client to decide which partition to send the
>> message to. By the time the client is creating the produce request, it
>> should be known which partition each message is being sent to. I believe
>> Neha described the behavior of the Java client which sends messages with
>> a null key to any partition.
>>
>> The key is described in past tense because of the use case for
>> persisting keys with messages. The key is persisted through the broker
>> so that a consumer knows what key was used to partition the message on
>> the producer side.
>>
>> I don't believe that you can have the broker decide which partition a
>> message goes to.
>>
>> --
>> Colin B.
>>
>> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
>> > located here:
>> >
>> > https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>> >
>> > It shows a produce request as containing a number of message sets,
>> > which are grouped first by topic and second by partition (a 32-bit
>> > integer). However, each message in a message set contains a Key
>> > field, which is described as follows:
>> >
>> >     The key is an optional message key that was used for partition
>> >     assignment. The key can be null.
>> >
>> > I notice the use of "was" (past tense) above. That seems to suggest
>> > that the Key field was once used to specify a partition (at the
>> > granularity of each individual message), but the plan for the future
>> > is to instead use the 32-bit partition value preceding each message
>> > set. Is this correct? If so, when I am creating a produce request
>> > for 0.8, what should I use for the 32-bit partition value, and how
>> > does this relate to the Key field of each individual message?
>> > Ideally, I would like to just send a produce request and let the
>> > broker choose the partition. How do I accomplish this in 0.8, and
>> > are there plans to change this after 0.8?
>> >
>> > Thanks,
>> > Dave
>> >
>> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>> >> No. In 0.8, if you don't specify a key for a message, it is sent to
>> >> any of the available partitions. In other words, the partition id is
>> >> selected on the partition and the server doesn't get -1 as the
>> >> partition id.
>> >>
>> >> Thanks,
>> >> Neha
>> >>
>> >>
>> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <dspeter...@tagged.com> wrote:
>> >>
>> >>> In the version 0.8 wire format for a produce request, does a value
>> >>> of -1 still indicate "use a random partition" as it did for 0.7?
>> >>>
>> >>> Thanks,
>> >>> Dave