Ok, thanks for the information. Looking at the wire format for the metadata response, I see that the right-hand side of the TopicMetadata production contains a TopicErrorCode, and the right-hand side of the PartitionMetadata production contains a PartitionErrorCode. Are both of these 16-bit values? In general, where the documentation doesn't state it explicitly, can I assume that all error codes are 16-bit values?
Thanks,
Dave

On Wed, May 22, 2013 at 4:29 PM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
> 1. Correct
> 2. The producer does not use or depend on zookeeper anymore. It refreshes
> its view of the cluster metadata by using a TopicMetadataRequest to any of
> the kafka brokers. It maps a message to a partition using the following
> rules -
> 2.1 If a message has no key, use any available partition
> 2.2 If a message has a key and the user has defined a custom partitioner,
> use it to map the key to a partition id
> 2.3 If a message has a key and the user has not defined a custom
> partitioner, use the default hash based partitioner that ships with Kafka
>
> Thanks,
> Neha
>
>
> On Wed, May 22, 2013 at 1:33 PM, Dave Peterson <dspeter...@tagged.com> wrote:
>
>> Ok, the picture I have in my mind of how things work in 0.8 (from a
>> producer's point of view) is as follows:
>>
>> 1. An application program sends log messages to a producer. Each
>> message is provided as a key/value pair, where the key is chosen
>> by the application and the value is the message contents. By its
>> choice of key, the application may influence or control which
>> partition the message gets sent to.
>>
>> 2. The producer receives messages as key/value pairs. From talking
>> with zookeeper, it knows the set of available brokers and which
>> partitions each broker has. If the sending application provided a key
>> for a given message, the contents of the key may optionally
>> influence the producer's choice of broker and partition to send the
>> message to, according to some convention understood by both
>> application program and producer.
>>
>> Is this correct?
>>
>> Thanks,
>> Dave
>>
>> On Wed, May 22, 2013 at 9:28 AM, Jun Rao <jun...@gmail.com> wrote:
>> > Dave,
>> >
>> > Currently, the broker expects each producer request to specify the exact
>> > partition id (-1 is no longer valid). The mapping from a message to a
>> > partition is done at the producer client.
>> > The producer can choose a random
>> > partition (from the existing list of partitions) or deterministically
>> > choose a partition based on the key.
>> >
>> > Thanks,
>> >
>> > Jun
>> >
>> >
>> > On Tue, May 21, 2013 at 1:12 PM, Dave Peterson <dspeter...@tagged.com> wrote:
>> >
>> >> In my case, there is a load balancer between the producers and the
>> >> brokers, so I want the behavior described for the Java client (null key
>> >> specifies "any partition"). If the Key field of each individual message
>> >> specifies the partition to send it to, then I don't understand the purpose
>> >> of the 32-bit partition identifier that precedes each message set in a
>> >> produce request: what if a produce request specifies "partition N" for a
>> >> given message set, and then each individual message in the set
>> >> specifies a different partition in its Key field? Also, the
>> >> above-mentioned partition identifier is a 32-bit integer and the Key field of
>> >> each individual message can contain data of arbitrary length, which
>> >> seems inconsistent. Is a partition identifier a 32-bit integer, or can it
>> >> be of arbitrary length?
>> >>
>> >> Thanks,
>> >> Dave
>> >>
>> >> On Tue, May 21, 2013 at 12:30 PM, Neha Narkhede <neha.narkh...@gmail.com>
>> >> wrote:
>> >> > Dave,
>> >> >
>> >> > Colin correctly described the producer behavior of picking the partition
>> >> > for a message before it is sent to the Kafka broker. However, I'm
>> >> > interested in knowing your use case a little better to see why you would
>> >> > rather have the broker decide the partition?
>> >> >
>> >> > Thanks,
>> >> > Neha
>> >> >
>> >> >
>> >> > On Tue, May 21, 2013 at 12:05 PM, Colin Blower <cblo...@barracuda.com> wrote:
>> >> >
>> >> >> The key is used by the client to decide which partition to send the
>> >> >> message to. By the time the client is creating the produce request, it
>> >> >> should be known which partition each message is being sent to.
>> >> >> I believe
>> >> >> Neha described the behavior of the Java client which sends messages with
>> >> >> a null key to any partition.
>> >> >>
>> >> >> The key is described in past tense because of the use case for
>> >> >> persisting keys with messages. The key is persisted through the broker
>> >> >> so that a consumer knows what key was used to partition the message on
>> >> >> the producer side.
>> >> >>
>> >> >> I don't believe that you can have the broker decide which partition a
>> >> >> message goes to.
>> >> >>
>> >> >> --
>> >> >> Colin B.
>> >> >>
>> >> >> On 05/21/2013 11:48 AM, Dave Peterson wrote:
>> >> >> > I'm looking at the document entitled "A Guide to the Kafka Protocol"
>> >> >> > located here:
>> >> >> >
>> >> >> > https://cwiki.apache.org/KAFKA/a-guide-to-the-kafka-protocol.html
>> >> >> >
>> >> >> > It shows a produce request as containing a number of message sets, which are
>> >> >> > grouped first by topic and second by partition (a 32-bit integer).
>> >> >> > However, each
>> >> >> > message in a message set contains a Key field, which is described as follows:
>> >> >> >
>> >> >> > The key is an optional message key that was used for partition assignment.
>> >> >> > The key can be null.
>> >> >> >
>> >> >> > I notice the use of "was" (past tense) above. That seems to suggest that the
>> >> >> > Key field was once used to specify a partition (at the granularity of each
>> >> >> > individual message), but the plan for the future is to instead use the 32-bit
>> >> >> > partition value preceding each message set. Is this correct? If so, when I am
>> >> >> > creating a produce request for 0.8, what should I use for the 32-bit partition
>> >> >> > value, and how does this relate to the Key field of each individual message?
>> >> >> > Ideally, I would like to just send a produce request and let the broker choose
>> >> >> > the partition. How do I accomplish this in 0.8, and are there plans to change
>> >> >> > this after 0.8?
>> >> >> >
>> >> >> > Thanks,
>> >> >> > Dave
>> >> >> >
>> >> >> > On Tue, May 21, 2013 at 10:47 AM, Neha Narkhede <neha.narkh...@gmail.com> wrote:
>> >> >> >> No. In 0.8, if you don't specify a key for a message, it is sent to any of
>> >> >> >> the available partitions. In other words, the partition id is selected on
>> >> >> >> the producer and the server doesn't get -1 as the partition id.
>> >> >> >>
>> >> >> >> Thanks,
>> >> >> >> Neha
>> >> >> >>
>> >> >> >>
>> >> >> >> On Tue, May 21, 2013 at 9:54 AM, Dave Peterson <dspeter...@tagged.com> wrote:
>> >> >> >>
>> >> >> >>> In the version 0.8 wire format for a produce request, does a value of -1
>> >> >> >>> still indicate "use a random partition" as it did for 0.7?
>> >> >> >>>
>> >> >> >>> Thanks,
>> >> >> >>> Dave
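
For reference, the producer-side partition selection that Neha lists in rules 2.1 to 2.3 could be sketched roughly as below. This is only an illustration under stated assumptions, not the Kafka client's code: the names `choose_partition`, `default_hash_partitioner`, and `Partitioner` are hypothetical, the CRC32 hash is a stand-in for whatever hash the shipped partitioner uses, and the partitioner is assumed to return an index into the list of partition ids obtained from the metadata response.

```python
import random
import zlib
from typing import Callable, Optional, Sequence

# Hypothetical signature: (key bytes, number of partitions) -> index in [0, n).
Partitioner = Callable[[bytes, int], int]


def default_hash_partitioner(key: bytes, num_partitions: int) -> int:
    # Stand-in only: CRC32 is an assumption, not the hash Kafka ships with.
    return zlib.crc32(key) % num_partitions


def choose_partition(
    key: Optional[bytes],
    partition_ids: Sequence[int],
    custom_partitioner: Optional[Partitioner] = None,
) -> int:
    """Pick a concrete partition id on the producer side; -1 is never returned."""
    if not partition_ids:
        raise ValueError("no partitions known for this topic")
    if key is None:
        # Rule 2.1: no key, so any available partition will do.
        return random.choice(partition_ids)
    # Rule 2.2 if a custom partitioner was supplied, otherwise rule 2.3.
    partitioner = custom_partitioner or default_hash_partitioner
    return partition_ids[partitioner(key, len(partition_ids))]
```

The value returned here is what would go into the 32-bit partition field that precedes each message set in the produce request; the key itself travels with the message only so that consumers can see what it was.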