Hi Jay, Sorry about the KIP formatting . I fixed those in the KIP.
2. We certainly need to add both the serialized and unserialized form for the key as both are useful. I added those to the interface. 3. Do we need to add the value? I suspect people will have uses for computing something off a few fields in the value to choose the partition. This would be useful in cases where the key was being used for log compaction purposes and did not contain the full information for computing the partition. added it as well. 4. This interface doesn't include either an init() or close() method. It should implement Closable and Configurable, right? I am not quite sure about having init() or close() for partitioner. Are we looking at partitioner using some external resources to initialize and close. If thats the case than init() should also take in some config as param, this can add more complexity. 5. What happens if the user both sets the partition id in the ProducerRecord and sets a partitioner? Does the partition id just get passed in to the partitioner (as sort of implied in this interface?). This is a bit weird since if you pass in the partition id you kind of expect it to get used, right? Or is it the case that if you specify a partition the partitioner isn't used at all (in which case no point in including partition in the Partitioner api). In current Producer Record partition id is getting passed to Partitioner. If a custom partitioner is not going to use that than thats up to their implementation right. Similarly in our interface we’ve Value as another param this may or may not be used. Essentially its up to the Partitioner to disclose on what available information they are going to partition against. Thanks, Harsha On April 23, 2015 at 9:11:33 AM, Jay Kreps (jay.kr...@gmail.com) wrote: Hey Harsha, A few comments: Can you finish up the KIP there are some unfinished sentences and odd whitespace things going on. Here are the questions I think we should consider: 1. Do we need this at all given that we have the partition argument in ProducerRecord which gives full control? I think we do need it because this is a way to plug in a different partitioning strategy at run time and do it in a fairly transparent way. 2. We certainly need to add both the serialized and unserialized form for the key as both are useful. 3. Do we need to add the value? I suspect people will have uses for computing something off a few fields in the value to choose the partition. This would be useful in cases where the key was being used for log compaction purposes and did not contain the full information for computing the partition. 4. This interface doesn't include either an init() or close() method. It should implement Closable and Configurable, right? 5. What happens if the user both sets the partition id in the ProducerRecord and sets a partitioner? Does the partition id just get passed in to the partitioner (as sort of implied in this interface?). This is a bit weird since if you pass in the partition id you kind of expect it to get used, right? Or is it the case that if you specify a partition the partitioner isn't used at all (in which case no point in including partition in the Partitioner api). Cheers, -Jay On Thu, Apr 23, 2015 at 6:55 AM, Sriharsha Chintalapani <ka...@harsha.io> wrote: > Hi, > Here is the KIP for adding a partitioner interface for producer. > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-+22+-+Expose+a+Partitioner+interface+in+the+new+producer > > There is one open question about how interface should look like. Please > take a look and let me know if you prefer one way or the other. > > Thanks, > Harsha > >