Re: Same partition number of different Kafka topics

2016-08-03 Thread Dana Powers
kafka-python by default uses the same partitioning algorithm as the Java client. If there are bugs, please let me know. I think the issue here is with the default nodejs partitioner. -Dana

Re: Same partition number of different Kafka topics

2016-08-03 Thread Jack Huang
I see, thanks for the clarification.

Re: Same partition number of different Kafka topics

2016-08-02 Thread Ewen Cheslack-Postava
Jack, The partition is always selected by the client -- if it weren't, the brokers would need to forward requests, since different partitions are handled by different brokers. The only "default Kafka partitioner" is the one that you could consider "standardized" by the Java client implementation. So

Re: Same partition number of different Kafka topics

2016-07-29 Thread Jack Huang
Hi Gerard, After further digging, I found that the clients we are using also have different partitioners. The Python one uses murmur2 (https://github.com/dpkp/kafka-python/blob/master/kafka/partitioner/default.py), and the NodeJS one uses its own implementation (https://github.com/SOHU-Co/kafka-node/blob/m
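To illustrate why two clients with different partitioners can disagree, here is a minimal sketch. The two functions are hypothetical stand-ins, not the actual kafka-python or kafka-node algorithms: one hashes with CRC32 and the other with a truncated SHA-256, purely to show that different hash functions send the same key to different partitions even when the partition count matches.

```python
import hashlib
import zlib

NUM_PARTITIONS = 8  # assumed to be identical for both topics


def partition_client_a(key: bytes, num_partitions: int) -> int:
    # Stand-in for one client's partitioner (NOT the real murmur2 code).
    return zlib.crc32(key) % num_partitions


def partition_client_b(key: bytes, num_partitions: int) -> int:
    # Stand-in for another client's home-grown partitioner.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_partitions


# Hypothetical event ids used as message keys.
keys = [f"event-{i}".encode("utf-8") for i in range(100)]

# Keys that the two clients would place in different partitions.
mismatches = [
    k for k in keys
    if partition_client_a(k, NUM_PARTITIONS) != partition_client_b(k, NUM_PARTITIONS)
]

print(f"{len(mismatches)} of {len(keys)} keys land in different partitions")
```

If both producers used the same function, the mismatch list would be empty, which is the property a partition-aligned join relies on.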

Re: Same partition number of different Kafka topics

2016-07-29 Thread Gerard Klijs
The default partitioner takes the key, hashes it, and applies a modulo operation to determine the partition it goes to. Some things that might cause it to end up different for different topics:
- the number of partitions is not the same (you already checked)
- key is not exactly the same, for
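The hash-then-modulo scheme described above can be sketched as follows. This is only an illustration under stated assumptions: `choose_partition` is a hypothetical helper, and `zlib.crc32` stands in for whatever hash a given client actually uses. It shows that a key maps to the same partition whenever the partition count is the same, and that changing the count moves keys around.

```python
import zlib


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Hash the key, then take the result modulo the partition count."""
    # zlib.crc32 is a stand-in hash chosen for illustration only.
    return zlib.crc32(key) % num_partitions


key = b"event-42"  # hypothetical event id used as the message key

# Two topics with the same partition count: the key lands in the same partition.
same_count = (choose_partition(key, 8), choose_partition(key, 8))

# Two topics with different partition counts: many keys land elsewhere.
keys = [f"event-{i}".encode("utf-8") for i in range(100)]
moved = [k for k in keys if choose_partition(k, 8) != choose_partition(k, 12)]

print(same_count[0] == same_count[1], len(moved))
```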

Same partition number of different Kafka topics

2016-07-28 Thread Jack Huang
Hi all, I have an application where I need to join events from two different topics. Every event is identified by an id, which is used as the key for the topic partition. After some experiments, I observed that events will go into different partitions even if the number of partitions for both