Hey,

I can not find clear explanation of this in the documentation or in hello-samza: how to tell collector to send my output data to a particular partition of the output stream?

My understanding is that in my process() method I have to create OutgoingMessageEnvelope object passing not only my deserialized data, but also my partition key, like this:

[...]
try {
collector.send(new OutgoingMessageEnvelope(new SystemStream("kafka", "output"), my_partition_key, null, my_data));
    } catch (Exception e) {
      System.err.println("Unable to parse line: " + event);
    }
[...]

The question is: who is responsible for computing the partition number based on my_partition_key? How, for example, I can establish some kind of consistent hashing mechanism for computing the partition number based on the key? Is it configurable somehow via task properties, like I may do it in Kafka via partitioner.class property?

Many thanks in advance,

Vladimir

--
Vladimir Lebedev
http://linkedin.com/in/vlebedev

Reply via email to