Hey,
I cannot find a clear explanation of this in the documentation or in
hello-samza: how do I tell the collector to send my output data to a
particular partition of the output stream?
My understanding is that in my process() method I have to create an
OutgoingMessageEnvelope object, passing not only my deserialized data
but also my partition key, like this:
[...]
try {
    collector.send(new OutgoingMessageEnvelope(
        new SystemStream("kafka", "output"), my_partition_key, null, my_data));
} catch (Exception e) {
    System.err.println("Unable to parse line: " + event);
}
[...]
The question is: who is responsible for computing the partition number
from my_partition_key? How, for example, can I set up some kind of
consistent hashing mechanism that computes the partition number from
the key? Is this configurable via task properties, the way it is in
Kafka via the partitioner.class property?
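To make the question concrete, the kind of key-to-partition mapping I have
in mind could be sketched as follows. This is only an illustration: the
class name KeyPartitionSketch, the method partitionFor, and the partition
count of 8 are made up, it is plain hash-modulo rather than true consistent
hashing, and I am not claiming this is what Samza or Kafka actually do.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper for illustration only; not part of Samza or Kafka.
public class KeyPartitionSketch {

    // Maps a key to a partition in [0, numPartitions) by hashing its UTF-8 bytes.
    public static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result non-negative even when the hash is negative.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int numPartitions = 8; // assumed partition count of the output stream
        for (String key : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, numPartitions));
        }
    }
}
```

The property I care about is that the same key always lands in the same
partition, as the repeated "user-1" in main() shows.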
Many thanks in advance,
Vladimir
--
Vladimir Lebedev
http://linkedin.com/in/vlebedev