Is there a typo below?  Are all of these actually in the same topic, just 
different partitions?  Partitioning, AFAIK, is mainly done for parallelism & 
throughput reasons.  What is the reason for partitioning your dataset by 
‘columns’?

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-HowdoIchoosethenumberofpartitionsforatopic?

Richard

> On Mar 26, 2015, at 8:22 AM, Shekar Tippur <ctip...@gmail.com> wrote:
>
> Hello,
>
> Want to confirm a basic understanding of Kafka.
> If I have a dataset that needs to be partitioned by 4 columns, then the
> progression is
>
> {topic1:partition_key1} -> {Group by samza on partition_key1}
> ->
> {topic2:partition_key2} -> {Group by samza on partition_key2}
> ->
> {topic3:partition_key3} -> {Group by samza on partition_key3}
> ->
> {topic4:partition_key4} -> {Group by samza on partition_key4}
>
> Can you please confirm if my understanding is right?
>
> - Shekar


________________________________

This email and any attachments may contain confidential and privileged material 
for the sole use of the intended recipient. Any review, copying, or 
distribution of this email (or any attachments) by others is prohibited. If you 
are not the intended recipient, please contact the sender immediately and 
permanently delete this email and any attachments. No employee or agent of TiVo 
Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc. by 
email. Binding agreements with TiVo Inc. may only be made by a signed written 
agreement.

Reply via email to