What sort of skew do you expect?  For example, do you expect one key to have 
1000x as many messages as the others?

The consumer API allows you to pick a partition.  So if you know that you have 
N partition groups, then you could set up N consumers, each pulling from one 
partition in the group.  You could put a special message on the topic to tell 
the consumer to move to the next partition.  The hardest part will be having 
your producers all switch over at the same time and having one of them put a 
marker message on the topic.  Using time to trigger the switch will have all 
sorts of race conditions.
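
To make the marker idea concrete, here is a rough in-memory sketch in Python.  
It does not talk to Kafka: each partition is a plain list, and MARKER, 
split_epochs and read_group are made-up names for illustration only.  It also 
assumes every partition receives exactly one marker per rotation, which is the 
hard part mentioned above.  With the real Java consumer you would presumably 
use assign() and seek() to do the hopping.

```python
from typing import Iterator, List

MARKER = "__ROTATE__"  # hypothetical marker payload, not a Kafka feature


def split_epochs(log: List[str]) -> List[List[str]]:
    """Split one partition's log into per-epoch segments at each marker."""
    segments: List[List[str]] = [[]]
    for msg in log:
        if msg == MARKER:
            segments.append([])  # a marker closes the current epoch
        else:
            segments[-1].append(msg)
    return segments


def read_group(partitions: List[List[str]], group: int) -> Iterator[str]:
    """Yield one logical group's messages in order, moving to the next
    partition each time a marker is reached."""
    n = len(partitions)
    segments = [split_epochs(log) for log in partitions]
    epoch = 0
    while True:
        current = (group + epoch) % n  # partition holding this group now
        if epoch >= len(segments[current]):
            return  # no data written for this epoch yet
        yield from segments[current][epoch]
        epoch += 1


# Three partitions, one rotation: group 0 starts on partition 0, then moves
# to partition 1 after the marker.
partitions = [
    ["a1", "a2", MARKER, "c3"],
    ["b1", MARKER, "a3", "a4"],
    ["c1", "c2", MARKER, "b2"],
]
group0 = list(read_group(partitions, 0))
```

Here group0 comes out in the order the producer wrote it, even though the 
messages are spread over two partitions.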

-----Original Message-----
From: Matt Andruff [mailto:matt.andr...@gmail.com]
Sent: Wednesday, August 16, 2017 8:41 AM
To: users@kafka.apache.org
Subject: New Partition Strategy for Even Disk Usage

Good Day,

I'm looking for someone to poke holes in my theory.

I want to balance my disk usage across brokers.  I want to maintain order per 
partition.  Yes, there are tools, but they require manual intervention.
What if I created a custom partition strategy?  The strategy is to take the 
existing partitioning strategy but add the ability to rotate the writing of 
partitions by 1.

If             then           then           etc.

A -> A         A -> B         A -> C
B -> B         B -> C         B -> A
C -> C         C -> A         C -> B

The idea is to simply rotate the partition after some measure is reached.  
(Time sounds like the most likely way to do it, but doing it the 'correct way', 
so as to avoid race conditions, would have to be part of the strategy.)  This 
should help ensure that the ordering of the partitions is maintained, but 
probably requires extra logic on the consumer side to undo the partitioning 
strategy and take advantage of the ordering.  This should help with a more 
balanced disk usage on a per topic level.
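
As a rough sketch of the rotation and of the consumer-side "undo" step, 
assuming the existing strategy yields a base partition index and that 
producers and consumers somehow agree on a rotation counter (called epoch 
here; both function names are made up for illustration):

```python
def rotate(base: int, num_partitions: int, epoch: int) -> int:
    """Producer side: shift the base partition by one slot per rotation."""
    return (base + epoch) % num_partitions


def unrotate(physical: int, num_partitions: int, epoch: int) -> int:
    """Consumer side: recover the base partition from the physical one."""
    return (physical - epoch) % num_partitions


names = "ABC"
# Reproduce the second column of the table above (after one rotation).
after_one = [names[rotate(i, 3, 1)] for i in range(3)]
```

after_one lists where A, B and C are written after the first rotation, and 
unrotate reverses the mapping for any epoch as long as the partition count 
does not change.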

Has anyone tried this?  Is there a pitfall I should consider?