I ran into a similar issue. I configured 3 disks, but partitions were
allocated to only 2 of them (disk2 and disk3). Then I found that the disk
left out (disk1) was already hosting a large number of partitions from
other topics. So maybe partition allocation is based on how many
partitions a disk is already hosting (across all topics). That's just my
observation and a guess.
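
To make the guess concrete, here is a minimal sketch of a "pick the
directory with the fewest partitions" rule. It is only a toy model of the
behaviour I observed, not Kafka's actual code; the function names, the
topic name and the starting counts are all made up for illustration.

```python
def pick_log_dir(counts):
    """Return the dir hosting the fewest partitions (ties go to the first dir listed)."""
    return min(counts, key=counts.get)


def place_partitions(counts, topic, num_partitions):
    """Assign each new partition to whichever dir currently holds the fewest."""
    counts = dict(counts)  # work on a copy so the caller's counts are untouched
    placement = {}
    for p in range(num_partitions):
        log_dir = pick_log_dir(counts)
        placement[f"{topic}-{p}"] = log_dir
        counts[log_dir] += 1
    return placement


# Hypothetical starting state: disk1 already hosts many partitions
# from other topics, disk2 and disk3 host one each.
existing = {"disk1": 5, "disk2": 1, "disk3": 1}
placement = place_partitions(existing, "mytopic", 4)
for partition, log_dir in placement.items():
    print(partition, "->", log_dir)
```

With those starting counts, none of the four new partitions lands on disk1,
which matches what I saw: the count is taken over all topics, so a single
topic can end up unevenly spread.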

Regards,
Vijay

On 2 June 2015 at 02:00, Jason Rosenberg <j...@squareup.com> wrote:

> Andrew Otto,
>
> This is a known problem (and one which I have run into as well).  Generally,
> my solution has been to increase the number of partitions so that the
> partition count is much higher than the number of disks, which makes a
> significant imbalance less likely.
>
> I would not recommend explicitly trying to game the system by manually
> moving partitions and recovery files.  You could instead force the
> replicas to be recreated from scratch (e.g. use the partition reassignment
> tool to move them to a new broker and hope for a cleaner distribution).
> Also, I've removed a log dir from the 'log.dirs' list and restarted a
> broker when dealing with a failed disk (this will cause any data on the
> removed log dir to be reassigned elsewhere, and the data will have to
> re-sync from replicas to fully recover).
>
> There is a 'KIP' about this issue, to make JBOD support in Kafka a bit more
> first-class, and I think this would be one of the main issues to solve.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-18+-+JBOD+Support
>
> Jason
>
> On Wed, May 27, 2015 at 5:55 PM, Jonathan Creasy <jonathan.cre...@turn.com>
> wrote:
>
> > I have a similar issue, let me know how it goes. :)
> >
> > -----Original Message-----
> > From: Andrew Otto [mailto:ao...@wikimedia.org]
> > Sent: Wednesday, May 27, 2015 3:12 PM
> > To: users@kafka.apache.org
> > Subject: Kafka partitions unbalanced
> >
> > Hi all,
> >
> > I’ve recently noticed that our broker log.dirs are using up different
> > amounts of storage.  We use JBOD for our brokers, with 12 log.dirs, 1 on
> > each disk.  One of our topics is larger than the others, and has 12
> > partitions.  Replication factor is 3, and we have 4 brokers.  Each broker
> > then has to store 9 partitions for this topic (12*3/4 == 9).
> >
> > I guess I had originally assumed that Kafka would be smart enough to
> > spread partitions for a given topic across each of the log.dirs as evenly
> > as it could.  However, on some brokers this one topic has 2 partitions
> > in a single log.dir, meaning that the storage taken up on a single disk
> > by this topic on those brokers is twice what it should be.
> >
> > e.g.
> >
> > Filesystem      Size  Used Avail Use% Mounted on
> > /dev/sda3       1.8T  1.2T  622G  66% /var/spool/kafka/a
> > /dev/sdb3       1.8T  1.7T  134G  93% /var/spool/kafka/b
> > …
> > $ du -sh /var/spool/kafka/{a,b}/data/webrequest_upload-*
> > 501G    a/data/webrequest_upload-4
> > 500G    b/data/webrequest_upload-11
> > 501G    b/data/webrequest_upload-8
> >
> >
> > This also means that those overpopulated disks have more writes to do.
> > My I/O is imbalanced!
> >
> > This is sort of documented at http://kafka.apache.org/documentation.html:
> >
> > "If you configure multiple data directories partitions will be assigned
> > round-robin to data directories. Each partition will be entirely in one
> > of the data directories. If data is not well balanced among partitions
> > this can lead to load imbalance between disks."
> >
> > But my data is well balanced among partitions!  It’s just that multiple
> > partitions are assigned to a single disk.
> >
> > Anyyyyyyway, on to a question:  Is it possible to move partitions between
> > log.dirs?  Is there tooling to do so?  Poking around in there, it looks
> > like it might be as simple as shutting down the broker, moving the
> > partition directory, and then editing both replication-offset-checkpoint
> > and recovery-point-offset-checkpoint files so that they say the
> > appropriate things in the appropriate directories, and then restarting
> > the broker.
> >
> > Someone tell me that this is a horrible idea. :)
> >
> > -Ao
> >
> >
> >
>
