Hello everybody,

Thank you for the detailed answers. My issue is partly answered here:



*This rule also applies to disk-level, which means that when a set of
partitions assigned to a specific broker, each of the disks will get
the same number of partitions without considering the load of disks at
that time.*
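
To check that I read this rule correctly, here is a minimal sketch of it in
Python. It is only my own illustration and not Kafka's actual code: the
function name and the /data/disk paths are made up, and I'm assuming the
broker simply puts each new partition into the log directory that currently
holds the fewest partitions, without looking at how many bytes are used.

```python
from typing import Dict, List, Tuple


def place_new_partitions(
    partition_counts: Dict[str, int], new_partitions: int
) -> Tuple[Dict[str, int], List[str]]:
    """Assign each new partition to the directory with the fewest partitions.

    Hypothetical helper: only partition counts are compared, disk usage in
    bytes is never consulted.
    """
    counts = dict(partition_counts)  # work on a copy, leave the input alone
    placement: List[str] = []
    for _ in range(new_partitions):
        # sorted() makes ties deterministic (alphabetical order)
        target = min(sorted(counts), key=lambda d: counts[d])
        counts[target] += 1
        placement.append(target)
    return counts, placement


if __name__ == "__main__":
    # Assumed numbers: one disk already holds 70 partitions, the other is new.
    before = {"/data/disk1": 70, "/data/disk2": 0}
    after, placement = place_new_partitions(before, 5)
    print(after)
    print(placement)
```

If that reading is right, partitions created before a disk is added never
move to it on their own; only partitions created afterwards land there.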

I admit I didn't provide enough information either.

So my problem is that an existing topic got a huge surge of events this
week. I knew that would happen, so I modified the partition count in advance.
Unfortunately, it occurred to me only a bit later that I would likely need
some extra disk space, so I added an extra disk to each broker. What I didn't
know is that Kafka won't evenly distribute the partitions across the disks.
So the question still remains:
Is there any way to have Kafka evenly distribute data on its disks?
Also, what options do I have *after* I'm in the situation I described
above (preferably without deleting the topic)?

Thanks!

On Fri, Aug 7, 2020 at 12:00 PM Yingshuan Song <songyingsh...@gmail.com>
wrote:

> Hi Peter,
> I agree with Manoj and Vinicius; I think the following rules led to this
> result:
>
> 1) The partition count N and the replication factor R of a topic determine
> its real partition-replica count, which is N * R.
> 2) Kafka can distribute partitions evenly among brokers, but the
> distribution is based on the broker count at the time the topic was
> created; this is important.
> If we create a topic (N = 4, R = 3) in a Kafka cluster that contains 3
> brokers, then 4 * 3 / 3 = 4 partition replicas will be assigned to each
> broker.
> But if a new broker is added to this cluster and another topic (N = 4,
> R = 3) needs to be created, then 4 * 3 / 4 = 3 partition replicas will be
> assigned to each broker.
> Kafka will not assign all of those partitions to the newly added broker
> even though it is idle, and I think this is a shortcoming of Kafka.
> This rule also applies at the disk level, which means that when a set of
> partitions is assigned to a specific broker, each of the disks will get the
> same number of partitions without considering the load of the disks at that
> time.
> 3) When a producer sends records to a topic, the partition is chosen as
> follows: 3-1) if a record has a key, the partition number is calculated
> from the key; 3-2) if records have no keys, they are sent to the
> partitions in turn. So if there are lots of records with the same key,
> they will all be sent to the same partition and may take up a lot of
> disk space.
>
>
> Hope this helps.
>
> On Fri, Aug 7, 2020 at 6:10 AM Vinicius Scheidegger
> <vinicius.scheideg...@gmail.com> wrote:
>
> > Hi Peter,
> >
> > AFAIK, everything depends on:
> >
> > 1) How you have configured your topic
> >   a) number of partitions (here I understand you have 15 partitions)
> >   b) partition replication configuration (each partition necessarily has
> > a leader, which is primarily responsible for holding the data and for
> > reads and writes); you can configure the topic to have a number of
> > replicas
> > 2) How you publish messages to the topic
> >   a) The publisher is responsible for choosing the partition. This can be
> > done explicitly (by setting the partition id while sending the message to
> > the topic) or implicitly (by using the DefaultPartitioner or any other
> > partitioner scheme).
> >
> > All messages sent to a specific partition will be written first to the
> > leader (meaning that the disk configured for the partition leader will
> > receive the load) and then replicated to the replicas (followers).
> > Kafka does not automatically distribute the data equally to the different
> > brokers - you need to think about your architecture having that in mind.
> >
> > I hope it helps
> >
> > On Thu, Aug 6, 2020 at 10:23 PM Péter Nagykátai <st4r.f1...@gmail.com>
> > wrote:
> >
> > > > I initially started with one data disk (mounted solely to hold Kafka
> > > > data) and recently added a new one.
> > >
> > > On Thu, Aug 6, 2020 at 10:13 PM <manoj.agraw...@cognizant.com> wrote:
> > >
> > > > > What do you mean by "older disk"?
> > > >
> > > > > On 8/6/20, 12:05 PM, "Péter Nagykátai" <st4r.f1...@gmail.com> wrote:
> > > >
> > > > >     Yeah, but it doesn't do that. My "older" disks have ~70 partitions,
> > > > >     the newer ones ~5 partitions. That's why I'm asking what went wrong.
> > > >
> > > > >     On Thu, Aug 6, 2020 at 8:35 PM <manoj.agraw...@cognizant.com> wrote:
> > > >
> > > > >     > Kafka evenly distributes the number of partitions across the disks,
> > > > >     > so in your case every disk should have 3/2 topic partitions.
> > > > >     > It is the producer's job to distribute data evenly across the topic
> > > > >     > partitions by partition key.
> > > > >     > How is the partition key set: is it auto-generated, or is the
> > > > >     > producer sending a key along with the message?
> > > >     >
> > > >     >
> > > > >     > On 8/6/20, 7:29 AM, "Péter Nagykátai" <st4r.f1...@gmail.com> wrote:
> > > >     >
> > > >     >     Hello,
> > > >     >
> > > > >     >     I have a Kafka cluster with 3 brokers (v2.3.0) and each broker
> > > > >     >     has 2 disks attached. I added a new topic (heavyweight) and was
> > > > >     >     surprised that even though the topic has 15 partitions, those
> > > > >     >     weren't distributed evenly on the disks. Thus I got one disk
> > > > >     >     that's almost empty and the other almost filled up. Is there
> > > > >     >     any way to have Kafka evenly distribute data on its disks?
> > > >     >
> > > >     >     Thank you!
> > > >     >
> > > >     >
> > > > >     > This e-mail and any files transmitted with it are for the sole use
> > > > >     > of the intended recipient(s) and may contain confidential and
> > > > >     > privileged information. If you are not the intended recipient(s),
> > > > >     > please reply to the sender and destroy all copies of the original
> > > > >     > message. Any unauthorized review, use, disclosure, dissemination,
> > > > >     > forwarding, printing or copying of this email, and/or any action
> > > > >     > taken in reliance on the contents of this e-mail is strictly
> > > > >     > prohibited and may be unlawful. Where permitted by applicable law,
> > > > >     > this e-mail and other e-mail communications sent to and from
> > > > >     > Cognizant e-mail addresses may be monitored.
> > > >     >
> > > >
> > > >
> > > >
> > >
> >
>
