Hello Ankit,
Kafka Streams' rebalance protocol tries to balance workloads based on
num.partitions (more specifically, on num.tasks, which is derived from
the input partitions), not on num.messages or num.bytes, so it
cannot handle data skew across partitions unf
Hello kafka-users,
I have 50 topics, each with 32 partitions where data is being ingested
continuously.
Data is being published in these 50 partitions externally (no control)
which causes data skew among the partitions of each topic.
For example: For topic-1, partition-1 contains 100 events, wh
Hi Guozhang,
Thanks for the suggestions below. I believe we got past the REBALANCING
issue. However, we are running into a significant memory usage issue. I
will open a separate thread for this.
1) During punctuate we need to perform certain tasks, and this was
exceeding the consumer reques
Hello Siva,
To better understand your situation, I'd need to ask a few more questions:
1) What triggers your REBALANCING event?
2) Does your application contain any states? If yes, how are they
configured (persistent or in-memory, is logging enabled, etc)?
3) What is your commit interval config
Hi,
Kafka version 1.0.0 (can't upgrade to another version yet due to legacy
dependency)
The stream application uses the low-level Processor API and maintains state. A
topic is set up with 30 partitions, and I split the work across 2 stream application
instances consuming the same topic, each with 15 threads.
< M since it will result in idle
threads.
Guozhang
On Tue, Mar 21, 2017 at 4:56 PM, Matthias J. Sax
wrote:
> Hi,
>
> I guess, it's currently not possible to load balance between different
> machines. It might be a nice optimization to add into Streams though.
>
> Right now, y
Hi,
I guess, it's currently not possible to load balance between different
machines. It might be a nice optimization to add into Streams though.
Right now, you should reduce the number of threads. Load balancing is
based on threads, and thus, if Streams places tasks on all threads of one
ma
Hey,
I have a typical scenario of a kafka-streams application in a production
environment.
We have a kafka-cluster with multiple topics. Messages from one topic are being
consumed by the kafka-streams application. The topic currently has 9
partitions. We have configured consumer thread coun
and have set
>auto.leader.rebalance.enabled to true or false, you might not need to do
>further workload balance.
>
>However, in most cases you probably still need to do some sort of load
>balancing based on the traffic and disk utilization of each broker. You
>might want to do leader migration and/or partition re
If you have pretty balanced traffic on each partition and have set
auto.leader.rebalance.enabled to true or false, you might not need to do
further workload balance.
However, in most cases you probably still need to do some sort of load
balancing based on the traffic and disk utilization of each
not leader for a particular
partition will eventually be in-sync with the leader for a particular
partition. So, I don't think you need to worry about sending your messages
to VIP and having to direct where messages end up with manual
load-balancing, even if your messages are assigned to a par
Hi all,
Do I need to load balance against the brokers? I am using the Python
driver and it seems to only want a single Kafka broker host. However, in a
situation where I have 10 brokers, is it still fine to just give it one
host? Do ZooKeeper and Kafka handle the load balancing and redirect
There are two algorithms: range and round robin.
The range algorithm balances each topic independently.
Round robin balances across all the topics the consumer is consuming from.
Jiangjie (Becket) Qin
On 3/2/15, 2:05 AM, "sunil kalva" wrote:
>Is kafka load balancing based
Is Kafka load balancing based on the number of partitions of a topic or the number
of partitions of all topics in a cluster?
--
SunilKalva
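As a rough sketch of the difference between the two algorithms named in this thread, the simulation below assigns partitions with a per-topic range strategy and with a round-robin strategy over all topics. It is not Kafka's assignor code. The function names, the consumer ids c1..c4, and the scenario of four single-partition topics are all assumptions chosen for illustration.

```python
def range_assign(topics, consumers):
    """Per-topic range assignment (illustrative, not Kafka's implementation).

    Each topic is handled independently: its partitions are cut into
    contiguous ranges over the sorted consumer list.
    """
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    for topic, num_partitions in sorted(topics.items()):
        per_consumer, extra = divmod(num_partitions, len(members))
        start = 0
        for i, member in enumerate(members):
            # The first `extra` consumers take one additional partition.
            count = per_consumer + (1 if i < extra else 0)
            for p in range(start, start + count):
                assignment[member].append((topic, p))
            start += count
    return assignment


def round_robin_assign(topics, consumers):
    """Round robin over the partitions of all topics taken together."""
    members = sorted(consumers)
    assignment = {m: [] for m in members}
    all_partitions = [
        (topic, p)
        for topic, n in sorted(topics.items())
        for p in range(n)
    ]
    for i, tp in enumerate(all_partitions):
        assignment[members[i % len(members)]].append(tp)
    return assignment


# Hypothetical scenario: four topics with one partition each, four consumers.
topics = {"t1": 1, "t2": 1, "t3": 1, "t4": 1}
consumers = ["c1", "c2", "c3", "c4"]

range_result = range_assign(topics, consumers)
rr_result = round_robin_assign(topics, consumers)

print("range:      ", range_result)
print("round robin:", rr_result)
```

In this scenario the range strategy hands every partition to the first consumer, because each topic is balanced on its own, while round robin gives each consumer one partition.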
What's the output of the ConsumerOffsetChecker tool?
Thanks,
Jun
On Tue, Oct 28, 2014 at 7:31 AM, Natarajan, Murugavel <
murugavel.natara...@softwareag.com> wrote:
> Hi,
>
> I have the following Kafka Setup
> Number of producer : 1
> Number of topics : 1
> Number of partitions : 2
> Number of c
Hi,
I have the following Kafka Setup
Number of producer : 1
Number of topics : 1
Number of partitions : 2
Number of consumers : 3 (with same group id)
Number of Kafka cluster : none(single Kafka server)
Zookeeper.session.timeout : 1000
Producer produces messages without any specific partitioning
With SimpleConsumer, you will have to handle leader discovery as well as
zookeeper based rebalancing. You can see an example here -
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example
On Wed, Oct 8, 2014 at 11:45 AM, Sharninder wrote:
> Thanks Gwen. This really helped.
Thanks Gwen. This really helped.
Yes, Kafka is the best thing ever :)
Now how would this be done with the Simple consumer? I'm guessing I'll have
to maintain my own state in Zookeeper or something of that sort?
On Thu, Oct 9, 2014 at 12:01 AM, Gwen Shapira wrote:
> Here's an example (from Con
Here's an example (from ConsumerOffsetChecker tool) of 1 topic (t1)
and 1 consumer group (flume), each of the 3 topic partitions is being
read by a different machine running the flume consumer:
Group  Topic  Pid  Offset  logSize  Lag  Owner
flume
yep. exactly.
On Wed, Oct 8, 2014 at 11:07 AM, Sharninder wrote:
> Thanks Gwen.
>
> When you're saying that I can add consumers to the same group, does that
> also hold true if those consumers are running on different machines? Or in
> different JVMs?
>
> --
> Sharninder
>
>
> On Wed, Oct 8, 2014
If you use the high level consumer implementation, and register all
consumers as part of the same group - they will load-balance
automatically.
When you add a consumer to the group, if there are enough partitions
in the topic, some of the partitions will be assigned to the new
consumer.
When a con
Thanks Gwen.
When you're saying that I can add consumers to the same group, does that
also hold true if those consumers are running on different machines? Or in
different JVMs?
--
Sharninder
On Wed, Oct 8, 2014 at 11:35 PM, Gwen Shapira wrote:
> If you use the high level consumer implementati
Hi,
I'm not even sure if this is a valid use-case, but I really wanted to run
it by you guys. How do I load balance my consumers? For example, if my
consumer machine is under load, I'd like to spin up another VM with another
consumer process to keep reading messages off any topic. On similar lines
Currently, we distribute partitions to consumers on a per topic basis. So,
in your case, consumer 1 will get all the data.
Thanks,
Jun
On Tue, Jun 3, 2014 at 3:11 PM, Weide Zhang wrote:
> Hi,
>
> I have a question regarding load balancing within a consumer group.
>
> Say I
Hi,
I have a question regarding load balancing within a consumer group.
Say I have a consumer group of 4 consumers which subscribe to 4 topics,
each of which has one partition. Will rebalancing happen at the
topic level? Or should I expect consumer 1 to have all the data?
Weide
The behavior that you described is explained here -
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyisdatanotevenlydistributedamongpartitionswhenapartitioningkeyisnotspecified
?
Thanks,
Neha
On Mon, Mar 17, 2014 at 6:26 PM, Abhinav Anand wrote:
> Hi,
> I am using the kafka produc
Hi,
I am using the Kafka producer 0.8. Each producer seems to be sending
messages only to a specific broker until a metadata refresh. I also find each
producer thread connected to only one broker at a time.
I had read that the producer sends messages in round-robin fashion. Is there
some specific configura
The consumer load balancing logic today is pretty simple. It just tries to
divide the partitions evenly among the consumers. It doesn't try to balance
by load.
Thanks,
Jun
On Thu, Nov 14, 2013 at 3:34 PM, hsy...@gmail.com wrote:
> Hi,
>
> I have questions about the load bala
Hi,
I have questions about the load balancing of the Kafka high-level consumer.
Suppose I have 4 partitions,
and the producer throughput to these 4 partitions is like this:
0        1        2        3
10MB/s   10MB/s   1MB/s    1MB/s
1kMsg/s
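To make the point of this thread concrete, here is a minimal sketch of what dividing partitions evenly by count does to the throughput figures in the question. It reads the partition ids as 0 to 3 and assumes two consumers; the consumer names and the `split_evenly` helper are hypothetical and are not the high-level consumer's actual code.

```python
def split_evenly(partitions, consumers):
    """Give each consumer a contiguous, near-equal share by partition count.

    Load (bytes or messages per second) is deliberately ignored, which is
    the behaviour described in the reply above.
    """
    per_consumer, extra = divmod(len(partitions), len(consumers))
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per_consumer + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment


# Producer throughput per partition in MB/s, taken from the question.
rates = {0: 10.0, 1: 10.0, 2: 1.0, 3: 1.0}
# Assumption: a group of two consumers.
consumers = ["consumer-a", "consumer-b"]

assignment = split_evenly(sorted(rates), consumers)
load = {c: sum(rates[p] for p in parts) for c, parts in assignment.items()}

for c in consumers:
    print(c, "partitions", assignment[c], "load", load[c], "MB/s")
```

Both consumers own two partitions, yet under these assumptions one carries 20 MB/s and the other 2 MB/s, so an even partition count does not imply an even load.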
1, 3. Take a look at request.required.acks in
http://kafka.apache.org/documentation.html#producerconfigs
2. The producer does random distribution by default. However, you can
provide a partitioning key and a partitioning function. For details on how
consumer load balancing works, see
http
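The contrast between random distribution and a partitioning key with a partitioning function can be sketched as follows. This is an illustration only: the function names are hypothetical, CRC32 is a stand-in hash picked for the example, and neither function reproduces the Kafka producer's real partitioner.

```python
import random
import zlib


def random_partition(num_partitions, rng):
    """Keyless send: pick any partition at random."""
    return rng.randrange(num_partitions)


def keyed_partition(key, num_partitions):
    """Keyed send: hash the key so equal keys always map to one partition.

    CRC32 is an assumption made for this sketch, not Kafka's hash.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


NUM_PARTITIONS = 4          # hypothetical topic size
rng = random.Random(0)      # seeded only to make the demo repeatable

keyless = [random_partition(NUM_PARTITIONS, rng) for _ in range(5)]
keyed = [keyed_partition("user-42", NUM_PARTITIONS) for _ in range(5)]

print("keyless sends went to partitions:", keyless)
print("keyed sends went to partitions:  ", keyed)
```

Every message with the same key lands on the same partition, which preserves per-key ordering but means a hot key can skew one partition.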
Hi,
The following needs more elaboration after reading the Kafka docs:
1- When a leader fails over, what happens to messages that
are not committed to other followers, and to the messages that the producer keeps
sending (in async mode) until a new leader is elected? Can the producer
buffer thes
Thanks Guozhang.
On 22 August 2013 19:11, Guozhang Wang wrote:
> Hello Michal,
>
> This FAQ entry may help you understanding the rebalance logic:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-CanIpredicttheresultsoftheconsumerrebabalance%3F
>
> In a word, since we use a determ
Hello Michal,
This FAQ entry may help you understand the rebalance logic:
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-CanIpredicttheresultsoftheconsumerrebabalance%3F
In a word, since we use a deterministic range partition in rebalance,
uneven or customized partition assignmen
Hi, is it possible, or has anybody tried or needed, to balance partitions
between consumers unevenly or based on some custom function? Ideally with
Kafka 0.7
Michal Haris
Jan,
My understanding is that most C++ client implementations use a broker list
instead of ZK for load balancing.
Thanks,
Jun
On Wed, Aug 7, 2013 at 8:06 AM, Jan Rudert wrote:
> Hi,
>
> I am starting with kafka. We use version 0.7.2 currently. Does anyone know
> wether automa
Hi,
I am starting with Kafka. We use version 0.7.2 currently. Does anyone know
whether automatic producer load balancing based on ZooKeeper is supported by
the C++ client?
Thank you!
-- Jan
Yes.
Thanks,
Jun
On Wed, Jul 10, 2013 at 10:55 PM, Ryan Chan wrote:
> We are already using "zk.connect" to connect zookeeper and registered
> multiple brokers (same topic/partitions), so when a consumer request ZK, is
> load balancing already done?
>
>
>
> Thanks
>
We are already using "zk.connect" to connect to ZooKeeper and have registered
multiple brokers (same topic/partitions), so when a consumer makes a request to ZK, is
load balancing already done?
Thanks
>When the producer
>tries to send to the old broker (which is either dead, or a slave now),
>the broker will either not respond, or the response will contain an error
>code.
In this case, the broker sends a response with an error code to the
producer, and then the producer retries the metadata r
rtain partition according to its key word. That is to say, a certain
>must be sent to a fixed partition on a fixed broker. How the so called
>load balancing works?
>
>Best Regards
44 matches