The re-discover new consumer member within the group is part of the
Consumer Rebalance protocol that Streams simply relies on. More details can
be found here:
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal
A one sentence summary is that the new consumer wi
> Any reason why you don't just let streams create the changelog topic? Yes
you should partition it the same as the source topic.
Only reason is that I need to use my max.message.bytes and in version
0.10.0.1 configuring the same to state store supplier is not supported.
But I understood that numb
Hi Sachin,
On 3 February 2017 at 09:07, Sachin Mittal wrote:
>
> 1. Now what I wanted to know is that for separate machines running same
> instance of the streams application, my application.id would be same
> right.
>
That is correct.
> If yes then how does kafka cluser know which partition
Hello All,
I am revisiting this topic as now I am actually configuring a partitioned
topic and would like multiple threads of my streams application running on
different instances to process this partitioned topic in parallel.
So I have once source topic partitioned into 40 partitions.
The message
> 1. Do we need to call system exit in UncaughtExceptionHandler.
There is no *need* for it. It really depends what you *want* to do.
System.exit is quite a hard termination of your application. Usually, if
your Streams part of you application dies, you still want to clean up
the other parts of you
Hi,
Thanks for the suggestions. Before running the streams application in a
standby cluster I was trying to get the graceful shutdown right.
I have code something like this
streams.setUncaughtExceptionHandler(new
Thread.UncaughtExceptionHandler() {
public void uncaughtException(
In the scenario you mention above about max.poll.interval.ms, yes if the
timeout was reached then there would be a rebalance and one of the standby
tasks would take over. However the original task may still be processing
the data when the rebalance occurs and would throw an exception when it
tries
Understood.
Also the line
Thanks for the application. It is not clear that clustering depends on how
source topics are partitioned.
Should be read as
Thanks for the explanation. It is now clear that clustering depends on how
source topics are partitioned.
Apologies for auto-correct.
One think I
Hi Sachin,
The KafkaStreams StreamsPartitionAssignor will take care of assigning the
Standby Tasks to the other instances of your Kafka Streams application. The
state store updates are all handled by reading from the change-logs and
updating local copies, there is no communication required between
Hi,
Thanks for the application. It is not clear that clustering depends on how
source topics are partitioned.
In our case I guess num.standby.replicas settings is best suited.
If say I set this to 2 and run two more same application in two different
machines, how would my original instance know in
About failure and restart. Kafka Streams does not provide any tooling
for this. It's a library.
However, because it is a library it is also agnostic to whatever tool
you want to use. You can for example you a resource manager like Mesos
or YARN, or you containerize your application, or you use too
Hi Sachin,
What you have suggested will never happen. If there is only 1 partition
there will only ever be one consumer of that partition. So if you had 2
instances of your streams application, and only a single input partition,
only 1 instance would be processing the data.
If you are running like
Hi,
I followed the document and I have few questions.
Say I have a single partition input key topic and say I run 2 streams
application from machine1 and machine2.
Both the application have same application id are have identical code.
Say topic1 has messages like
(k1, v11)
(k1, v12)
(k1, v13)
(k2,
Hi Sachin,
Some quick answers, and a link to some documentation to read more:
- If you restart the application, it will start from the point it crashed
(possibly reprocessing a small window of records).
- You can run more than one instance of the application. They'll
coordinate by virtue of bei
Hi All,
We were able to run a stream processing application against a fairly decent
load of messages in production environment.
To make the system robust say the stream processing application crashes, is
there a way to make it auto start from the point when it crashed?
Also is there any concept l
15 matches
Mail list logo