Guozhang,
Thanks for the clarification. It makes sense. As long as the output hits the
broker, zombies can be detected.
-mohan
On 5/3/21, 2:17 PM, "Guozhang Wang" wrote:
Hello Mohan,
Sorry for getting late on the thread. Just to revive your concerns here: if
in your topology
Hello Mohan,
Sorry for getting late on the thread. Just to revive your concerns here: if
in your topology there's no output at all to any topics (sink topics,
changelog topics), then yes the zombie would not be detected; but on the
other hand the topology itself is not make any visible changes to
Guozhang,
What does this mean if the changelog topic was disabled ? If thread 2 and
thread 4 are running in two different nodes and a rebalance occurs, thread 2
will not realize it is a zombie without the write to the changelog topic, right
? I am trying to understand the cases under which the
Hello Mangat,
I think using persistent store that relies on in-memory stores could help
if the threads are from the same instance.
Guozhang
On Tue, Apr 20, 2021 at 12:54 AM mangat rai wrote:
> Hey Guozhang,
>
> Thanks for creating the issue. Yes, you are right, this will happen only
> with the
Hey Guozhang,
Thanks for creating the issue. Yes, you are right, this will happen only
with the consecutive rebalancing as after some time zombie thread will stop
and re-join the group and the new thread will always overwrite the state
with the latest data. In our poor infra setup, the rebalancing
Hello Mangat,
What you've encountered is a "zombie writer" issue, that is, Thread-2 did
not know there's already a new rebalance and hence its partitions have been
migrated out, until it tries to commit and then got notified of the
illegal-generation error and realize itself is the "zombie" alread
Thanks, Guozhang,
I was knocking myself with Kafka's various consumer rebalancing algorithms
in the last 2 days. Could I generalize this problem as
*Any in-memory state store backed by a changelog topic will always risk
having interleaved writes from two different writers during rebalancing?*
I
Hi Mangat:
I think I found the issue of your problem here.
It seems thread-2's partition was assigned to thread-4 while thread-2 was
not aware (because it missed a rebalance, this is normal scenario); in
other words, thread2 becomes a "zombie". It would stay in that zombie state
until it tried to
Guozhang,
Yes, you are correct. We have our own group processor. I have more
information now.
1. I added ThreadId in the data when the app persists into the changelog
topic.
2. Thread-2 which was working with partition-0 had a timeout issue.
4. Thread-4 picked up this partition-0 as I can see its
Hey Mangat,
A. With at least once, Streams does not make sure atomicity of 1) / 2);
with exactly once, atomicity is indeed guaranteed with transactional
messaging.
B. If you are using processor API, then I'm assuming you did your own
group-by processor right? In that case, the partition key would
Thanks again, that makes things clear. I still have some questions here
then.
A. For each record we read, we do two updates
1. Changelog topic of the state store.
2. Output topic aka sink.
Does the Kafka stream app make sure that either both are committed or
neither?
B. Out In
Hi Mangat,
Please see my replies inline below.
On Thu, Apr 8, 2021 at 5:34 AM mangat rai wrote:
> @Guozhang Wang
>
> Thanks for the reply. Indeed I am finding it difficult to explain this
> state. I checked the code many times. There can be a bug but I fail to see
> it. There are several thing
@Guozhang Wang
Thanks for the reply. Indeed I am finding it difficult to explain this
state. I checked the code many times. There can be a bug but I fail to see
it. There are several things about the Kafka streams that I don't
understand, which makes it harder for me to reason.
1. What is the pa
Hello Mangat,
With at least once, although some records maybe processed multiple times
their process ordering should not be violated, so what you observed is not
expected. What caught my eyes are this section in your output changelogs
(high-lighted):
Key1, V1
Key1, null
Key1, V1
Key1, null (proc
Hey,
We have the following setup in our infrastructure.
1. Kafka - 2.5.1
2. Apps use kafka streams `org.apache.kafka` version 2.5.1 library
3. Low level processor API is used with *atleast-once* semantics
4. State stores are *in-memory* with *caching disabled* and *changelog
enable
15 matches
Mail list logo