Hi Derek,

What max.task.idle.ms does is set a wait time for the stream application to
wait for new messages when one or more
input topics have no messages after a poll. In your case, the application
polls for the first time and finds a message on
topic A ("input") and no messages on topic B ("table"). Since you have
max.task.idle.ms set to 5000, it waits up to five
seconds for messages to arrive on topic B. When you produce your second
message three seconds later, the app will
process the messages.

Kafka Streams will process the messages from both topics in timestamp
order, but since both of your messages have
the same timestamp, my guess is that it's processing the message from the
stream ("input") first and it finds nothing to
join to on "table". My guess is that if you tweak the second message
timestamp to be lower than the timestamp of the
second message you'll get the result you want.

Regards,
Alexandre

On Mon, Aug 29, 2022 at 11:55 AM Derek Mok <derek.mok9...@gmail.com> wrote:

> Hi, I'd like some help with understanding how exactly max.task.idle.ms
> works. I have a topology that consumes from an input topic A, and a join
> operator that enriches the topic A messages with a KTable from a compacted
> topic B. The enriched messages are output to topic C.
>
> If I set max.task.idle.ms to 5000, what should be the expected behaviour
> in
> the following scenario (assuming all messages have the same key and topics
> have same partition count):
>
>    1. Message published to A with timestamp t_0
>    2. Three seconds of wall clock time elapses
>    3. Message published to B with timestamp t_0
>
> My understanding is that a message should be output to topic C containing
> the enriched result after max.task.idle.ms elapses since it should account
> for the late producer of topic B. The join operator should only be invoked
> after max.task.idle.ms has elapsed. However, what actually happens is that
> nothing is published in topic C. I suspect my understanding of
> max.task.idle.ms is not entirely correct so would appreciate any insight
> here!
>
> Here is a small project demonstrating the above
> https://github.com/ThousandEyes-Derek/max-task-idle-ms
>
> MaxTaskIdleStreamsApp contains the streams application that describes the
> topology.
> MessageProducer produces messages into the input and compacted topics based
> on the scenario I described above.
>
> Thanks!
> Derek
>

Reply via email to