Nicolas,

> I set the maintain duration of the window to 30 days.
> If it consumes a message older than 30 days, then a new aggregate is
> created for this old window.
I assume you mean: a message that should have been included in the original
("old") window arrives late, i.e. after those 30 days have passed, and a new
aggregate is then created for this old window? I wanted to ask this first
because answering your questions depends on what exactly you mean here.

> The problem is that this old windowed aggregate is of course incomplete and
> will overwrite a record in the final database.

Not sure I understand -- why would the old windowed aggregate be incomplete?
Could you explain a bit more what you mean?

> By the way, is there any article about replaying old messages. Some tips
> and tricks, like "you'd better do that in another deployment of your
> topology", and/or "you'd better use topics dedicated to repair".

I am not aware of a deep-dive article or docs on that just yet. There's a
first blog post [1] about Kafka's new Application Reset Tool that goes in
this direction, but it is only a first step toward replaying/reprocessing
old messages. Do you have specific questions here that we can help you with
in the meantime?

[1] http://www.confluent.io/blog/data-reprocessing-with-kafka-streams-resetting-a-streams-application

On Thu, Oct 20, 2016 at 9:40 AM, Nicolas Fouché <nfou...@onfocus.io> wrote:

> Hi,
>
> I aggregate some data with `aggregateByKey` and a `TimeWindows`.
>
> I set the maintain duration of the window to 30 days.
> If it consumes a message older than 30 days, then a new aggregate is
> created for this old window.
> The problem is that this old windowed aggregate is of course incomplete and
> will overwrite a record in the final database.
>
> So is there a way to dismiss these old messages ?
>
> I only see the point of accepting old messages when the topology is
> launched in "repair" mode.
> By the way, is there any article about replaying old messages. Some tips
> and tricks, like "you'd better do that in another deployment of your
> topology", and/or "you'd better use topics dedicated to repair".
>
> Thanks
> Nicolas
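
P.S. To make sure we're talking about the same thing, here is a rough sketch
of the kind of setup I understand you to have, assuming the 0.10.0.x DSL. The
topic name, key/value types, serdes and the 1-day window size are made up for
illustration; the `.until()` value is the 30-day maintain duration you mention.

    import java.util.concurrent.TimeUnit;

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KStreamBuilder;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.TimeWindows;
    import org.apache.kafka.streams.kstream.Windowed;

    public class WindowedAggregationSketch {

      public static void main(String[] args) {
        KStreamBuilder builder = new KStreamBuilder();

        // Hypothetical input topic with String keys and Long values.
        KStream<String, Long> events =
            builder.stream(Serdes.String(), Serdes.Long(), "events");

        // 1-day windows, maintained for 30 days via until(). Once a window's
        // state has been dropped after those 30 days, a late-arriving record
        // re-creates the window with a fresh aggregate -- which, if I read
        // you correctly, is the behavior you are describing.
        KTable<Windowed<String>, Long> totals = events.aggregateByKey(
            () -> 0L,                                     // initializer
            (key, value, aggregate) -> aggregate + value, // aggregator
            TimeWindows.of("totals", TimeUnit.DAYS.toMillis(1))
                       .until(TimeUnit.DAYS.toMillis(30)),
            Serdes.String(),
            Serdes.Long());

        // (StreamsConfig setup and KafkaStreams startup omitted.)
      }
    }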
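
P.P.S. In case you want to experiment with the Application Reset Tool from [1]
in the meantime: invoking it looks roughly like the following (application id,
topic names and hosts are placeholders; please check the blog post for the
exact options available in your version). As described there, you would
additionally call KafkaStreams#cleanUp() in the application itself to wipe its
local state stores.

    bin/kafka-streams-application-reset.sh --application-id my-streams-app \
                                           --input-topics my-input-topic \
                                           --intermediate-topics my-rekeyed-topic \
                                           --bootstrap-servers localhost:9092 \
                                           --zookeeper localhost:2181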