Time is currently tracked per partition, not per key. There is a ticket
to add key-based time tracking already:
https://issues.apache.org/jira/browse/KAFKA-8769

On issue with key-based time tracking is the increased overhead; that's
why time is tracked per partition currently.

The only workaround you can apply is to increase the grace period.
Increasing `max.task.idle.ms` config may also help.


-Matthias

On 9/24/19 12:58 PM, Parthasarathy, Mohan wrote:
> Hi,
> 
> Here is a simple example:
> 
> Application is reading a topic where messages are being received from various 
> clients identified by “client_id”. The messages are grouped by “client_id” 
> and windowedBy 10 seconds with grace period of 5 seconds. The event time for 
> the stream progresses when we receive any message type from any of the 
> client. If one client is stuck sending data for more than the grace period 
> but other clients send data, the time progresses and the data from the client 
> that was stuck may never be processed again.
> 
> I am wondering why the event_time takes effect before the “groupBy”. If the 
> event_time was associated with the window (where there is one per 
> “client_id”), then it would have worked well in this case. Is there any 
> reason for the current design ? Is there any way to solve this problem ?
> 
> Thanks
> Mohan
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to