I agree that there is overhead. The same issue also happens when multiple 
applications produce into a single topic: if one application crashes or stops 
running for some time, then an application consuming from this topic sees a 
similar problem.

Though it might be application specific, are there any guidelines on the 
overhead associated with the grace period, and is it reasonable for it to be 
in hours? I guess there is state in memory and in RocksDB; is there anything 
else to worry about?
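
For concreteness, here is a minimal sketch of the topology I have in mind 
(the topic name "client-events", the store name "client-counts", and the 
string serdes are placeholders; the one-hour grace is the "hours" case I am 
asking about). As far as I understand, the windowed store has to retain 
records for at least window size + grace, so a long grace directly grows the 
RocksDB state:

import java.time.Duration;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.Topology;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.TimeWindows;
import org.apache.kafka.streams.state.WindowStore;

public class LongGraceExample {
    public static Topology build() {
        Duration windowSize = Duration.ofSeconds(10);
        Duration grace = Duration.ofHours(1);  // the long grace in question

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("client-events",
                Consumed.with(Serdes.String(), Serdes.String()))
            // one group per client_id, as in the example quoted below
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            .windowedBy(TimeWindows.of(windowSize).grace(grace))
            .count(Materialized
                .<String, Long, WindowStore<Bytes, byte[]>>as("client-counts")
                // retention must cover window size + grace, so every extra
                // hour of grace keeps an extra hour of windows in the store
                .withRetention(windowSize.plus(grace)));
        return builder.build();
    }
}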

Thanks
Mohan

On 9/25/19, 11:25 PM, "Matthias J. Sax" <matth...@confluent.io> wrote:

    Time is currently tracked per partition, not per key. There is a ticket
    to add key-based time tracking already:
    https://issues.apache.org/jira/browse/KAFKA-8769
    
    One issue with key-based time tracking is the increased overhead; that's
    why time is currently tracked per partition.
    
    The main workaround you can apply is to increase the grace period.
    Increasing the `max.task.idle.ms` config may also help.
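
    For example, a minimal sketch (the application id, bootstrap server, and
    the 30 second value are placeholders you would need to adapt):

    import java.util.Properties;
    import org.apache.kafka.streams.StreamsConfig;

    public class IdleConfigExample {
        public static Properties streamsProps() {
            Properties props = new Properties();
            props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
            props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            // Let each task wait up to 30s for data on all of its input
            // partitions before processing, so stream time is less likely
            // to advance while one partition is temporarily empty.
            props.put(StreamsConfig.MAX_TASK_IDLE_MS_CONFIG, 30_000L);
            return props;
        }
    }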
    
    
    -Matthias
    
    On 9/24/19 12:58 PM, Parthasarathy, Mohan wrote:
    > Hi,
    > 
    > Here is a simple example:
    > 
    > Application is reading a topic where messages are being received from
    > various clients identified by “client_id”. The messages are grouped by
    > “client_id” and windowed via `windowedBy` into 10-second windows with a
    > grace period of 5 seconds. The event time for the stream progresses when
    > we receive any message type from any of the clients. If one client gets
    > stuck and stops sending data for longer than the grace period while the
    > other clients keep sending, time progresses and the data from the client
    > that was stuck may never be processed.
    > 
    > I am wondering why the event_time takes effect before the “groupBy”. If
    > the event_time were associated with the window (of which there is one per
    > “client_id”), then it would have worked well in this case. Is there any
    > reason for the current design? Is there any way to solve this problem?
    > 
    > Thanks
    > Mohan
    > 
    
    
