You can use Kafka Streams:
https://kafka.apache.org/25/documentation/streams/

The use case you describe sounds like a perfect fit it. You would need
to use the `suppress()` operator:
https://kafka.apache.org/25/documentation/streams/developer-guide/dsl-api.html#window-final-results

>> My only concern is that consumer lag will be non-zero most of the time.

In fact, using Kafka Streams, your lag won't be non-zero most if the
time, because KS will maintain your intermediate result in a
fault-tolerant manner and it will commit in-between.

You could even get exactly-once processing if you chose to write you
result back into a Kafka topic (and enable exactly-once), and let the
external service consume the result topic.


-Matthias


On 7/12/20 3:03 AM, Alexey Vasiliev wrote:
> 
> Hi,
>  
> I’m trying to use kafka for the following use case:
> *  My app receive events from external source;
> *  It aggregates it during some window (say, 10 minutes);
> *  Every 10 minutes it writes this aggregated data to external service.
> I was thinking about using kafka for this, so I’ll write incoming events to a 
> kafka topic and then read events back, aggregating it and commiting only 
> after writing to external service occurs (so I’ll get at-least-once delivery).
> My only concern is that consumer lag will be non-zero most of the time.
> Is it an issue? Is there a better ways to aggregate data from kafka topic 
> with time window?
> Thanks.
>  
>  
> 

Attachment: signature.asc
Description: OpenPGP digital signature

Reply via email to