From your description of the problem, I'd say yes, this looks like a
typical scenario for Kafka Streams. Kafka Streams also supports
exactly-once semantics as of 0.11.
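
To make the "batch by tenant over a time window" idea concrete, here is a
minimal sketch in plain Java with no Kafka dependency. It only illustrates
the bucketing that a tumbling time window keyed by tenant performs; the
class, field, and method names are all made up for illustration, and it
assumes non-negative millisecond timestamps.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TenantWindowBatcher {

    /** One incoming event. All names here are hypothetical. */
    public static final class Event {
        final String tenant;
        final long timestampMs;
        final String payload;

        public Event(String tenant, long timestampMs, String payload) {
            this.tenant = tenant;
            this.timestampMs = timestampMs;
            this.payload = payload;
        }
    }

    /** Start of the tumbling window containing timestampMs (assumed >= 0). */
    public static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - (timestampMs % windowSizeMs);
    }

    /** Groups payloads under "tenant@windowStart", keeping arrival order. */
    public static Map<String, List<String>> batch(List<Event> events, long windowSizeMs) {
        Map<String, List<String>> batches = new LinkedHashMap<>();
        for (Event e : events) {
            String key = e.tenant + "@" + windowStart(e.timestampMs, windowSizeMs);
            batches.computeIfAbsent(key, k -> new ArrayList<>()).add(e.payload);
        }
        return batches;
    }

    public static void main(String[] args) {
        long oneMinute = 60_000L;
        List<Event> events = new ArrayList<>();
        events.add(new Event("tenantA", 5_000L, "a1"));
        events.add(new Event("tenantB", 10_000L, "b1"));
        events.add(new Event("tenantA", 59_000L, "a2"));
        events.add(new Event("tenantA", 61_000L, "a3"));
        // Each map entry would become one API request for that tenant and window.
        batch(events, oneMinute).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

In Kafka Streams the grouping and the per-window state would be handled by
the library rather than an in-memory map. One caveat, as far as I know:
the exactly-once guarantee covers processing within Kafka, so the request
to the third-party service would probably still need to be idempotent or
deduplicated on retry.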

Cheers
Eno

> On 12 Jul 2017, at 17:06, Stephen Powis <spo...@salesforce.com> wrote:
> 
> Hey! I was hoping I could get some input from people more experienced with
> Kafka Streams to determine if it'd be a good solution for my use case.
> 
> I have multi-tenant clients submitting data to a Kafka topic that they want
> ETL'd to a third party service.  I'd like to batch and group these by
> tenant over a time window, somewhere between 1 and 5 minutes.  At the end
> of each time window I'd then issue an API request to the third party
> service for each tenant, sending the batch of data over.
> 
> Other points of note:
> - Ideally we'd have exactly-once semantics; sending data multiple times
> would typically be bad.  But we'd need to gracefully handle things like API
> request errors / service outages.
> 
> - We currently use Storm for doing stream processing, but the long running
> time-windows and potentially large amount of data stored in memory make me
> a bit nervous to use it for this.
> 
> Thoughts?  Thanks in Advance!
> Stephen
