Hi Folks,

We are using Flink to capture various interactions of a customer with an
e-commerce store, e.g. product views and order creation. We run a 24-hour
sliding window that slides every 5 minutes, which makes 288 overlapping
windows per tenant. We implement a fold function that maintains several
HashMaps to update customer statistics from the incoming e-commerce events
one by one. As soon as an event arrives, the fold function updates the
statistics in the HashMaps.
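To make the fold step concrete, here is a minimal plain-Java sketch of the
kind of per-event HashMap update described above. The event fields
(customerId, eventType) and the CustomerStats accumulator are hypothetical
names for illustration, not our actual classes:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical accumulator holding per-customer statistics, sketching the
// role our fold state plays inside each window.
public class CustomerStats {
    // counts keyed by customerId (illustrative layout only)
    final Map<String, Long> productViews = new HashMap<>();
    final Map<String, Long> ordersCreated = new HashMap<>();

    // Fold step: update the HashMaps from one incoming event and return
    // the accumulator, as a FoldFunction would.
    public CustomerStats fold(String customerId, String eventType) {
        if ("PRODUCT_VIEW".equals(eventType)) {
            productViews.merge(customerId, 1L, Long::sum);
        } else if ("ORDER_CREATED".equals(eventType)) {
            ordersCreated.merge(customerId, 1L, Long::sum);
        }
        return this;
    }
}
```

In the real job this logic sits inside the window fold, so every open
window keeps one such accumulator as state.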

Considering 1000 Tenants, we have two solutions in mind:

1) Implement one Flink job per tenant, so 1000 tenants would create 1000
Flink jobs.

2) Implement a single Flink job with keyBy on 'tenant' so that each tenant
gets its own windows. But this ends up creating 1000 * 288 open windows at
any point in the 24-hour period, which would put extra load on the single
Flink job.
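The window count above falls out of the sliding-window arithmetic: a window
of size S with slide s keeps S/s overlapping windows open at any instant.
A small sketch of that calculation (method and class names are ours, for
illustration):

```java
// Arithmetic behind the window counts quoted above: a sliding window of
// 24 hours with a 5-minute slide keeps (24*60)/5 = 288 windows open per
// key at any moment.
public class WindowCount {
    public static long windowsPerKey(long windowSizeMin, long slideMin) {
        return windowSizeMin / slideMin;
    }

    public static void main(String[] args) {
        long perTenant = windowsPerKey(24 * 60, 5); // 288 per tenant
        long allTenants = 1000 * perTenant;         // 288000 across tenants
        System.out.println(perTenant + " windows/tenant, "
                + allTenants + " open windows total");
    }
}
```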

What is the recommended approach to handling multitenancy in Flink at such
a large scale, with over 1000 tenants, while storing the fold state for
each event? Solution 1 would require significant effort to keep track of
1000 Flink jobs and to provide resilience.

Thanks.

Best Regards,
