Flink's sliding window didn't work well for our use case at SAP because
checkpointing freezes with 288 sliding windows per tenant. Implementing the
sliding window with a tumbling window and a process function reduces the
checkpointing time to a few seconds. We will see how that scales with 1000 or
more tenan
Great!
Thanks for the feedback.
Cheers, Fabian
On Mon, 19 Aug 2019 at 17:12, Ahmad Hassan <
ahmad.has...@gmail.com> wrote:
>
> Thank you Fabian. This works really well.
>
> Best Regards,
>
> On Fri, 16 Aug 2019 at 09:22, Fabian Hueske wrote:
>
>> Hi Ahmad,
>>
>> The ProcessFunction shou
Thank you Fabian. This works really well.
Best Regards,
On Fri, 16 Aug 2019 at 09:22, Fabian Hueske wrote:
> Hi Ahmad,
>
> The ProcessFunction should not rely on new records to come (i.e., do the
> processing in the processElement() method) but rather register a timer every 5
> minutes and perform
Hi Ahmad,
The ProcessFunction should not rely on new records to come (i.e., do the
processing in the processElement() method) but rather register a timer every 5
minutes and perform the processing when the timer fires in onTimer().
Essentially, you'd only collect the data in `processElement()` an
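
The timer-driven approach described above can be sketched in plain Java, without any Flink dependency. This is only an illustration under stated assumptions: the class `TimerDrivenWindow` and its methods `collect()` and `fire()` are hypothetical stand-ins for `processElement()` and `onTimer()`, and the aggregate is assumed to be a simple sum. In a real job, the timer would be registered through the function's timer service instead of being called by hand.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a timer-driven process function (no Flink types involved). */
final class TimerDrivenWindow {
    private final int slots;                                // e.g. 288 slots for 24 h at 5 min each
    private final Deque<Long> closed = new ArrayDeque<>();  // sums of already closed slots
    private long current = 0;                               // sum of the slot still being filled

    TimerDrivenWindow(int slots) {
        if (slots <= 0) {
            throw new IllegalArgumentException("slots must be positive");
        }
        this.slots = slots;
    }

    /** Analogue of processElement(): only buffers the value, never emits. */
    void collect(long value) {
        current += value;
    }

    /** Analogue of onTimer(): closes the current slot, even if it is empty, and returns the sliding sum. */
    long fire() {
        closed.addLast(current);
        current = 0;
        if (closed.size() > slots) {
            closed.removeFirst();   // the oldest slot slides out of the window
        }
        long sum = 0;
        for (long v : closed) {
            sum += v;
        }
        return sum;
    }
}
```

Because `fire()` closes a slot whether or not any value arrived, old data still ages out of the window during periods without events.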
Hi Fabian,
In this case, how do we emit a tumbling window when there are no events?
Otherwise it’s not possible to emulate a sliding window in a process function and
move the ring buffer every 5 minutes when there are no events.
Yes, I can create a periodic source function, but how can it be associated
Hi Fabian,
Thank you. We will look into it now.
Best,
On Fri, 2 Aug 2019 at 12:50, Fabian Hueske wrote:
> Ok, I won't go into the implementation detail.
>
> The idea is to track all products that were observed in the last five
> minutes (i.e., unique product ids) in a five minute tumbling wind
Ok, I won't go into the implementation detail.
The idea is to track all products that were observed in the last five
minutes (i.e., unique product ids) in a five minute tumbling window.
Every five minutes, the observed products are sent to a process function
that collects the data of the last 24 h
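
A rough plain-Java sketch of the idea above, with no Flink dependency: each five-minute window delivers the set of product ids it observed, and a buffer keeps only the windows that fall inside the retention period. The class `RecentProducts`, the method `addWindow()`, and the use of window end timestamps as keys are assumptions made for illustration, not details taken from the thread.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical buffer of per-window product ids, keyed by window end timestamp. */
final class RecentProducts {
    private final long retention;   // same time unit as the window end timestamps
    private final TreeMap<Long, Set<String>> byWindowEnd = new TreeMap<>();

    RecentProducts(long retention) {
        if (retention <= 0) {
            throw new IllegalArgumentException("retention must be positive");
        }
        this.retention = retention;
    }

    /**
     * Stores the ids seen in the window ending at windowEnd, drops windows that
     * ended at or before (windowEnd - retention), and returns the union of the rest.
     * Assumes windows arrive in timestamp order.
     */
    Set<String> addWindow(long windowEnd, Set<String> productIds) {
        byWindowEnd.put(windowEnd, new HashSet<>(productIds));
        byWindowEnd.headMap(windowEnd - retention, true).clear();
        Set<String> union = new HashSet<>();
        for (Set<String> ids : byWindowEnd.values()) {
            union.addAll(ids);
        }
        return union;
    }
}
```

With a retention of 24 hours and five-minute windows, the map never holds more than 288 entries per key.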
Hi Fabian,
Thanks for this detail. However, our pipeline keeps track of the list of
products seen in 24 hours with a 5-minute slide (288 windows).
inStream
.filter(Objects::nonNull)
.keyBy(TENANT)
.window(SlidingProcessingTimeWindows.of(Time.hours(24), Time.minutes(5)))
.trigger(TimeT
Hi Ahmad,
First of all, you need to pre-aggregate the data in a 5 minute tumbling
window. For example, if your aggregation function is count or sum, this is
simple.
You have a 5 min tumbling window that just emits a count or sum every 5
minutes.
The ProcessFunction then has a MapState (called
buff
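
For the count or sum case mentioned above, the buffering logic could look roughly like the following plain-Java sketch. It does not use Flink's state API; `SlidingSumBuffer` and `add()` are hypothetical names, and a simple in-memory deque stands in for whatever state the process function would actually keep.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical helper: keeps the last N five-minute partial sums and their running total. */
final class SlidingSumBuffer {
    private final int capacity;                              // e.g. 288 for 24 h of 5-minute windows
    private final Deque<Long> buckets = new ArrayDeque<>();
    private long total = 0;

    SlidingSumBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        this.capacity = capacity;
    }

    /** Adds one tumbling-window result, evicts the oldest one if needed, and returns the sliding total. */
    long add(long partial) {
        buckets.addLast(partial);
        total += partial;
        if (buckets.size() > capacity) {
            total -= buckets.removeFirst();
        }
        return total;
    }
}
```

Keeping a running total means each five-minute update costs one addition and at most one subtraction, instead of re-summing all 288 buckets.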
Hi Fabian,
> On 4 Jul 2018, at 11:39, Fabian Hueske wrote:
>
> - Pre-aggregate records in a 5 minute Tumbling window. However,
> pre-aggregation does not work for FoldFunctions.
> - Implement the window as a custom ProcessFunction that maintains a state of
> 288 events and aggregates and ret
Hi Chesnay,
Yes, this is something we would eventually be doing, and then maintaining the
configuration of which tenants are mapped to which Flink jobs.
This would reduce the number of Flink jobs to maintain in order to support
1000s of tenants in our use case.
Thanks.
On Wed, 4 Jul 2018 at 12:
Would it be feasible for you to partition your tenants across jobs, like
for example 100 customers per job?
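
One simple way to picture such a partitioning is to split the tenant list into consecutive groups, one group per job. The sketch below is plain Java and purely illustrative: `TenantPartitioner` and `partition()` are hypothetical names, and fixed-size chunking is just one possible assignment scheme.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assigns tenants to jobs in fixed-size groups. */
final class TenantPartitioner {
    private TenantPartitioner() {}

    /** Splits tenants into consecutive groups of at most tenantsPerJob; each group maps to one job. */
    static List<List<String>> partition(List<String> tenants, int tenantsPerJob) {
        if (tenantsPerJob <= 0) {
            throw new IllegalArgumentException("tenantsPerJob must be positive");
        }
        List<List<String>> jobs = new ArrayList<>();
        for (int start = 0; start < tenants.size(); start += tenantsPerJob) {
            int end = Math.min(start + tenantsPerJob, tenants.size());
            jobs.add(new ArrayList<>(tenants.subList(start, end)));
        }
        return jobs;
    }
}
```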
On 04.07.2018 12:53, Ahmad Hassan wrote:
Hi Fabian,
A one-job-per-tenant model soon becomes hard to maintain. For example,
1000 tenants would require 1000 Flink jobs, and providing HA and resili
Hi Fabian,
A one-job-per-tenant model soon becomes hard to maintain. For example, 1000
tenants would require 1000 Flink jobs, and providing HA and resilience for 1000
jobs is not trivial.
This is why we are hoping to get a single Flink job handling all the tenants
through keyBy on tenant. However t
Hi Ahmad,
Some tricks that might help to bring down the effort per tenant if you run
one job per tenant (or key per tenant):
- Pre-aggregate records in a 5 minute Tumbling window. However,
pre-aggregation does not work for FoldFunctions.
- Implement the window as a custom ProcessFunction that mai
Hi Folks,
We are using Flink to capture various interactions of a customer with an
e-commerce store, e.g. product views and orders created. We run a 24-hour sliding
window with a 5-minute slide, which makes 288 parallel windows for a single
tenant. We implement a fold method that has various hashmaps to update the
s