How would you start implementing it? Where are you stuck?

Did you already try to implement this?

> On 18. Mar 2018, at 04:10, Dhruv Kumar <gargdhru...@gmail.com> wrote:
> 
> Hi
> 
> I am a CS PhD student at UMN Twin Cities. I am trying to use Flink for 
> implementing some very specific use-cases: (They may not seem relevant but I 
> need to implement them or I at least need to know if it is possible to 
> implement them in Flink)
> 
> Assumptions:
> 1. Data stream is of the form (key, value). We achieve this by the .key 
> operation provided by Flink API.
> 2. By emitting a key, I mean sending/outputting its aggregated value to any 
> data sink. 
> 
> 1. For each Tumbling window in the Event Time space, for each key, I would 
> like to aggregate its value until it crosses a particular threshold (same 
> threshold for all the keys). As soon as the key’s aggregated value crosses 
> this threshold, I would like to emit this key. At the end of every tumbling 
> window, all the (key, value) aggregated pairs  would be emitted irrespective 
> of whether they have crossed the threshold or not.
> 
> 2. For each Tumbling window in the event time space, I would like to maintain 
> a LRU cache which stores the keys along with their aggregated values and 
> their latest arrival time. The least recently used (LRU) key would be the key 
> whose latest arrival time is earlier than the latest arrival times of all the 
> other keys present in the LRU cache. The LRU cache is of a limited size. So, 
> it is possible that the number of unique keys in a particular window is 
> greater than the size of LRU cache. Whenever any (key, value) pair arrives, 
> if the key already exists, its aggregated value is updated with the value of 
> the newly arrived value and its latest arrival time is updated with the 
> current event time. If the key does not exist and there is some free slot in 
> the LRU cache, it is added into the LRU. As soon as the LRU cache gets 
> occupied fully and a new key comes in which does not exist in the LRU cache, 
> we would like to emit the least recently used key to accommodate the newly 
> arrived key. As in the case of 1, at the end of every tumbling window, all 
> the (key, value) aggregated pairs in the LRU cache would be emitted.  
> 
> Would like to know how can we implement these algorithms using Flink. Any 
> help would be greatly appreciated.
> 
> Dhruv Kumar
> PhD Candidate
> Department of Computer Science and Engineering
> University of Minnesota
> www.dhruvkumar.me
> 

Reply via email to