Say I want to calculate the top K users visiting a page over the past 2 hours, updated every 5 minutes.
So here I want to maintain something like this:

Page_01 => {user_01:32, user_02:3, user_03:7...}
...

Basically, a count of the number of times each user visited a page. Here my key is the page name/id and the state is the hashmap. Now, in updateStateByKey I get the previous state and the new events coming *in* to the window. Is there a way to also get the events going *out* of the window? This way I can incrementally update the state over a rolling window. What is the efficient way to do this in Spark Streaming?

Thanks,
Ashish
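
P.S. To make the question concrete, here is a rough sketch in plain Python (not Spark code) of the per-page update I have in mind. The names slide_window, top_k, entering and leaving are made up for illustration, and the sample counts are invented. The "leaving" counts are exactly the piece I don't know how to obtain inside updateStateByKey.

```python
from collections import Counter

def slide_window(state, entering, leaving):
    """Return the per-user visit counts for one page after the window slides once.

    state    -- current counts for the page, e.g. {"user_01": 32, ...}
    entering -- counts from events that just came into the window
    leaving  -- counts from events that just dropped out of the window
    """
    updated = Counter(state)
    updated.update(entering)    # add visits that entered the window
    updated.subtract(leaving)   # remove visits that left the window
    # Drop users whose count fell to zero so the map does not grow forever.
    return Counter({user: n for user, n in updated.items() if n > 0})

def top_k(state, k):
    """Return the k users with the most visits as (user, count) pairs."""
    return state.most_common(k)

# Hypothetical state for one page, then one 5-minute slide of the window.
page_01 = Counter({"user_01": 32, "user_02": 3, "user_03": 7})
page_01 = slide_window(
    page_01,
    entering={"user_02": 5, "user_04": 1},
    leaving={"user_01": 2, "user_03": 7},
)
print(top_k(page_01, 2))
```

With this shape, each slide only touches the users who appear in the entering or leaving batches, instead of recounting the full 2 hours of events.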