Never mind. I see it here: http://samza.apache.org/learn/documentation/0.8/container/windowing.html
Thanks again Milinda.

- Shekar

On Fri, Jun 26, 2015 at 11:39 AM, Shekar Tippur <ctip...@gmail.com> wrote:

> Thanks Milinda.
> Is this feature available in the 0.8 version of Samza?
>
> - Shekar
>
> On Fri, Jun 26, 2015 at 11:24 AM, Milinda Pathirage <mpath...@umail.iu.edu>
> wrote:
>
>> Hi Shekar,
>>
>> You can use Samza's local storage
>> (http://samza.apache.org/learn/documentation/0.9/container/state-management.html)
>> to keep the window state, and its windowing
>> (http://samza.apache.org/learn/documentation/0.9/container/windowing.html)
>> capabilities to handle the window advancement. During advancement you can
>> update the local cache (Redis in your case). AFAIK, Samza doesn't provide
>> any helpers or utilities to handle window state maintenance. You have to
>> implement it on top of local storage, or if you don't want fault tolerance
>> you can keep the state in memory too (as long as the state fits in memory).
>>
>> Thanks
>> Milinda
>>
>> On Fri, Jun 26, 2015 at 1:53 PM, Shekar Tippur <ctip...@gmail.com> wrote:
>>
>> > Yan,
>> >
>> > *What do you mean by "a local cache"? Is it a db like MySQL, something
>> > like RocksDB, or even just in-memory?*
>> >
>> > Local cache as in Redis
>> >
>> > *When you say "another topic", is this the topic consumed by the same
>> > Samza job as your 5-minutes-job, or in a separate job? What is the
>> > relation between the topic and the application name*
>> >
>> > We don't have a 5 min job. All we have now is a stream of events coming
>> > from a bunch of applications. All these land on a raw Kafka topic. The
>> > stream data has the application name. I want to create a job that takes
>> > the incoming stream, groups it by application name, and counts the
>> > number of events we get in a 5 min sliding window.
>> >
>> > - Shekar
>> >
>> > On Fri, Jun 26, 2015 at 10:29 AM, Yan Fang <yanfang...@gmail.com> wrote:
>> >
>> > > Hi Shekar,
>> > >
>> > > Need a little more clarification.
>> > >
>> > > What do you mean by "a local cache"? Is it a db like MySQL, something
>> > > like RocksDB, or even just in-memory?
>> > >
>> > > When you say "another topic", is this the topic consumed by the same
>> > > Samza job as your 5-minutes-job, or in a separate job? What is the
>> > > relation between the topic and the application name?
>> > >
>> > > Thanks,
>> > >
>> > > Fang, Yan
>> > > yanfang...@gmail.com
>> > >
>> > > On Fri, Jun 26, 2015 at 1:08 AM, Shekar Tippur <ctip...@gmail.com>
>> > > wrote:
>> > >
>> > > > Hello,
>> > > > My apologies if I have raised it earlier.
>> > > > Here is the use case:
>> > > > I have a stream that is partitioned based on application name. I
>> > > > want to be able to count the number of events happening for that
>> > > > particular application in the past 5 minutes (sliding window) and
>> > > > update either another topic or a local cache.
>> > > >
>> > > > Is this possible via 0.9 version of Samza?
>> > > > If not, what is the easiest way to achieve this?
>> > > >
>> > > > - Shekar
>>
>> --
>> Milinda Pathirage
>>
>> PhD Student | Research Assistant
>> School of Informatics and Computing | Data to Insight Center
>> Indiana University
>>
>> twitter: milindalakmal
>> skype: milinda.pathirage
>> blog: http://milinda.pathirage.org
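
For reference, the approach discussed in this thread (keep per-application window state yourself, and prune it each time the window advances) could be sketched roughly as below. This is only a minimal in-memory illustration in plain Java, not Samza code: the class name SlidingWindowCounter, its methods, and the one-minute bucket size are all assumptions made for the example. In an actual job, record() would presumably be called from the task's process() method and advance() from window(), and the HashMap could be swapped for a Samza key-value store if fault tolerance is needed.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper for illustration; not part of Samza. */
final class SlidingWindowCounter {
    private final long windowMs; // window length, e.g. 5 minutes
    private final long bucketMs; // bucket granularity, e.g. 1 minute

    // application name -> (bucket start time in ms -> events counted in that bucket)
    private final Map<String, TreeMap<Long, Long>> buckets = new HashMap<>();

    SlidingWindowCounter(long windowMs, long bucketMs) {
        this.windowMs = windowMs;
        this.bucketMs = bucketMs;
    }

    /** Count one event; intended to be called once per incoming message. */
    void record(String appName, long eventTimeMs) {
        long bucketStart = eventTimeMs - (eventTimeMs % bucketMs);
        buckets.computeIfAbsent(appName, k -> new TreeMap<>())
               .merge(bucketStart, 1L, Long::sum);
    }

    /**
     * Advance the window to nowMs: drop buckets that ended at or before
     * (nowMs - windowMs) and return the remaining count per application.
     * A bucket that only partly overlaps the window is counted in full,
     * so the result is accurate to bucket granularity.
     */
    Map<String, Long> advance(long nowMs) {
        long cutoff = nowMs - windowMs;
        Map<String, Long> counts = new HashMap<>();
        Iterator<Map.Entry<String, TreeMap<Long, Long>>> it =
                buckets.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, TreeMap<Long, Long>> entry = it.next();
            TreeMap<Long, Long> perApp = entry.getValue();
            // Bucket [start, start + bucketMs) is expired when start <= cutoff - bucketMs.
            perApp.headMap(cutoff - bucketMs, true).clear();
            if (perApp.isEmpty()) {
                it.remove(); // no recent events for this application
                continue;
            }
            long total = 0;
            for (long c : perApp.values()) {
                total += c;
            }
            counts.put(entry.getKey(), total);
        }
        return counts;
    }
}
```

The map returned by advance() is what would then be written to Redis or sent to another topic on each window tick. Smaller buckets give a more precise sliding count at the cost of more state per application.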