Never mind. I see it here:

http://samza.apache.org/learn/documentation/0.8/container/windowing.html

Thanks again Milinda.

- Shekar

On Fri, Jun 26, 2015 at 11:39 AM, Shekar Tippur <ctip...@gmail.com> wrote:

> Thanks Milinda.
> Is this feature available on 0.8 version of Samza?
>
> - Shekar
>
> On Fri, Jun 26, 2015 at 11:24 AM, Milinda Pathirage <mpath...@umail.iu.edu> wrote:
>
>> Hi Shekar,
>>
>> You can use Samza's local storage
>> (http://samza.apache.org/learn/documentation/0.9/container/state-management.html)
>> to keep the window state, and its windowing capabilities
>> (http://samza.apache.org/learn/documentation/0.9/container/windowing.html)
>> to handle the window advancement. During advancement you can
>> update the local cache (Redis in your case). AFAIK, Samza doesn't provide
>> any helpers or utilities to handle window state maintenance. You have to
>> implement it on top of local storage or, if you don't want fault tolerance,
>> you can keep the state in memory too (as long as the state fits in memory).
>>
>> Thanks
>> Milinda
>>
>> On Fri, Jun 26, 2015 at 1:53 PM, Shekar Tippur <ctip...@gmail.com> wrote:
>>
>> > Yan,
>> >
>> >
>> > *What do you mean by "a local cache"? Is it a db like MySQL, something
>> > like RocksDB, or even just in-memory?*
>> >
>> > Local cache as in Redis
>> >
>> >
>> >
>> > *When you say "another topic", is this the topic consumed by the same
>> > Samza job as your 5-minutes-job, or in a separate job? What is the
>> > relation between the topic and the application name?*
>> >
>> > We don't have a 5 min job. All we have now is a stream of events coming from
>> > a bunch of applications. All these land on a raw Kafka topic. The stream
>> > data has the application name. I want to create a job that takes the incoming
>> > stream, groups it by application name, and counts the number of events we
>> > get in a 5 min sliding window.
>> >
>> > - Shekar
>> >
>> > On Fri, Jun 26, 2015 at 10:29 AM, Yan Fang <yanfang...@gmail.com> wrote:
>> >
>> > > Hi Shekar,
>> > >
>> > > Need a little more clarification.
>> > >
>> > > What do you mean by "a local cache"? Is it a db like MySQL, something like
>> > > RocksDB, or even just in-memory?
>> > >
>> > > When you say "another topic", is this the topic consumed by the same Samza
>> > > job as your 5-minutes-job, or in a separate job? What is the relation
>> > > between the topic and the application name?
>> > >
>> > > Thanks,
>> > >
>> > > Fang, Yan
>> > > yanfang...@gmail.com
>> > >
>> > > On Fri, Jun 26, 2015 at 1:08 AM, Shekar Tippur <ctip...@gmail.com> wrote:
>> > >
>> > > > Hello,
>> > > > My apologies if I have raised it earlier.
>> > > > Here is the use case:
>> > > > I have a stream that is partitioned based on application name. I want to be
>> > > > able to count the number of events happening for that particular
>> > > > application in the past 5 minutes (sliding window) and update either
>> > > > another topic or a local cache.
>> > > >
>> > > > Is this possible via 0.9 version of Samza?
>> > > > If not, what is the easiest way to achieve this?
>> > > >
>> > > > - Shekar
>> > > >
>> > >
>> >
>>
>>
>>
>> --
>> Milinda Pathirage
>>
>> PhD Student | Research Assistant
>> School of Informatics and Computing | Data to Insight Center
>> Indiana University
>>
>> twitter: milindalakmal
>> skype: milinda.pathirage
>> blog: http://milinda.pathirage.org
>>
>
>
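
The approach Milinda describes above (keep per-application window state yourself, and evict old data each time the window advances) could be sketched roughly as follows. This is plain Java, not Samza API: the class SlidingWindowCounter and its methods record and advance are hypothetical names made up for illustration. It assumes timestamps arrive in non-decreasing order per application, keeps state in memory only (so it is lost on restart), and counts at bucket granularity, so a result may include events up to one bucket older than the window.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

/** Hypothetical helper: per-application event counts over a sliding window. */
final class SlidingWindowCounter {
    private final long windowMs;
    private final long bucketMs;
    // application name -> buckets, oldest first; each bucket is {bucketStartMs, count}
    private final Map<String, Deque<long[]>> buckets = new HashMap<>();

    SlidingWindowCounter(long windowMs, long bucketMs) {
        if (bucketMs <= 0 || windowMs < bucketMs) {
            throw new IllegalArgumentException("need 0 < bucketMs <= windowMs");
        }
        this.windowMs = windowMs;
        this.bucketMs = bucketMs;
    }

    /** Record one event for an application; intended to be called once per message. */
    void record(String app, long timestampMs) {
        long start = timestampMs - (timestampMs % bucketMs);
        Deque<long[]> q = buckets.computeIfAbsent(app, k -> new ArrayDeque<>());
        long[] last = q.peekLast();
        if (last != null && last[0] == start) {
            last[1]++;                          // same bucket as the previous event
        } else {
            q.addLast(new long[] {start, 1L});  // first event in a new bucket
        }
    }

    /** Drop expired buckets and return the current count per application. */
    Map<String, Long> advance(long nowMs) {
        Map<String, Long> counts = new HashMap<>();
        Iterator<Map.Entry<String, Deque<long[]>>> it = buckets.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Deque<long[]>> entry = it.next();
            Deque<long[]> q = entry.getValue();
            // A bucket covers [start, start + bucketMs); it is expired once it
            // ends at or before the start of the window (nowMs - windowMs).
            while (!q.isEmpty() && q.peekFirst()[0] + bucketMs <= nowMs - windowMs) {
                q.pollFirst();
            }
            if (q.isEmpty()) {
                it.remove();                    // no recent events for this application
                continue;
            }
            long total = 0;
            for (long[] bucket : q) {
                total += bucket[1];
            }
            counts.put(entry.getKey(), total);
        }
        return counts;
    }
}
```

In a Samza job, one plausible wiring would be to call record from StreamTask.process for each incoming message and advance from WindowableTask.window, then write the returned counts to Redis or to another topic. For a 5 minute window refreshed every minute, the counter would be created as new SlidingWindowCounter(300000L, 60000L) with the window interval configured to match the bucket size. For fault tolerance, the same bucket layout could be kept in Samza's local key-value store instead of a HashMap.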
