Hi Srikanth, Clive, Today we just added some example code usage in the KIP after feedback from the community. There is code that shows how to access a WindowStore (in read-only mode).
Thanks Eno > On 7 Jul 2016, at 15:57, Srikanth <srikanth...@gmail.com> wrote: > > Eno, > > I was also looking for something similar. To output aggregate value once > the window is "complete". > I'm not sure getting individual update for an aggregate operator is that > useful. > > With KIP-67, will we have access to Windowed[key]( key + timestamp) and > value? > Does until() clear this store when time passes? > > Srikanth > > On Thu, Jun 30, 2016 at 4:27 AM, Clive Cox <clivej...@yahoo.co.uk.invalid> > wrote: > >> Hi Eno, >> I've looked at KIP-67. It looks good but its not clear what calls I would >> make to do what I presently need: Get access to each windowed store at some >> time soon after window end time. I can then use the methods specified to >> iterate over keys and values. Can you point me to the relevant >> method/technique for this? >> >> Thanks, >> Clive >> >> >> On Tuesday, 28 June 2016, 12:47, Eno Thereska <eno.there...@gmail.com> >> wrote: >> >> >> Hi Clive, >> >> As promised, here is the link to the KIP that just went out today. >> Feedback welcome: >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-67%3A+Queryable+state+for+Kafka+Streams >> < >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-67:+Queryable+state+for+Kafka+Streams >>> >> >> Thanks >> Eno >> >>> On 27 Jun 2016, at 20:56, Eno Thereska <eno.there...@gmail.com> wrote: >>> >>> Hi Clive, >>> >>> We are working on exposing the state store behind a KTable as part of >> allowing for queries to the structures currently hidden behind the language >> (DSL). The KIP should be out today or tomorrow for you to have a look. You >> can probably do what you need using the low-level processor API but then >> you'd lose the benefits of the DSL and would have to maintain your own >> structures. >>> >>> Thanks, >>> Eno >>> >>>> On 26 Jun 2016, at 18:42, Clive Cox <clivej...@yahoo.co.uk.INVALID> >> wrote: >>>> >>>> Following on from this thread, if I want to iterate over a KTable at >> the end of its hopping/tumbling Time Window how can I do this at present >> myself? Is there a way to access these structures? >>>> If this is not possible it would seem I need to duplicate and manage >> something similar to a list of windowed KTables myself which is not really >> ideal. >>>> Thanks for any help, >>>> Clive >>>> >>>> >>>> On Monday, 13 June 2016, 16:03, Eno Thereska <eno.there...@gmail.com> >> wrote: >>>> >>>> >>>> Hi Clive, >>>> >>>> For now this optimisation is not present. We're working on it as part >> of KIP-63. One manual work-around might be to use a simple Key-value store >> to deduplicate the final output before sending to the backend. It could >> have a simple policy like "output all values at 1 second intervals" or >> "output after 10 records have been received". >>>> >>>> Eno >>>> >>>> >>>>> On 13 Jun 2016, at 13:36, Clive Cox <clivej...@yahoo.co.uk.INVALID> >> wrote: >>>>> >>>>> >>>>> Thanks Eno for your comments and references. >>>>> Perhaps, I can explain what I want to achieve and maybe you can >> suggest the correct topology? >>>>> I want process a stream of events and do aggregation and send to an >> analytics backend (Influxdb), so that rather than sending 1000 points/sec >> to the analytics backend, I send a much lower value. I'm only interested in >> using the processing time of the event so in that respect there are no >> "late arriving" events.I was hoping I could use a Tumbling window which >> when its end-time had been passed I can send the consolidated aggregation >> for that window and then throw the Window away. >>>>> >>>>> It sounds like from the references you give that this is not possible >> at present in Kafka Streams? >>>>> >>>>> Thanks, >>>>> Clive >>>>> >>>>> On Monday, 13 June 2016, 11:32, Eno Thereska < >> eno.there...@gmail.com> wrote: >>>>> >>>>> >>>>> Hi Clive, >>>>> >>>>> The behaviour you are seeing is indeed correct (though not necessarily >> optimal in terms of performance as described in this JIRA: >> https://issues.apache.org/jira/browse/KAFKA-3101 < >> https://issues.apache.org/jira/browse/KAFKA-3101>) >>>>> >>>>> The key observation is that windows never close/complete. There could >> always be late arriving events that appear long after a window's end >> interval and those need to be accounted for properly. In Kafka Streams that >> means that such late arriving events continue to update the value of the >> window. As described in the above JIRA, some optimisations could still be >> possible (e.g., batch requests as described in KIP-63 < >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-63:+Unify+store+and+downstream+caching+in+streams>), >> however they are not implemented yet. >>>>> >>>>> So your code needs to handle each update. >>>>> >>>>> Thanks >>>>> Eno >>>>> >>>>> >>>>> >>>>>> On 13 Jun 2016, at 11:13, Clive Cox <clivej...@yahoo.co.uk.INVALID> >> wrote: >>>>>> >>>>>> Hi, >>>>>> I would like to process a stream with a tumbling window of 5secs, >> create aggregated stats for keys and push the final aggregates at the end >> of each window period to a analytics backend. I have tried doing something >> like: >>>>>> stream >>>>>> .map >>>>>> .reduceByKey(... >>>>>> , TimeWindows.of("mywindow", 5000L),...) >>>>>> .foreach { send stats >>>>>> } >>>>>> But I get every update to the ktable in the foreach. >>>>>> How do I just get the final values once the TumblingWindow is >> complete so I can iterate over them and send to some external system? >>>>>> Thanks, >>>>>> Clive >>>>>> PS Using kafka_2.10-0.10.0.0 >>>>>> >>>>> >>>>> >>>> >>>> >>> >> >> >> >>