Thanks.

This would be useful in place where we use a key-value store just to
duplicate a KTable for get() operations.
Any rough idea when this is targeted for release?

Its still not clear how to use this for the case this thread was started
for.
Does Kafka Stream keep windows alive forever?
At some point we need to "complete" a window rt? Either based on processing
time or event time + watermark, etc.
How can we tie internal state store query with window completion? i.e, get
the final value.

Srikanth

On Thu, Jul 7, 2016 at 2:05 PM, Eno Thereska <eno.there...@gmail.com> wrote:

> Hi Srikanth, Clive,
>
> Today we just added some example code usage in the KIP after feedback from
> the community. There is code that shows how to access a WindowStore (in
> read-only mode).
>
> Thanks
> Eno
>
>
> > On 7 Jul 2016, at 15:57, Srikanth <srikanth...@gmail.com> wrote:
> >
> > Eno,
> >
> > I was also looking for something similar. To output aggregate value once
> > the window is "complete".
> > I'm not sure getting individual update for an aggregate operator is that
> > useful.
> >
> > With KIP-67, will we have access to Windowed[key]( key + timestamp) and
> > value?
> > Does until() clear this store when time passes?
> >
> > Srikanth
> >
> > On Thu, Jun 30, 2016 at 4:27 AM, Clive Cox <clivej...@yahoo.co.uk.invalid
> >
> > wrote:
> >
> >> Hi Eno,
> >> I've looked at KIP-67. It looks good but its not clear what calls I
> would
> >> make to do what I presently need: Get access to each windowed store at
> some
> >> time soon after window end time. I can then use the methods specified to
> >> iterate over keys and values. Can you point me to the relevant
> >> method/technique for this?
> >>
> >> Thanks,
> >> Clive
> >>
> >>
> >>    On Tuesday, 28 June 2016, 12:47, Eno Thereska <
> eno.there...@gmail.com>
> >> wrote:
> >>
> >>
> >> Hi Clive,
> >>
> >> As promised, here is the link to the KIP that just went out today.
> >> Feedback welcome:
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-67%3A+Queryable+state+for+Kafka+Streams
> >> <
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-67:+Queryable+state+for+Kafka+Streams
> >>>
> >>
> >> Thanks
> >> Eno
> >>
> >>> On 27 Jun 2016, at 20:56, Eno Thereska <eno.there...@gmail.com> wrote:
> >>>
> >>> Hi Clive,
> >>>
> >>> We are working on exposing the state store behind a KTable as part of
> >> allowing for queries to the structures currently hidden behind the
> language
> >> (DSL). The KIP should be out today or tomorrow for you to have a look.
> You
> >> can probably do what you need using the low-level processor API but then
> >> you'd lose the benefits of the DSL and would have to maintain your own
> >> structures.
> >>>
> >>> Thanks,
> >>> Eno
> >>>
> >>>> On 26 Jun 2016, at 18:42, Clive Cox <clivej...@yahoo.co.uk.INVALID>
> >> wrote:
> >>>>
> >>>> Following on from this thread, if I want to iterate over a KTable at
> >> the end of its hopping/tumbling Time Window how can I do this at present
> >> myself? Is there a way to access these structures?
> >>>> If this is not possible it would seem I need to duplicate and manage
> >> something similar to a list of windowed KTables myself which is not
> really
> >> ideal.
> >>>> Thanks for any help,
> >>>> Clive
> >>>>
> >>>>
> >>>> On Monday, 13 June 2016, 16:03, Eno Thereska <eno.there...@gmail.com>
> >> wrote:
> >>>>
> >>>>
> >>>> Hi Clive,
> >>>>
> >>>> For now this optimisation is not present. We're working on it as part
> >> of KIP-63. One manual work-around might be to use a simple Key-value
> store
> >> to deduplicate the final output before sending to the backend. It could
> >> have a simple policy like "output all values at 1 second intervals" or
> >> "output after 10 records have been received".
> >>>>
> >>>> Eno
> >>>>
> >>>>
> >>>>> On 13 Jun 2016, at 13:36, Clive Cox <clivej...@yahoo.co.uk.INVALID>
> >> wrote:
> >>>>>
> >>>>>
> >>>>> Thanks Eno for your comments and references.
> >>>>> Perhaps, I can explain what I want to achieve and maybe you can
> >> suggest the correct topology?
> >>>>> I want process a stream of events and do aggregation and send to an
> >> analytics backend (Influxdb), so that rather than sending 1000
> points/sec
> >> to the analytics backend, I send a much lower value. I'm only
> interested in
> >> using the processing time of the event so in that respect there are no
> >> "late arriving" events.I was hoping I could use a Tumbling window which
> >> when its end-time had been passed I can send the consolidated
> aggregation
> >> for that window and then throw the Window away.
> >>>>>
> >>>>> It sounds like from the references you give that this is not possible
> >> at present in Kafka Streams?
> >>>>>
> >>>>> Thanks,
> >>>>> Clive
> >>>>>
> >>>>>   On Monday, 13 June 2016, 11:32, Eno Thereska <
> >> eno.there...@gmail.com> wrote:
> >>>>>
> >>>>>
> >>>>> Hi Clive,
> >>>>>
> >>>>> The behaviour you are seeing is indeed correct (though not
> necessarily
> >> optimal in terms of performance as described in this JIRA:
> >> https://issues.apache.org/jira/browse/KAFKA-3101 <
> >> https://issues.apache.org/jira/browse/KAFKA-3101>)
> >>>>>
> >>>>> The key observation is that windows never close/complete. There could
> >> always be late arriving events that appear long after a window's end
> >> interval and those need to be accounted for properly. In Kafka Streams
> that
> >> means that such late arriving events continue to update the value of the
> >> window. As described in the above JIRA, some optimisations could still
> be
> >> possible (e.g., batch requests as described in KIP-63 <
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-63:+Unify+store+and+downstream+caching+in+streams
> >),
> >> however they are not implemented yet.
> >>>>>
> >>>>> So your code needs to handle each update.
> >>>>>
> >>>>> Thanks
> >>>>> Eno
> >>>>>
> >>>>>
> >>>>>
> >>>>>> On 13 Jun 2016, at 11:13, Clive Cox <clivej...@yahoo.co.uk.INVALID>
> >> wrote:
> >>>>>>
> >>>>>> Hi,
> >>>>>> I would like to process a stream with a tumbling window of 5secs,
> >> create aggregated stats for keys and push the final aggregates at the
> end
> >> of each window period to a analytics backend. I have tried doing
> something
> >> like:
> >>>>>> stream
> >>>>>>       .map
> >>>>>>       .reduceByKey(...
> >>>>>>         , TimeWindows.of("mywindow", 5000L),...)
> >>>>>>       .foreach        {            send stats
> >>>>>>         }
> >>>>>> But I get every update to the ktable in the foreach.
> >>>>>> How do I just get the final values once the TumblingWindow is
> >> complete so I can iterate over them and send to some external system?
> >>>>>> Thanks,
> >>>>>> Clive
> >>>>>> PS Using kafka_2.10-0.10.0.0
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >>
> >>
> >>
>
>

Reply via email to