Thanks for the note Michael, I think it makes sense. And thinking about it more, I also agree that even for single-key fetch on ReadOnlyWindow/SessionStore it's better to return the Iterator<Windowed<K>, V> even when the windowed key's key field would always be the same, as it is more consistent with the added APIs. But we need to do it in a backward compatible way, which we can discuss in a follow-up KIP.
Guozhang On Thu, May 11, 2017 at 10:38 AM, Xavier Léauté <xav...@confluent.io> wrote: > Thanks Michal, you are correct. I can see your point now, and I can get > behind returning Windowed<K> as well for windowed stores. > > It might make sense to revisit the single key iterator in the future and do > the same for consistency, but I'd rather not break backwards compatibility > unless we have a good reason to do so. > > Everyone else who already voted on the KIP, are there any objections to > this change? I have updated the KIP accordingly. > > Thank you, > Xavier > > On Thu, May 11, 2017 at 12:44 AM Michal Borowiecki < > michal.borowie...@openbet.com> wrote: > > > Also, wrt > > > > > In the case of the window store, the "key" of the single-key iterator > is > > > the actual timestamp of the underlying entry, not just range of the > > > window, > > > so if we were to wrap the result key a window we wouldn't be getting > back > > > the equivalent of the single key iterator. > > I believe the timestamp in the entry *is* the window start time (the end > > time is implicitly known by adding the window size to the window start > > time) > > > > > > https://github.com/apache/kafka/blob/trunk/streams/src/ > main/java/org/apache/kafka/streams/kstream/internals/ > KStreamWindowReduce.java#L109 > > > > > > > > https://github.com/apache/kafka/blob/trunk/streams/src/ > main/java/org/apache/kafka/streams/kstream/internals/ > KStreamWindowAggregate.java#L111 > > > > Both use window.start() as the timestamp when storing into the > WindowStore. > > > > Or am I confusing something horribly here? Hope not ;-) > > > > > > If the above is correct, then using KeyValueIterator<Windowed<K>, V> as > > the return type of the new fetch method would indeed not lose anything > > the single key iterator is offering. > > > > The window end time could simply be calculated as window start time + > > window size (window size would have to be passed from the store supplier > > to the store implementation, which I think it isn't now but that's an > > implementation detail). > > > > If you take objection to exposing the window end time because the single > > key iterator doesn't do that, then an alternative could also be to have > > the return type of the new fetch be something like > > KeyValueItarator<Tuple2<K, Long>, V>, since the key is composed of the > > actual key and the timestamp together. peakNextKey() would then allow > > you to glimpse both the actual key and the associated window start time. > > This feels like a better workaround then putting the KeyValue pair in > > the V of the WindowStoreIterator<V>. > > > > All-in-all, I'd still prefer KeyValueIterator<Windowed<K>, V> as it more > > clearly names what's what. > > > > What do you think? > > > > Thanks, > > > > Michal > > > > On 11/05/17 07:51, Michal Borowiecki wrote: > > > Well, another concern, apart from potential confusion, is that you > > > won't be able to peek the actual next key, just the timestamp. So the > > > tradeoff is between having consistency in return types versus > > > consistency in having the ability to know the next key without moving > > > the iterator. To me the latter just feels more important. > > > > > > Thanks, > > > Michal > > > On 11 May 2017 12:46 a.m., Xavier Léauté <xav...@confluent.io> wrote: > > > > > > Thank you for the feedback Michal. > > > > > > While I agree the return may be a little bit more confusing to > reason > > > about, the reason for doing so was to keep the range query > interfaces > > > consistent with their single-key counterparts. > > > > > > In the case of the window store, the "key" of the single-key > > > iterator is > > > the actual timestamp of the underlying entry, not just range of > > > the window, > > > so if we were to wrap the result key a window we wouldn't be > > > getting back > > > the equivalent of the single key iterator. > > > > > > In both cases peekNextKey is just returning the timestamp of the > > > next entry > > > in the window store that matches the query. > > > > > > In the case of the session store, we already return Windowed<K> > > > for the > > > single-key method, so it made sense there to also return > > > Windowed<K> for > > > the range method. > > > > > > Hope this make sense? Let me know if you still have concerns about > > > this. > > > > > > Thank you, > > > Xavier > > > > > > On Wed, May 10, 2017 at 12:25 PM Michal Borowiecki < > > > michal.borowie...@openbet.com> wrote: > > > > > > > Apologies, I missed the discussion (or lack thereof) about the > > > return > > > > type of: > > > > > > > > WindowStoreIterator<KeyValue<K, V>> fetch(K from, K to, long > > > timeFrom, > > > > long timeTo) > > > > > > > > > > > > WindowStoreIterator<V> (as the KIP mentions) is a subclass of > > > > KeyValueIterator<Long, V> > > > > > > > > KeyValueIterator<K,V> has the following method: > > > > > > > > /** * Peek at the next key without advancing the iterator * > > > @return the > > > > key of the next value that would be returned from the next call > > > to next > > > > */ K peekNextKey(); > > > > > > > > Given the type in this case will be Long, I assume what it would > > > return > > > > is the window timestamp of the next found record? > > > > > > > > > > > > In the case of WindowStoreIterator<V> fetch(K key, long > > > timeFrom, long > > > > timeTo); > > > > all records found by fetch have the same key, so it's harmless > > > to return > > > > the timestamp of the next found window but here we have varying > > > keys and > > > > varying windows, so won't it be too confusing? > > > > > > > > KeyValueIterator<Windowed<K>, V> (as in the proposed > > > > ReadOnlySessionStore.fetch) just feels much more intuitive. > > > > > > > > Apologies again for jumping onto this only once the voting has > > > already > > > > begun. > > > > Thanks, > > > > Michał > > > > > > > > On 10/05/17 20:08, Sriram Subramanian wrote: > > > > > +1 > > > > > > > > > > On Wed, May 10, 2017 at 11:42 AM, Bill Bejeck > > > <bbej...@gmail.com> wrote: > > > > > > > > > >> +1 > > > > >> > > > > >> Thanks, > > > > >> Bill > > > > >> > > > > >> On Wed, May 10, 2017 at 2:38 PM, Guozhang Wang > > > <wangg...@gmail.com> > > > > wrote: > > > > >> > > > > >>> +1. Thank you! > > > > >>> > > > > >>> On Wed, May 10, 2017 at 11:30 AM, Xavier Léauté > > > <xav...@confluent.io> > > > > >>> wrote: > > > > >>> > > > > >>>> Hi everyone, > > > > >>>> > > > > >>>> Since there aren't any objections to this addition, I would > > > like to > > > > >> start > > > > >>>> the voting on KIP-155 so we can hopefully get this into > 0.11. > > > > >>>> > > > > >>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP+ > > > > >>>> 155+-+Add+range+scan+for+windowed+state+stores > > > > >>>> > > > > >>>> Voting will stay active for at least 72 hours. > > > > >>>> > > > > >>>> Thank you, > > > > >>>> Xavier > > > > >>>> > > > > >>> > > > > >>> > > > > >>> -- > > > > >>> -- Guozhang > > > > >>> > > > > > > > > > > > > > > > > -- -- Guozhang