Is this KIP close to completion? Because we could start working on the code
itself now. (Its at about this stage).

On Mon, Oct 16, 2017 at 7:37 PM, Richard Yu <yohan.richard...@gmail.com>
wrote:

> As Guozhang Wang mentioned earlier, we want to mirror the structure of
> similar Store class (namely KTable). The WindowedStore class might be
> unique in itself as it uses fetch() methods, but in my opinion, uniformity
> should be better suited for simplicity.
>
> On Mon, Oct 16, 2017 at 11:54 AM, Xavier Léauté <xav...@confluent.io>
> wrote:
>
>> Thank you Richard! Do you or Guozhang have any thoughts on my suggestions
>> to use fetchAll() and fetchAll(timeFrom, timeTo) and reserve the "range"
>> keyword for when we query a specific range of keys?
>>
>> Xavier
>>
>> On Sat, Oct 14, 2017 at 2:32 PM Richard Yu <yohan.richard...@gmail.com>
>> wrote:
>>
>> > Thanks for the clarifications, Xavier.
>> > I have removed most of the methods except for keys() and all() which has
>> > been renamed to Guozhang Wang's suggestions.
>> >
>> > Hope this helps.
>> >
>> > On Fri, Oct 13, 2017 at 3:28 PM, Xavier Léauté <xav...@confluent.io>
>> > wrote:
>> >
>> > > Thanks for the KIP Richard, this is a very useful addition!
>> > >
>> > > As far as the API changes, I just have a few comments on the methods
>> that
>> > > don't seem directly related to the KIP title, and naming of course :).
>> > > On the implementation, see my notes further down that will hopefully
>> > > clarify a few things.
>> > >
>> > > Regarding the "bonus" methods:
>> > > I agree with Guozhang that the KIP lacks proper motivation for adding
>> the
>> > > min, max, and allLatest methods.
>> > > It is also not clear to me what min and max would really mean, what
>> > > ordering do we refer to here? Are we first ordering by time, then
>> key, or
>> > > first by key, then time?
>> > > The allLatest method might be useful, but I don't really see how it
>> would
>> > > be used in practice if we have to scan the entire range of keys for
>> all
>> > the
>> > > state stores, every single time.
>> > >
>> > > Maybe we could flesh the motivation behind those extra methods, but in
>> > the
>> > > interest of time, and moving the KIP forward it might make sense to
>> file
>> > a
>> > > follow-up once we have more concrete use-cases.
>> > >
>> > > On naming:
>> > > I also agree with Guozhang that "keys()" should be renamed. It feels a
>> > bit
>> > > of a misnomer, since it not only returns keys, but also the values.
>> > >
>> > > As far as what to rename it to, I would argue we already have some
>> > > discrepancy between key-value stores using range() vs. window stores
>> > using
>> > > fetch().
>> > > I assume we called the window method "fetch" instead of "get" because
>> you
>> > > might get back more than one window for the requested key.
>> > >
>> > > If we wanted to make things consistent with both existing key-value
>> store
>> > > naming and window store naming, we could do the following:
>> > > Decide that "all" always refers to the entire range of keys,
>> independent
>> > of
>> > > the window and similarly "range" always refers to a particular range
>> of
>> > > keys, irrespective of the window.
>> > > We can then prefix methods with "fetch" to indicate that more than one
>> > > window may be returned for each key in the range.
>> > >
>> > > This would give us:
>> > > - a new fetchAll() method for all the keys, which makes it clear that
>> you
>> > > might get back the same key in different windows
>> > > - a new fetchAll(timeFrom, timeTo) method to get all the keys in a
>> given
>> > > time range, again with possibly more than one window per key
>> > > - and we'd have to rename fetch(K,K,long, long) to fetchRange(K, K,
>> long,
>> > > long)  and deprecate the old one to indicate a range of keys
>> > >
>> > > One inconsistency I noted: the "Proposed Changes" section in your KIP
>> > talks
>> > > about a "range(timeFrom, timeTo)" method, I think you meant to refer
>> to
>> > the
>> > > all(from, to) method, but I'm sure you'll fix that once we decide on
>> > > naming.
>> > >
>> > > On the implementation side:
>> > > You mentioned that caching and rocksdb store have very different
>> > key/value
>> > > structures, and while it appears to be that way on the surface, the
>> > > structure between the two is actually very similar. Keys in the cache
>> are
>> > > prefixed with a segment ID to ensure the ordering in the cache stays
>> > > consistent with the rocksdb implementation, which maintains multiple
>> > > rocksdb instances, one for each segment. So we just "artificially"
>> mirror
>> > > the segment structure in the cache.
>> > >
>> > > The reason for keeping the ordering consistent is pretty simple: keep
>> in
>> > > mind that when we query a cached window store we are effectively
>> querying
>> > > both the cache and the persistent rocksdb store at the same time,
>> merging
>> > > results from both. To make that merge as painless as possible, we
>> ensure
>> > > the ordering is consistent when querying a range of keys in both
>> stores.
>> > >
>> > > Also keep in mind CompositeReadonlyWindowStore, which wraps multiple
>> > window
>> > > stores within a topology.
>> > >
>> > > Hope this clarifies some of the less trivial parts of caching window
>> > store.
>> > >
>> > > Cheers,
>> > > Xavier
>> > >
>> > > On Sun, Oct 8, 2017 at 9:21 PM Guozhang Wang <wangg...@gmail.com>
>> wrote:
>> > >
>> > > > Richard, Matthias:
>> > > >
>> > > > 0. Could you describe a bit what are the possible use cases of
>> > > `allLatest`,
>> > > > `minKey` and `maxKey`? I'd prefer keeping the APIs to add at a
>> minimum
>> > > > necessary amount, to avoid a swamp of new APIs that no one would
>> really
>> > > use
>> > > > but just complicated the internal code base.
>> > > >
>> > > > 1. One minor comment on the other two new APIs: could we rename
>> `keys`
>> > to
>> > > > `all` and `all` to `range` to be consistent with the other store's
>> > APIs?
>> > > >
>> > > > 2. One meta comment on the implementation details: since both `keys`
>> > and
>> > > > `all` would likely touch multiple segments, we may need to use the
>> > > internal
>> > > > `SegmentIterator` class, but currently it always requires a Bytes
>> from
>> > > and
>> > > > Bytes to for its constructor. What changes we need to make for that
>> > > class?
>> > > >
>> > > >
>> > > > Guozhang
>> > > >
>> > > >
>> > > > On Thu, Oct 5, 2017 at 4:42 PM, Richard Yu <
>> yohan.richard...@gmail.com
>> > >
>> > > > wrote:
>> > > >
>> > > > > We should split KAFKA-4499 into several sub-issues with 4499 being
>> > the
>> > > > > parent issue.
>> > > > > Adding the implementation to CachingWindowStore,
>> RocksDBWindowStore,
>> > > etc
>> > > > > will each require the addition of a test and implementing the
>> methods
>> > > > which
>> > > > > is not trivial.
>> > > > > This way, it should be easier to manage the progress of the KIP.
>> > > > >
>> > > > >
>> > > > > On Thu, Oct 5, 2017 at 2:58 PM, Matthias J. Sax <
>> > matth...@confluent.io
>> > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Thanks for driving this and sorry for late response. With
>> release
>> > > > > > deadline it was pretty busy lately.
>> > > > > >
>> > > > > > Can you please add a description for the suggested method, what
>> > they
>> > > > are
>> > > > > > going to return? It's a little unclear to me atm.
>> > > > > >
>> > > > > > It would also be helpful to discuss, for which use case each
>> method
>> > > is
>> > > > > > useful. This might also help to identify potential gaps for
>> which
>> > > > > > another API might be more helpful.
>> > > > > >
>> > > > > > Also, we should talk about provided guarantees when using those
>> > APIs
>> > > > > > with regard to consistency -- not saying that we need to provide
>> > > strong
>> > > > > > guarantees, but he KIP should describe what user can expect.
>> > > > > >
>> > > > > >
>> > > > > > -Matthias
>> > > > > >
>> > > > > > On 9/24/17 8:11 PM, Richard Yu wrote:
>> > > > > > > Hello, I would like to solicit review and comment on this
>> issue
>> > > (link
>> > > > > > > below):
>> > > > > > >
>> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> > > > > > 205%3A+Add+getAllKeys%28%29+API+to+ReadOnlyWindowStore
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > >
>> > > > --
>> > > > -- Guozhang
>> > > >
>> > >
>> >
>>
>
>

Reply via email to