Thanks Victoria. I guess an advantage of exposing a method like delete(key, timestamp) could be that, from a user's standpoint, it is a single operation and not two. The equivalent of this method, i.e., a put followed by a get, is not atomic, so exposing it certainly sounds like a good idea.
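Just to make sure I am picturing the same thing, here is a rough sketch of the store methods as I currently understand them from this thread. This is only my reading of the discussion, not anything final from the KIP -- names, return types, and the exact delete() behavior are all still open:

import org.apache.kafka.streams.state.ValueAndTimestamp;

public interface VersionedKeyValueStore<K, V> {

    // Adds a new record version for the key; the given (event-time) timestamp
    // becomes the new version's validFrom.
    void put(K key, V value, long timestamp);

    // Syntactic sugar for get(key, Long.MAX_VALUE), i.e., the latest version.
    ValueAndTimestamp<V> get(K key);

    // Returns the record version that was valid as of the given timestamp
    // (inclusive "AS OF" bound), or null if there is no such version or it has
    // fallen out of history retention.
    ValueAndTimestamp<V> get(K key, long timestampTo);

    // The syntactic sugar under discussion: same effect on the store as
    // put(key, null, timestamp), except it additionally returns the record
    // version that the tombstone replaces (if any), as a single operation from
    // the caller's point of view.
    ValueAndTimestamp<V> delete(K key, long timestamp);
}

If I have misread any of the semantics above, please correct me.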
Thanks! Sagar. On Tue, Nov 29, 2022 at 1:15 AM Victoria Xia <victoria....@confluent.io.invalid> wrote: > Thanks, Sagar and Bruno, for your insights and comments! > > > Sagar: Can we name according to the semantics that you want to > support like `getAsOf` or something like that? I am not sure if we do that > in our codebase though. Maybe the experts can chime in. > > Because it is a new method that will be added, we should be able to name it > whatever we like. I agree `getAsOf` is more clear, albeit wordier. > Introducing `getAsOf(key, timestamp)` means we could leave open `get(key, > timeFrom, timeTo)` to have an exclusive `timeTo` without introducing a > collision. (We could introduce `getBetween(key, timeFrom, timeTo)` instead > to delineate even more clearly, though this is better left for a future > KIP.) > > I don't think there's any existing precedent in codebase to follow here but > I'll leave that to the experts. Curious to hear what others prefer as well. > > > Sagar: With delete, we would stlll keep the older versions of the key > right? > > We could certainly choose this for the semantics of delete(...) -- and it > sounds like we should too, based on Bruno's confirmation below that this > feels more natural to him as well -- but as Bruno noted in his message > below I think we'll want the method signature to be `delete(key, > timestamp)` then, so that there is an explicit timestamp to associate with > the deletion. In other words, `delete(key, timestamp)` has the same effect > as `put(key, null, timestamp)`. The only difference is that the `put(...)` > method has a `void` return type, while `delete(key, timestamp)` can have > `ValueAndTimestamp` as return type in order to return the record which is > replaced (if any). In other words, `delete(key, timestamp)` is equivalent > to `put(key, null, timestamp)` followed by `get(key, timestamp)`. > > > Bruno: I would also not change the semantics so that it deletes all > versions of > a key. I would rather add a new method purge(key) or > deleteAllVersions(key) or similar if we want to have such a method in > this first KIP. > > Makes sense; I'm convinced. Let's defer > `purge(key)`/`deleteAllVersions(key)` to a future KIP. If there's agreement > that `delete(key, timestamp)` (as described above) is valuable, we can keep > it in this first KIP even though it is syntactic sugar. If this turns into > a larger discussion, we can defer this to a future KIP as well. > > > Bruno: I would treat the history retention as a strict limit. [...] You > could also add historyRetentionMs() to the VersionedKeyValueStore<K, V> > interface to make the concept of the history retention part of the > interface. > > OK. That's the second vote for rewording the javadoc for > `VersionedKeyValueStore#get(key, timestampTo)` to remove the parenthetical > and clarify that history retention should be used to dictate this case, so > I'll go ahead and do that. I'll leave out adding `historyRetentionMs()` to > the interface for now, though, for the sake of consistency with other > stores (e.g., window stores) which don't expose similar types of > configurations from their interfaces. > > > Bruno: exclusive vs inclusive regarding validTo timestamp in get(). > Doesn't this decision depend on the semantics of the join for which this > state store should be used? > > Yes, you are correct. 
As a user I would expect that a stream-side record > with the same timestamp as a table-side record _would_ produce a join > result, which is consistent with the proposal for timestampTo to be > inclusive. (FWIW I tried this out with a Flink temporal join just now and > observed this result as well. Not sure where to look for other standards to > validate this expectation.) > > > Bruno: If Streams does not update min.compaction.lag.ms during > rebalances, > users have to do it each time they change history retention in the code, > right? That seems odd to me. What is the actual reason for not updating > the config? How does Streams handle updates to windowed stores? > > Yes, users will have to update min.compaction.lag.ms for the changelog > topic themselves if they update history retention in their code. This is > consistent with what happens for window stores today: e.g., if a user > updates grace period for a windowed aggregation, then they are responsible > for updating retention.ms on their windowed changelog topic as well. > > I'm not familiar with the historical context around why this is the case -- > Matthias, do you know? > > My best guess is that Streams does not want to interfere with any potential > out-of-band changes by the user between application restarts, though I'm > not sure why a user would want to change this specific config to a value > which does not accord with the specified history retention. I notice that > there is code for validating topic configs and collecting validation errors > ( > > https://github.com/apache/kafka/blob/be032735b39360df1a6de1a7feea8b4336e5bcc0/streams/src/main/java/org/apache/kafka/streams/processor/internals/InternalTopicManager.java#L318-L319 > ) > but this method is not called from anywhere, even though there are unit > tests for it. I was unable to find history of this validation after a quick > search. Hopefully Matthias (or others) has context, otherwise I will have a > closer look. > > - Victoria > > On Wed, Nov 23, 2022 at 8:52 AM Bruno Cadonna <cado...@apache.org> wrote: > > > Hi all, > > > > Thanks for the KIP, Victoria! > > > > I have a couple of comments. > > > > 1. delete(key) > > I think delete(key) should not remove all versions of a key. We should > > use it to close the validity interval of the last version. > > Assuming we have records of different versions for key A: > > (A, e, 0, 2), > > (A, f, 2, 3), > > (A, g, 3, MAX) > > > > delete(A) would update them to > > > > (A, e, 0, 2), > > (A, f, 2, 3), > > (A, g, 3, 5) > > (A, null, 5, MAX) > > > > But then the question arises where does timestamp 5 that closes the > > interval in (A, g, 3, 5) and opens the interval in (A, null, 5, MAX) > > come from. We could use the timestamp at which delete(A) is called, but > > actually I do not like that because it seems to me it opens the doors to > > non-determinism. If we use event time for put() we should also use it > > for delete(). Actually, put(A, null, 5) would have the same effect as > > delete(A) in the example above. As a syntactical sugar, we could add > > delete(key, validFrom). (I just realized now that I just repeated what > > Victoria said in her previous e-mail.) > > I agree with Victoria that delete(A) as defined for other state stores > > is hard to re-use in the versioned key-value store. > > I would also not change the semantics so that it deletes all versions of > > a key. I would rather add a new method purge(key) or > > deleteAllVersions(key) or similar if we want to have such a method in > > this first KIP. 
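--- Quick aside, interjecting on Bruno's example above: here is how I picture that example playing out in code, assuming the delete(key, timestamp) semantics and the inclusive timestampTo we have been discussing, and reusing the interface sketch from the top of this mail. Purely illustrative.

static void brunosExample(VersionedKeyValueStore<String, String> store) {
    // Versions of key "A" from Bruno's example: ("e", 0), ("f", 2), ("g", 3).
    store.put("A", "e", 0L);
    store.put("A", "f", 2L);
    store.put("A", "g", 3L);

    // Same effect as store.put("A", null, 5L): closes the validity of ("g", 3)
    // at timestamp 5, and additionally returns the replaced version ("g", 3).
    ValueAndTimestamp<String> replaced = store.delete("A", 5L);

    store.get("A", 2L);  // ("f", 2) -- older versions remain queryable
    store.get("A", 4L);  // ("g", 3)
    store.get("A", 6L);  // null -- the key is deleted as of timestamp 5
    // (all of the above subject to history retention, of course)
}

--- End of aside; the quoted thread continues below.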
> > > > > > 2. history retention > > I would remove "(up to store implementation discretion when this is the > > case)". I would treat the history retention as a strict limit. If users > > want to implement a less strict behavior, they can still do it. Maybe > > mention in the javadocs the implications of not adhering strictly to the > > history retention. That is, the DSL might become non-deterministic. You > > could also add historyRetentionMs() to the VersionedKeyValueStore<K, V> > > interface to make the concept of the history retention part of the > > interface. > > > > 3. null vs. exception for out-of-bound queries > > I am in favor of null. The record version is not there anymore because > > it expired. This seems to me normal and nothing exceptional. That would > > also consistent with the behavior of other APIs as already mentioned. > > > > > > 4. Exposing segmentInterval > > Since we have evidence that the segment interval affects performance, I > > would expose it. But I find it also OK to expose it once we have a > > corresponding metric. > > > > 5. exclusive vs inclusive regarding validTo timestamp in get() > > Doesn't this decision depend on the semantics of the join for which this > > state store should be used? Should a record on the table side that has > > the same timestamp as the record on the stream side join? Or should only > > records in the table that are strictly before the record on the stream > > side join? > > > > > > 6. Not setting min.compaction.lag.ms during rebalances > > If Streams does not update min.compaction.lag.ms during rebalances, > > users have to do it each time they change history retention in the code, > > right? That seems odd to me. What is the actual reason for not updating > > the config? How does Streams handle updates to windowed stores? That > > should be a similar situation for the retention time config of the > > changelog topic. > > > > > > Best, > > Bruno > > > > > > > > On 23.11.22 09:11, Sagar wrote: > > > Hi Vicky, > > > > > > Thanks for your response! > > > > > > I would just use numbers to refer to your comments. > > > > > > 1) Thanks for your response. Even I am not totally sure whether these > > > should be supported via IQv2 or via store interface. That said, I > > wouldn't > > > definitely qualify this as blocking the KIP for sure so we can live > > > without it :) > > > > > > 2) Yeah if the 2 APIs for get have different semantics for timestampTo, > > > then it could be confusing. I went through the link for temporal tables > > > (TFS!) and I now get why the AS OF semantics would have it inclusive. I > > > think part of the problem is that the name get on it's own is not as > > > expressive as SQL. Can we name according to the semantics that you want > > to > > > support like `getAsOf` or something like that? I am not sure if we do > > that > > > in our codebase though. Maybe the experts can chime in. > > > > > > 3) hmm I would have named it `validUpto` But again not very picky about > > it. > > > After going through the link and your KIP, it's a lot clearer to me. > > > > > > 4) I think delete(key) should be sufficient. With delete, we would > > > stlll keep the older versions of the key right? > > > > > > Thanks! > > > Sagar. > > > > > > On Wed, Nov 23, 2022 at 12:17 AM Victoria Xia > > > <victoria....@confluent.io.invalid> wrote: > > > > > >> Thanks, Matthias and Sagar, for your comments! I've responded here for > > now, > > >> and will update the KIP afterwards with the outcome of our discussions > > as > > >> they resolve. 
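--- One more aside, on the min.compaction.lag.ms discussion (Bruno's point 6 and Victoria's reply above): my understanding is that, today, when history retention changes in the code the user has to bring the existing changelog topic's config in line themselves, roughly along these lines. The application id, store name, and bootstrap server below are made up for illustration; the changelog topic name just follows the usual "<application.id>-<store name>-changelog" pattern.

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;
import org.apache.kafka.common.config.TopicConfig;

static void bumpChangelogCompactionLag(long newHistoryRetentionMs) throws Exception {
    Properties props = new Properties();
    props.put("bootstrap.servers", "localhost:9092"); // placeholder

    try (Admin admin = Admin.create(props)) {
        // Changelog topics are named "<application.id>-<store name>-changelog".
        ConfigResource changelog = new ConfigResource(
            ConfigResource.Type.TOPIC, "my-app-my-versioned-store-changelog");

        // Pin min.compaction.lag.ms to the new history retention, since Streams
        // will not update it on our behalf.
        AlterConfigOp setLag = new AlterConfigOp(
            new ConfigEntry(TopicConfig.MIN_COMPACTION_LAG_MS_CONFIG,
                String.valueOf(newHistoryRetentionMs)),
            AlterConfigOp.OpType.SET);

        admin.incrementalAlterConfigs(
                Collections.singletonMap(changelog, Collections.singletonList(setLag)))
            .all().get();
    }
}

--- End of aside; the quoted thread continues below.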
> > >> > > >> ----------- Matthias's comments ----------- > > >> > > >>> (1) Why does the new store not extend KeyValueStore, but StateStore? > > >> In the end, it's a KeyValueStore? > > >> > > >> A `VersionedKeyValueStore<K, V>` is not a `KeyValueStore<K, V>` > because > > >> many of the KeyValueStore methods would not make sense for a versioned > > >> store. For example, `put(K key, V value)` is not meaningful for a > > versioned > > >> store because the record needs a timestamp associated with it. > > >> > > >> A `VersionedKeyValueStore<K, V>` is more similar to a > `KeyValueStore<K, > > >> ValueAndTimestamp<V>>` (i.e., `TimestampedKeyValueStore<K, V>`), but > > some > > >> of the TimestampedKeyValueStore methods are still problematic. For > > example, > > >> what does it mean for `delete(K key)` to have return type > > >> `ValueAndTimestamp<V>`? Does this mean that `delete(K key)` only > deletes > > >> (and returns) the latest record version for the key? Probably we want > a > > >> versioned store to have `delete(K key)` delete all record versions for > > the > > >> given key, in which case the return type is better suited as an > > >> iterator/collection of KeyValueTimestamp. `putIfAbsent(K key, > > >> ValueAndTimestamp value)` also has ambiguous semantics for versioned > > stores > > >> (i.e., what does it mean for the key/record to be "absent"). > > >> > > >> I agree that conceptually a versioned key-value store is just a > > key-value > > >> store, though. In the future if we redesign the store interfaces, it'd > > be > > >> great to unify them by having a more generic KeyValueStore interface > > that > > >> allows for extra flexibility to support different types of key-value > > >> stores, including versioned stores. (Or, if you can think of a way to > > >> achieve this with the existing interfaces today, I'm all ears!) > > >> > > >>> (2) Should we have a ReadOnlyVersionedKeyValueStore? Even if we don't > > >> want to support IQ in this KIP, it might be good to add this interface > > >> right away to avoid complications for follow up KIPs? Or won't there > by > > >> any complications anyway? > > >> > > >> I don't think there will be complications for refactoring to add this > > >> interface in the future. Refactoring out > ReadOnlyVersionedKeyValueStore > > >> from VersionedKeyValueStore would leave VersionedKeyValueStore > unchanged > > >> from the outside. > > >> > > >> Also, is it true that the ReadOnlyKeyValueStore interface is only used > > for > > >> IQv1 and not IQv2? I think it's an open question as to whether we > should > > >> support IQv1 for versioned stores or only IQv2. If the latter, then > > maybe > > >> we won't need the extra interface at all. > > >> > > >>> (3) Why do we not have a `delete(key)` method? I am ok with not > > >> supporting all methods from existing KV-store, but a `delete(key)` > seems > > >> to be fundamentally to have? > > >> > > >> What do you think the semantics of `delete(key)` should be for > versioned > > >> stores? Should `delete(key)` delete (and return) all record versions > for > > >> the key? Or should we have `delete(key, timestamp)` which is > equivalent > > to > > >> `put(key, null, timestamp)` except with a return type to return > > >> ValueAndTimestamp representing the record it replaced? > > >> > > >> If we have ready alignment on what the interface and semantics for > > >> `delete(key)` should be, then adding it in this KIP sounds good. 
I > just > > >> didn't want the rest of the KIP to be hung up over additional > > interfaces, > > >> given that we can always add extra interfaces in the future. > > >> > > >>> (4a) Do we need `get(key)`? It seems to be the same as `get(key, > > >> MAX_VALUE)`? Maybe is good to have as syntactic sugar though? Just for > > >> my own clarification (should we add something to the JavaDocs?). > > >> > > >> Correct, it is just syntactic sugar. I will add a clarification into > the > > >> Javadocs as you've suggested. > > >> > > >>> (4b) Should we throw an exception if a user queries out-of-bound > > >> instead of returning `null` (in `get(key,ts)`)? > > >> -> You put it into "rejected alternatives", and I understand your > > >> argument. Would love to get input from others about this question > > >> though. -- It seems we also return `null` for windowed stores, so > maybe > > >> the strongest argument is to align to existing behavior? Or do we have > > >> case for which the current behavior is problematic? > > >> > > >> Sure; curious to hear what others think as well. > > >> > > >>> (4c) JavaDoc on `get(key,ts)` says: "(up to store implementation > > >> discretion when this is the case)" -> Should we make it a stricter > > >> contract such that the user can reason about it better (there is WIP > to > > >> make retention time a strict bound for windowed stores atm) > > >> -> JavaDocs on `persistentVersionedKeyValueStore` seems to > suggest a > > >> strict bound, too. > > >> > > >> Ah, great question. I think the question boils down to: do we want to > > >> require that all versioned stores (including custom user > > implementations) > > >> use "history retention" to determine when to expire old record > versions? > > >> > > >> Because the `persistentVersionedKeyValueStore(...)` method returns > > >> instances of the provided RocksDB-based versioned store > implementation, > > >> which does use history retention for this purpose, that's why we can > > very > > >> clearly say that for this store, `get(key, ts)` will return null if > the > > >> provided timestamp bound has fallen out of history retention. The > > reason I > > >> left the `VersionedKeyValueStore#get(key, ts)` Javadoc more generic > > (i.e., > > >> does not mention history retention) is because maybe a user > implementing > > >> their own custom store will choose a different expiry mechanism, e.g., > > keep > > >> the three latest versions for each key regardless of how old the > > timestamps > > >> are. > > >> > > >> If we want to require that all versioned stores use history retention > in > > >> order to determine when to expire old records, then I will certainly > > update > > >> the Javadoc to clarify. This is already a requirement for DSL users > > because > > >> the VersionedBytesStoreSupplier interface requires history retention > to > > be > > >> provided (in order for changelog topic configs to be properly set), so > > it's > > >> just a question of whether we also want to require PAPI users to use > > >> history retention too. I had a look at the existing window stores and > > >> didn't see precedent for requiring all window stores have a standard > > >> "retention time" concept for how long to keep windows, but if we want > to > > >> have a standard "history retention" concept for versioned stores we > > >> certainly can. WDYT? > > >> > > >>> (5a) Do we need to expose `segmentInterval`? 
For windowed-stores, we > > >> also use segments but hard-code it to two (it was exposed in earlier > > >> versions but it seems not useful, even if we would be open to expose > it > > >> again if there is user demand). > > >> > > >> If we want to leave it out of this first KIP (and potentially expose > it > > in > > >> the future), that works for me. The performance benchmarks I ran > suggest > > >> that this parameter greatly impacts store performance though and is > very > > >> workload dependent. If a user reported poor performance using > versioned > > >> stores for their workload, this is the first parameter I would want to > > >> tune. That said, metrics/observability for versioned stores (which > > would be > > >> helpful for determining how this parameter should be adjusted) have > been > > >> deferred to a follow-up KIP, so perhaps that's reason to defer > exposing > > >> this parameter as well. > > >> > > >>> (5b) JavaDocs says: "Performance degrades as more record versions for > > >> the same key are collected in a single segment. On the other hand, > > >> out-of-order writes and reads which access older segments may slow > down > > >> if there are too many segments." -- Wondering if JavaDocs should make > > >> any statements about expected performance? Seems to be an > implementation > > >> detail? > > >> > > >> I included this sentence to explain why a user might want to tune this > > >> value / help guide how to think about the parameter, but if we want to > > >> remove it entirely (per the discussion point above) then this Javadoc > > will > > >> be removed with it. > > >> > > >>> (6) validTo timestamp is "exclusive", right? Ie, if I query > > >> `get(key,ts[=validToV1])` I would get `null` or the "next" record v2 > > >> with validFromV2=ts? > > >> > > >> I actually intended for it to be inclusive (will update the KIP). Do > you > > >> think exclusive is more intuitive? The reason I had inclusive in my > > mind is > > >> because it's like a "AS OF <time>" query, which treats the time bound > as > > >> inclusive. > > >> > > >>> (7) The KIP says, that segments are stores in the same RocksDB -- for > > >> this case, how are efficient deletes handled? For windowed-store, we > can > > >> just delete a full RocksDB. > > >> > > >> The way that multiple segments are represented in the same RocksDB is > > that > > >> the RocksDB keys are prefixed with segment ID. An entire segment is > > deleted > > >> with a single `deleteRange()` call to RocksDB. > > >> > > >>> (8) Rejected alternatives: you propose to not return the validTo > > >> timestamp -- if we find it useful in the future to return it, would > > >> there be a clean path to change it accordingly? > > >> > > >> With the current proposal, there's no clean path. If we think there's > a > > >> good chance we might want to do this in the future, then we should > > update > > >> the proposed interfaces. > > >> > > >> The current proposed return type from `VersionedKeyValueStore<K, > > >> V>#get(key, tsTo)` is `ValueAndTimestamp<V>`. There's no way to add a > > >> second timestamp into `ValueAndTimestamp<V>`, which is why there's no > > clean > > >> path to include validTo timestamp in the future under the existing > > >> proposal. > > >> > > >> If we wanted to allow for including validTo timestamp in the future, > > we'd > > >> instead update the return type to be a new `VersionedRecord<V>` > object. 
> > >> Today a `VersionedRecord<V>` could just include `value` and > `timestamp`, > > >> and in the future we could add `validTo` (names subject to change) > into > > the > > >> `VersionedRecord` as well. (It'd look a little strange for now since > > >> VersionedRecord is the same as ValueAndTimestamp, but that seems > fine.) > > >> > > >> If we choose to do this, I think we should also update the return type > > of > > >> `VersionedKeyValueStore#get(key)` to be VersionedRecord as well, > rather > > >> than having one return TimestampAndValue while the other returns > > >> VersionedRecord. > > >> > > >> ----------- Sagar's comments ----------- > > >> > > >>> 1) Did you consider adding a method similar to : > > >> List<ValueAndTimeStamp<V>> get(K key, long from, long to)? > > >> I think this could be useful considering that this > > >> versioning scheme unlocks time travel at a key basis. WDYT? > > >> > > >> Yes, I do think this method is valuable. I think we will definitely > > want to > > >> support time-range based queries at some point (hopefully soon), and > > likely > > >> also key-range based queries (to achieve feature parity with existing > > >> key-value stores). > > >> > > >> It's not immediately clear to me whether these types of queries should > > be > > >> supported as part of the store interface or if they should only be > > >> supported via the `query(...)` method for IQv2. (It's an open question > > as > > >> to whether we should support IQv1 for versioned stores or only IQv2. A > > >> benefit of IQv2 over IQv1 is that we won't need to add individual > store > > >> methods for each type of query, including for all wrapped store > layers.) > > >> > > >> If we have clear non-IQ use cases for these methods (e.g., use cases > > within > > >> processors), then they'll need to be added as part of the store > > interface > > >> for sure. I'm leaning towards adding them as part of the store > interface > > >> but given the ambiguity here, it may be preferrable to defer to a > > follow-up > > >> KIP. OTOH, if you think the versioned store interface as proposed in > > this > > >> KIP is too bare bones to be useful, I'm open to adding it in now as > > well. > > >> > > >>> 2) I have a similar question as Matthias, about the timestampTo > > argument > > >> when doing a get. Is it inclusive or exclusive? > > >> > > >> Same answer (and follow-up question) as above. Do you think it will be > > >> confusing for `get(key, tsTo)` to use an inclusive time bound, while > > >> `get(key, tsFrom, tsTo)` would use an exclusive tsTo time bound? Maybe > > we > > >> should rename `get(key, tsFrom, tsTo)` to `getVersions(...)` or > > >> `getRange(...)` in order to avoid confusion. > > >> > > >>> 3) validFrom sounds slightly confusing to me. It is essentially the > > >> timestamp at which the record was inserted. validFrom makes it sound > > like > > >> validTo which can keep changing based on new records while *from* is > > fixed. > > >> WDYT? > > >> > > >> "It is essentially the timestamp at which the record was inserted" <-- > > Yes, > > >> that's correct. > > >> > > >> I borrowed the "validFrom/validTo" terminology from temporal tables, > > e.g., > > >> > > >> > > > https://learn.microsoft.com/en-us/sql/relational-databases/tables/temporal-tables?view=sql-server-ver16 > > >> . > > >> I don't believe the terms "validFrom" or "validTo" are currently > exposed > > >> anywhere in any of the user-facing interfaces (or Javadocs); I just > > needed > > >> a way to refer to the concepts in the KIP. 
Hopefully this is a > non-issue > > >> (at least for now) as a result. Do you have a suggestion for > terminology > > >> that would've been less confusing? > > >> > > >>> 4) Even I think delete api should be supported. > > >> > > >> Makes sense. It'd be to get your input on the same follow-up > questions I > > >> asked Matthias above as well :) > > >> > > >> On Tue, Nov 22, 2022 at 4:25 AM Sagar <sagarmeansoc...@gmail.com> > > wrote: > > >> > > >>> Hi Victoria, > > >>> > > >>> Thanks for the KIP. Seems like a very interesting idea! > > >>> > > >>> I have a couple of questions: > > >>> > > >>> 1) Did you consider adding a method similar to : > > >>> List<ValueAndTimeStamp<V>> get(K key, long from, long to)? > > >>> > > >>> I think this could be useful considering that this > > >>> versioning scheme unlocks time travel at a key basis. WDYT? > > >>> > > >>> 2) I have a similar question as Matthias, about the timestampTo > > argument > > >>> when doing a get. Is it inclusive or exclusive? > > >>> > > >>> 3) validFrom sounds slightly confusing to me. It is essentially the > > >>> timestamp at which the record was inserted. validFrom makes it sound > > like > > >>> validTo which can keep changing based on new records while *from* is > > >> fixed. > > >>> WDYT? > > >>> > > >>> 4) Even I think delete api should be supported. > > >>> > > >>> Thanks! > > >>> Sagar. > > >>> > > >>> On Tue, Nov 22, 2022 at 8:02 AM Matthias J. Sax <mj...@apache.org> > > >> wrote: > > >>> > > >>>> Thanks for the KIP Victoria. Very well written! > > >>>> > > >>>> > > >>>> Couple of questions (many might just require to add some more > details > > >> to > > >>>> the KIP): > > >>>> > > >>>> (1) Why does the new store not extend KeyValueStore, but > > StateStore? > > >>>> In the end, it's a KeyValueStore? > > >>>> > > >>>> (2) Should we have a ReadOnlyVersionedKeyValueStore? Even if we > > don't > > >>>> want to support IQ in this KIP, it might be good to add this > interface > > >>>> right away to avoid complications for follow up KIPs? Or won't there > > by > > >>>> any complications anyway? > > >>>> > > >>>> (3) Why do we not have a `delete(key)` method? I am ok with not > > >>>> supporting all methods from existing KV-store, but a `delete(key)` > > >> seems > > >>>> to be fundamentally to have? > > >>>> > > >>>> (4a) Do we need `get(key)`? It seems to be the same as `get(key, > > >>>> MAX_VALUE)`? Maybe is good to have as syntactic sugar though? Just > for > > >>>> my own clarification (should we add something to the JavaDocs?). > > >>>> > > >>>> (4b) Should we throw an exception if a user queries out-of-bound > > >>>> instead of returning `null` (in `get(key,ts)`)? > > >>>> -> You put it into "rejected alternatives", and I understand > your > > >>>> argument. Would love to get input from others about this question > > >>>> though. -- It seems we also return `null` for windowed stores, so > > maybe > > >>>> the strongest argument is to align to existing behavior? Or do we > have > > >>>> case for which the current behavior is problematic? > > >>>> > > >>>> (4c) JavaDoc on `get(key,ts)` says: "(up to store implementation > > >>>> discretion when this is the case)" -> Should we make it a stricter > > >>>> contract such that the user can reason about it better (there is WIP > > to > > >>>> make retention time a strict bound for windowed stores atm) > > >>>> -> JavaDocs on `persistentVersionedKeyValueStore` seems to > > suggest a > > >>>> strict bound, too. > > >>>> > > >>>> (5a) Do we need to expose `segmentInterval`? 
For windowed-stores, > > we > > >>>> also use segments but hard-code it to two (it was exposed in earlier > > >>>> versions but it seems not useful, even if we would be open to expose > > it > > >>>> again if there is user demand). > > >>>> > > >>>> (5b) JavaDocs says: "Performance degrades as more record versions > > for > > >>>> the same key are collected in a single segment. On the other hand, > > >>>> out-of-order writes and reads which access older segments may slow > > down > > >>>> if there are too many segments." -- Wondering if JavaDocs should > make > > >>>> any statements about expected performance? Seems to be an > > >> implementation > > >>>> detail? > > >>>> > > >>>> (6) validTo timestamp is "exclusive", right? Ie, if I query > > >>>> `get(key,ts[=validToV1])` I would get `null` or the "next" record v2 > > >>>> with validFromV2=ts? > > >>>> > > >>>> (7) The KIP says, that segments are stores in the same RocksDB -- > > for > > >>>> this case, how are efficient deletes handled? For windowed-store, we > > >> can > > >>>> just delete a full RocksDB. > > >>>> > > >>>> (8) Rejected alternatives: you propose to not return the validTo > > >>>> timestamp -- if we find it useful in the future to return it, would > > >>>> there be a clean path to change it accordingly? > > >>>> > > >>>> > > >>>> -Matthias > > >>>> > > >>>> > > >>>> On 11/16/22 9:57 PM, Victoria Xia wrote: > > >>>>> Hi everyone, > > >>>>> > > >>>>> I have a proposal for introducing versioned state stores in Kafka > > >>>> Streams. > > >>>>> Versioned state stores are similar to key-value stores except they > > >> can > > >>>>> store multiple record versions for a single key. This KIP focuses > on > > >>>>> interfaces only in order to limit the scope of the KIP. > > >>>>> > > >>>>> > > >>>> > > >>> > > >> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores > > >>>>> > > >>>>> Thanks, > > >>>>> Victoria > > >>>>> > > >>>> > > >>> > > >> > > > > > >