I think we should also discuss a little more about `validTo()` method?
Given that "latest" version does not have a valid-to TS, should we change the return type to `Optional` and return `empty()` for "latest"?
ATM the KIP uses `MAX_VALUE` for "latest" what seems to be less clean? We could also use `-1` (unknown), but both might be less expressive than `Optional`?
-Matthias On 11/20/23 1:59 AM, Bruno Cadonna wrote:
Hi Alieh,Although, I've already voted, I found a minor miss. You should also add a method isDescending() since the results could also be unordered now that we agreed that the results are unordered by default. If both -- isDescending() and isAscending -- are false neither withDescendingTimestamps() nor withAscendingTimestamps() was called.Best, Bruno On 11/17/23 11:25 AM, Alieh Saeedi wrote:Hi all, Thank you for the feedback. So we agreed on no default ordering for keys and TSs. So I must provide both withAscendingXx() and with DescendingXx() for the class.Apart from that, I think we can either remove the existing constructor forthe `VersionedRecord` class or follow the `Optional` thing. Since many hidden aspects of the KIP are quite clear now and we have come to a consensus about them, I think it 's time to vote ;-) I look forward to your votes. Thanks a lot. Cheers, Alieh On Fri, Nov 17, 2023 at 2:27 AM Matthias J. Sax <mj...@apache.org> wrote:Thanks, Alieh. Overall SGTM. About `validTo` -- wondering if we should make it an `Optional` and set to `empty()` by default? I am totally ok with going with the 3-way option about ordering using default "undefined". For this KIP (as it's all net new) nothing really changes. -- However, we should amend `RangeQuery`/KIP-985 to align it. Btw: so far we focused on key-ordering, but I believe the same "ordering undefined by default" would apply to time-ordering, too? This might affect KIP-997, too. -Matthias On 11/16/23 12:51 AM, Bruno Cadonna wrote:Hi, 80) We do not keep backwards compatibility with IQv1, right? I would even say that currently we do not need to keep backwards compatibility among IQv2 versions since we marked the API "Evolving" (do we only mean code compatibility here or also behavioral compatibility?). I propose to try to not limit ourselves for backwards compatibility that we explicitly marked as evolving. I re-read the discussion on KIP-985. In that discussion, we were quite focused on what the state store provides. I see that for range queries,we have methods on the state store interface that specify the order, butthat should be kind of orthogonal to the IQv2 query type. Let's assume somebody in the future adds a state store implementation that is not order based. To account for use cases where the order does not matter, this person might also add a method to the state store interface thatdoes not guarantee any order. However, our range query type is specifiedto guarantee order by default. So we need to add something like withNoOrder() to the query type to allow the use cases that does notneed order and has the better performance in IQ. That does not look verynice to me. Having the no-order-guaranteed option does not cost us anything and it keeps the IQv2 interface flexible. I assume we want to drop the Evolving annotation at some point. Sorry for not having brought this up in the discussion about KIP-985. Best, Bruno On 11/15/23 6:56 AM, Matthias J. Sax wrote:Just catching up on this one. 50) I am also in favor of setting `validTo` in VersionedRecord for single-key single-ts lookup; it seems better to return the proper timestamp. The timestamp is already in the store and it's cheap to extract it and add to the result, and it might be valuable information for the user. Not sure though if we should deprecate the existing constructor though, because for "latest" it's convenient to have? 60) Yes, I meant `VersionedRecord`. Sorry for the mixup. 80) We did discuss this question on KIP-985 (maybe you missed it Bruno). It's kinda tricky. Historically, it seems that IQv1, ie, the `ReadOnlyXxx` interfaces provide a clear contract that `range()` is ascending and `reverseRange()` is descending. For `RangeQuery`, the question is, if we did implicitly inherit this contract? Our conclusion on KIP-985 discussion was, that we did inherit it. If this holds true, changing the contract would be a breaking change (what might still be acceptable, given that the interface is annotated as unstable, and that IQv2 is not widely adopted yet). I am happy to go with the 3-option contract, but just want to ensure we all agree it's the right way to go, and we are potentially willing to pay the price of backward incompatibility.Do we need a snapshot semantic or can we specify a weaker but still useful semantic?I don't think we necessarily need it, but as pointed out by Lucas, all existing queries provide it. Overall, my main point is really about not implementing something "random", but defining a proper binding contract that allows users to reason about it. I general, I agree that weaker semantics might be sufficient, but I am not sure if we can implement anything weaker in a reasonable way? Happy to be convinced otherwise. (I have some example, that I will omit for now, as I hope we can actually go with snapshot semantics.) The RocksDB Snaptshot idea from Lucas sounds very promising. I was not aware that we could do this with RocksDB (otherwise I might have suggested it on the PR right away). I guess the only open question remaining would be, if we can provide the same guarantees for a future in-memory implementation for VersionedStores? It sounds possible to do, but we should have some level of confidence about it? -Matthias On 11/14/23 6:33 AM, Lucas Brutschy wrote:Hi Alieh, I agree with Bruno that a weaker guarantee could be useful already,and it's anyway possible to strengthen the guarantees later on. On theother hand, it would be nice to provide a consistent view across allsegments if it doesn't come with major downsides, because until now IQdoes provide a consistent view (via iterators), and this would be the first feature that diverges from that guarantee.I think a consistent could be achieved relatively easily by creating a snapshot (https://github.com/facebook/rocksdb/wiki/Snapshot) that willensure a consistent view on a single DB and using `ReadOptions.setSnapshot` for the gets. Since versioned state stores segments seem to be backed by a single RocksDB instance (that wasunclear to me during our earlier discussion), a single snapshot shouldbe enough to guarantee a consistent view - we just need to make sure to clean up the snapshots after use, similar to iterators. If we instead need a consistent view across multiple RocksDB instances, we may have to acquire a write lock on all of those (could use the current object monitors of our `RocksDB` interface) for the duration of creating snapshots across all instances - which I think would also be permissible performance-wise. Snapshots are just a sequence number and should be pretty lightweight to create (they have, however, downside when it comes to compaction just like iterators). With that in mind, I would be in favor of at least exploring the option of using snapshots for a consistent view here, before dropping this useful guarantee. Cheers, Lucas On Tue, Nov 14, 2023 at 2:20 PM Bruno Cadonna <cado...@apache.org> wrote:Hi Alieh, Regarding the semantics/guarantees of the query type: Do we need a snapshot semantic or can we specify a weaker but still useful semantic? An option could be to guarantee that:1. the query returns the latest version when the query arrived at thestate store2. the query returns a valid history, i.e., versions with adjacent andnon-overlapping validity intervals. I think guarantee 1 is quite obvious. Guarantee 2 maybe needs some explanation. If we do not avoid writes to the state store during the processing of interactive queries, it might for example happen thatthelatest version in the state store moves to data structures that areresponsible for older versions. In our RocksDB implementation that arethe segments. Thus, it could be that during query processing Kafka Streams reads the latest value x and encounters again x in a segment because a processor put a newer version of x in the versioned state store. A similar scenario might also happen to earlier versions. If Streams does not account for such cases it could return invalid histories. Maybe such weaker guarantees are enough for most use cases. You could consider implementing weaker guarantees as I described andifthere is demand propose stricter guarantees in a follow-up KIP. Maybe there are also other simpler guarantees that make sense. Best, Bruno On 11/9/23 12:30 PM, Bruno Cadonna wrote:Hi, Thanks for the updates! First my take on previous comments: 50) I am in favor of deprecating the constructor that does not take the validTo parameter. That implies that I am in favor of modifying get(key, asOf) to set the correct validTo. 60) I am in favor of renaming ValueIterator to VersionedRecordIteratoranddefine it as: public interface VersionedRecordIterator<V> extends Iterator<VersionedRecord<V>> (Matthias, you mixed up ValueAndTimestamp with VersionedRecord inyourlast e-mail, didn't you? Just double-checking if I understood whatyouare proposing.) 70) I agree with Matthias that adding a new method on the VersionedKeyValueStore interface defeats the purpose of one of the goals of IQv2, i.e., not to need to extend the state store interface for IQ. I would say if we do not need the method in normal processing, weshouldnot extend the public state store interface. BTW, nobody forces youtoStoreQueryUtils. I think that internal utils class was introduced forconvenience to leverage existing methods on the state storeinterface.80) Why do we limit ourselves by specifying a default order on theresult?Different state store implementation might have different strategies to store the records which affects the order in which the records arereturned if they are not sorted before they are returned to the user.Some users might not be interested in an order of the result and sothere is no reason those users pay the cost for sorting. I propose tonot specify a default order but sort the results (if needed) when withDescendingX() and withAscendingX() is specified on the query object.Regarding the snapshot guarantees for the iterators, I need to thinkabit more about it. I will come back to this thread in the next days.Best, Bruno On 11/8/23 5:30 PM, Alieh Saeedi wrote:Thank you, Bruno and Matthias, for keeping the discussion going and for reviewing the PR. Here are the KIP updates: - I removed the `peek()` from the `ValueIterator` interface since we do not need it. - Yes, Bruno, the `validTo` field in the `VersionedRecod` class is exclusive. I updated the javadocs for that. Very important critical open questions. I list them here based on priority (descendingly). - I implemented the `get(key, fromtime, totime)` method here <https://github.com/aliehsaeedii/kafka/blob/9578b7cb7cdade22cc63f671693f5aeb993937ca/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBVersionedStore.java#L262:the problem is that this implementation does not guarantee consistency because processing might continue interleaved (no snapshot semantics isimplemented). More over, it materializes all results in memory. - Solution 1: Use a lock and release it after retrieving alldesired records from all segments. - positive point: snapshot semantics is implemented - negative points: 1) It is expensive since iterating over all segments may take a long time. 2) It still requires materializing results on memory - Solution 2: use `RocksDbIterator`. - positive points: 1) It guarantees snapshot segments. 2) It does not require materializing results in memory. - negative points: it is expensive because, anyway, we need to iterate over all (many) segments. Do you have any thoughts on this issue? (ref:Matthias'scomment <https://github.com/apache/kafka/pull/14626#pullrequestreview-1709280589>)- I added the field `validTo` in `VersionedRecord`. Its defaultvalue is MAX. But as Matthias mentioned, for the single-key single-ts (`VersionedKeyQuery` in KIP-960), it may not always be true. If the returned record belongs to an old segment, maybe it is notvalidany more. So MAX is not the correct value for `ValidTo`. Two solutions come to mind: - Solution 1: make the `ValidTo` as an `Optional` and set it to`empty` for the retuned result of `get(key, asOfTimestamp)`.- Solution 2: change the implementation of `get(key, asOfTimestamp)` so that it finds the correct `validTo` for the returned versionedRecord. - In this KIP and the next one, even though the default ordering is with ascending ts, I added the method `withAscendingTimestamps()` to have more user readibility (as Bruno suggested), while Hanyu did only add`withDescending...` methods (he did not need ascending becausethat's the default anyway). Matthias believes that we should not haveinconsistencies (he actually hates them:D). Shall I change myKIP or Hanyu? Thoughts? That would be maybe helpful to look into the PR <https://github.com/apache/kafka/pull/14626> for more clarity and even review that ;-) Cheers, Alieh On Thu, Nov 2, 2023 at 7:13 PM Bruno Cadonna <cado...@apache.org> wrote:Hi Alieh, First of all, I like the examples. Is validTo in VersionedRecord exclusive or inclusive? In the javadocs you write: "@param validTo the latest timestamp that value is valid" I think that is not true because the validity is defined by the starttime of the new version. The new and the old version cannot both bevalid at that same time. Theoretically, you could set validTo to the start time of the newversion - 1. However, what is the unit of the 1? Is it nanoseconds?Milliseconds? Seconds? Sure we could agree on one, but I think itismore elegant to just make the validTo exclusive. Actually, you used it as exclusive in your examples. Thanks for the KIP! Best, Bruno On 11/1/23 9:01 PM, Alieh Saeedi wrote:Hi all, @Matthias: I think Victoria was right. I must add the method `get(key, fromTime, toTime)` to the interface `VersionedKeyValueStore`.Rightnow, having the method only in `RocksDBVersionedStore`, made me to have an instance of `RocksDBVersionedStore` (instead of `VersionedKeyValueStore`) in `StoreQueryUtils.runMultiVersionedKeyQuery()` method. In future, wearegoing to use the same method for In-memory/SPDB/blaBla versioned stores. Then either this method won't work any more, or we have to add code (if clauses) for each type of versioned stores. What do you thinkaboutthat? Bests, Alieh On Tue, Oct 24, 2023 at 10:01 PM Alieh Saeedi <asae...@confluent.io>wrote:Thank you, Matthias, Bruno, and Guozhang for keeping the discussiongoing.Here is the list of changes I made: 1. I enriched the "Example" section as Bruno suggested. Do youpleasehave a look at that section? I think I devised an interesting one;-)2. As Matthias and Guozhang suggested, I renamed variables andmethodsas follows: - "fromTimestamp" -> "fromTime" - "asOfTimestamp" -> "toTime" - "from(Instant fromTime)" -> "fromTime(Instant fromTime)" - "asOf(Instant toTime)" -> "toTime(Instant toTime)" 3. About throwing an NPE when time bounds are `null`: Ok, Matthias, Iamconvinced by your idea. I consider a null timestamp as "nobound".But Ihad to change KIP-960 and the corresponding PR to be consistent aswell.Answering questions and some more discussion points: 1. For now, I keep the class names as they are. 2. About the new field "validTo" in VersionedRecord. Yes Matthias Ikeepthe old constructor, which does not have `validTo` as an inputparameter.But in the body of the old constructor, I consider the defaultvalue MAXfor the validTo field. I think MAX must be the default value for`validTo`since before inserting a tombstone or a new value for thesame key,thevalue must be valid till iternity. 3.Regarding changing the ValueIterator interface from `publicinterfaceValueIterator<V> extends Iterator<V>` to `public interfaceValueIterator<V> extends Iterator<VersionedRecord<V>>`: Matthias, Ido notknow how it improves type safety because the MultiVersionedKeyQueryclassreturns a ValueIterator of VersionedRecord any way. But if we wantto beconsistent with KeyValueIterator, we must apply the changes yousuggested.4. Regarding adding the new `get(key, fromTime, toTime)`methodto the public interface `VersionedKeyValueStore` or only adding it to the class `RocksDBVersionedStore`: In the KIP, I changed the interfaceasVictoria suggested. But still, I am not convinced why we do that.@Victoria:Do you please clarify why we have to define it in the interface?Morespecifically, why do we need to use generic VersionedKeyValueStore instead of simply using the implementing classes? Cheers, Alieh On Sat, Oct 14, 2023 at 3:35 AM Guozhang Wang <guozhang.wang...@gmail.com>wrote:Thanks Alieh for the KIP, as well as a nice summary of all the discussions! Just my 2c regarding your open questions: 1. I just checked KIP-889 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores) and we used "VersionedRecord<V> get(K key, longasOfTimestamp);",so I feel to be consistent with this API is better compared withbeingconsistent with "WindowKeyQuery"? 3. I agree with Matthias that naming is always tricky, and Ialsotend to be consistent with what we already have (even if in retro it may not be the best idea :P and if that was really becoming a complaint,we would change it across the board in one shot as well later).Guozhang On Wed, Oct 11, 2023 at 9:12 PM Matthias J. Sax <mj...@apache.org>wrote:Thanks for the update! Some thoughts about changes you made and open questions you raised: 10) About asOf vs until: I was just looking into `WindowKeyQuery`, `WindowRangeQuery` and also `ReadOnlyWindowStore` interfaces. Forthose,we use "timeFrom" and "timeTo", so it seems best to actuallyuse`to(Instant toTime)` to keep the naming consistent across the board? If yes, we should also do `from (Instant fromTime)` and use getters `fromTime()` and `toTime()` -- given that it's range bounds it seems acceptable to me, to diverge a little bit from KIP-960`asOfTimestamp()`-- but we could also rename it to `asOfTime()`? -- Given thatwestrongly type with `Instant` I am not worried about semanticambiguity.20) About throwing a NPE when time bounds are `null` -- why? (For thekey it makes sense as is mandatory to have a key.) Could we notinterpret `null` as "no bound". We did KIP-941 to add `null`foropen-ended `RangeQueries`, so I am wondering if we should just sticktothe same semantics? Cfhttps://cwiki.apache.org/confluence/display/KAFKA/KIP-941%3A+Range+queries+to+accept+null+lower+and+upper+bounds30) About the class naming. That's always tricky, and I am notmarried to my proposal. I agree with Bruno that the other suggested names are not really better. -- The underlying idea was, to get some"consistent"naming across the board. Existing `KeyQuery` New `VersionedKeyQuery` (KIP-960; we add a prefix)New `MultiVersionKeyQuery` (this KIP; extend the prefix with apre-prefix)Existing `RangeQuery` New `MultiVersionRangeQuery` (KIP-969; add same prefix asabove)40) I am fine with not adding `range(from, to)` -- it was just anidea.Some more follow up question: 50) You propose to add a new constructor and getter to`VersionedRecord`-- I am wondering if this implies that `validTo` is optional becausetheexisting constructor is not deprecated? -- Also, what happensif`validTo` is not set and `valueTo()` is called? Or do we intent tomake`validTo` mandatory? Maybe this question can only be answered when working on the code,but Iam wondering if we should make `validTo` mandatory or not...Andwhat the "blast radius" of changing `VersionedRecord` will be in general.Doyou have already some POC PR that we could look at to get somesignals about this? 60) The new query class is defined to return `ValueIterator<VersionedRecord<V>>` -- while I like the idea to add `ValueIterator<V>` in a generic way on the one hand, I am wondering if it might be better to change it, and enforce its usage (ie, returntype)of `VersionedRecord` to improve type safety (type erasure is often a pain, and we could mitigate it this way). Btw: We actually do a similar thing for `KeyValueIterator`. Ie, public interface ValueIterator<V> extendsIterator<ValueAndTimestamp<V>>and ValueAndTimestamp<V> peek(); This would imply that the return type of the new query is`ValueIterator<V>` on the interface what seems simpler and moreelegant?If we go with the change, I am also wondering if we need to find a better name for the new iterator class? Maybe `VersionIterator` or something like this? Of course it might limit the use of `ValueIterator` for other value types -- not sure if this a limitation that is prohibitive? My gut feeling is, that is should not be too limiting. 70) Do we really need the change in `VersionedKeyValueStore`andadd a new method? In the end, the idea of IQv2 is to avoid exactly this...Itwas the main issue for IQv1, that the base interface of the storeneededan update and thus all classed implementing the base interface,making it very cumbersome to add new query types. -- Of course, weneedthis new method on the actually implementation (as private method) that canbe called from `query()` method, but adding it to the interfaceseemstodefeat the purpose of IQv2. Note, for existing IQv2 queries types that go against others stores,thepublic methods already existed when IQv2 was introduces, and thus theimplementation of these query types just pragmatically re-usedexistingmethods -- but it does not imply that new public method should beadded.-Matthias On 10/11/23 5:11 AM, Bruno Cadonna wrote:Thanks for the updates, Alieh!The example in the KIP uses the allVersions() method which weagreedtoremove. Regarding your questions: 1. asOf vs. until: I am fine with both but slightly prefer until.2. What about KeyMultiVersionsQuery, KeyVersionsQuery (KIP-960would then be KeyVersionQuery). However, I am also fine with MultiVersionedKeyQuery since none of the names sounds betterorworsetome. 3. I agree with you not to introduce the method with the two boundstokeep things simple.4. Forget about fromTime() an asOfTime(), from() and asOf() isfine. 5. The main purpose is to show how to use the API. Maybe make anexamplewith just the key to distinguish this query from the single valuequeryof KIP-960 and then one with a key and a time range. When youiterateover the results you could also call validTo(). Maybe add someactual records in the comments to show what the result might look like. Regarding the test plan, I hope you also plan to add unit tests inallof your KIPs. Maybe you could also explain why system tests are not needed here. Best, Bruno On 10/10/23 5:36 PM, Alieh Saeedi wrote:Thank you all for the very exact and constructive comments. Ireally enjoyed reading your ideas and all the important points you made meawareof. I updated KIP-968 as follows: 1. If the key or time bounds are null, the method returns NPE.2. The "valid" word: I removed the sentence "all therecordsthat arevalid..." and replaced it with an exact explanation.Moreover, Iexplained it with an example in the KIP but not in the javadocs. Do Ineedto add the example to the javadocs as well?3. Since I followed Bruno's suggestion and removed theallVersions()method, the problem of meaningless combinations is solved, andIdo not need any IllegalArgumentException or something like that. Therefore, the change is that if no time bound is specified, thequeryreturns the records with the specified key for all timestamps (all versions). 4. As Victoria suggested, adding a method to the *VersionedKeyValueStore *interface is essential. So I did that. I had this method onlyin theRocksDBVersionedStore class, which was not enough.5. I added the *validTo* field to the VersionedRecordclass tobeable to represent the tombstones. As you suggested, we postponesolvingthe problem of retrieving consecutive tombstones forlater.6. I added the "Test Plan" section to all KIPs. I hopewhat I wrote is convincing. 7. I added the *withAscendingTimestamp()* method to providemorecode readability for the user. 8. I removed the evil word "get" from all getter methods. There have also been some more suggestions which I am still notconvincedor clear about them:1. Regarding asOf vs until: reading all comments, myconclusion was that I keep it as "asOf" (following Walker's idea as the nativespeakeras well as Bruno's suggestion to be consistent withsingle-key_single_tsqueries). But I do not have a personal preference. If you insist on"until",I change it. 2. Bruno suggested renaming the class "MultiVersionedKeyQuery"to sthelse. We already had a long discussion about the name with Matthias. I am open to renaming it to something else, but do you have anyideas?3. Matthias suggested having a method with two inputparametersthatenables the user to specify both time bounds in the samemethod.Isn't itintroducing redundancy? It is somehow disrespectful tothe ideaofhaving composable methods. 4. Bruno suggested renaming the methods "asOf" and "from" to "asOfTime" and "fromTime". If I do that, then it is not consistent withKIP-960.Moreover, the input parameter is clearly a timestamp, whichexplainsenough. What do you think about that? 5. I was asked to add more examples to the example section. My question is, what is the main purpose of that? If I know it clearly,thenIcan add what you mean. Cheers, Alieh On Tue, Oct 10, 2023 at 1:13 AM Matthias J. Sax <mj...@apache.org>wrote:Bruno and I had some background conversation about the `get`prefix question including a few other committers. The official policy was never changed, and we should not add the `get`-prefix. It's a slip on our side in previous KIPs to add the `get`-prefix and we should actually clean it up doing a follow upKIP.-Matthias On 10/5/23 5:26 AM, Bruno Cadonna wrote:Hi Matthias,Given all the IQv2 KIPs that use getX and given recent PRs(internalinterfaces mainly) that got merged, I was under the impressionthat wemoved away from the strict no-getX policy. I do not think it was an accident using getX in the IQv2 KIPssincesomebody would have brought it up, otherwise. I am fine with both types of getters. If we think, we need to discuss this in a broader context, let's start a separate thread. Best, Bruno On 10/5/23 7:44 AM, Matthias J. Sax wrote:I agree to (almost) everything what Bruno said.In general, we tend to move away from using getters without"get",recently. So I would keep the "get".This is new to me? Can you elaborate on this point? Why do youthinkthat's the case? I actually did realize (after Walker mentioned it) that existingquerytypes use `get` prefix, but to me it seems that it was byaccident andwe should consider correcting it? Thus, I would actually preferto notadd the `get` prefix for new methods query types. IMHO, we should do a follow up KIP to deprecate allmethodswith`get`prefix and replace them with new ones without `get` -- it's ofcoursealways kinda "unnecessary" noise, but if we don't do it,wemightgetinto more and more inconsistent naming what would result in a"bad"API. If we indeed want to change the convention and use the `get`prefix, Iwould strongly advocate to bit the bullet and do KIP topro-activelyadd the `get` "everywhere" it's missing... But overall, itseemsto bea much broader decision, and we should get buy in frommanycommittersabout it -- as long as there is no broad consensus to add `get` everywhere, I would strongly prefer not to diverge fromthecurrentagreement to omit `get`. -Matthias On 10/4/23 2:36 AM, Bruno Cadonna wrote:Hi, Regarding tombstones: As far as I understand, we need to add either a validTo field to VersionedRecord or we need to return tombstones, otherwise theresultis not complete, because users could never know a record wasdeletedat some point before the second non-null value was put. I like more adding the validTo field since it makes the resultmoreconcise and easier interpretable.Extending on Victoria's example, with the following putsput(k, v1, time=0) put(k, null, time=5) put(k, null, time=10) put(k, null, time=15) put(k, v2, time=20) the result with tombstones would be value, timestamp (v1, 0) (null, 5) (null, 10) (null, 15) (v2, 20) instead of value, timestamp, validTo (v1, 0, 5) (v2, 20, null) The benefit of conciseness would already apply to one single tombstone. On the other hand, why would somebody write consecutivetombstonesinto a versioned state store? I guess if somebody does that on purpose, then there should be a way to retrieve each of those tombstones, right?So maybe we need both -- validTo field and the option toreturntombstones. The latter might be moved to a future KIP incase weseethe need. Regarding .within(fromTs, toTs): I would keep it simple with .from() and .asOfTimestamp() (or .until()). If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs), because the query becomes morereadable:MultiVersionedKeyQuery .withKey(1) .withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z), Instant.parse(2023-08-04T10:37:30.00Z))If we stay with .from() and .until(), we should consider.fromTime()and .untilTime() (or .toTime()): MultiVersionedKeyQuery .withKey(1).fromTime(Instant.parse(2023-08-03T10:37:30.00Z)) .untilTime(Instant.parse(2023-08-04T10:37:30.00Z))Regarding asOf vs. until: I think asOf() is more used in point in time queries as Walker mentioned where this KIP specifies a time range. IMO asOf() fitsverywell with KIP-960 where one version is queried, but hereIthink .until() fits better. That might just be a matter of taste andin theend I am fine with both as long as it is well documented.Regarding getters without "get":In the other IQv2 classes we used getters with "get". Ingeneral, wetend to move away from using getters without "get", recently. SoIwould keep the "get". Best, Bruno On 10/3/23 7:49 PM, Walker Carlson wrote:Hey Alieh thanks for the KIP, Weighing in on the AsOf vs Until debate I think eitherisfine from anatural language perspective. Personally AsOf makes moresenseto mewhere until gives me the idea that the query is making a change. It's totally a connotative difference and not that important. I thinkasof is pretty frequently used in point of time queries. Also for these methods it makes sense to drop the "get" Wedon'tnormally use that in getters * The key that was specified for this query. */ public K getKey(); /** * The starting time point of the query, if specified */ public Optional<Instant> getFromTimestamp(); /** * The ending time point of the query, if specified */ public Optional<Instant> getAsOfTimestamp(); Other than that I didn't have too much to add. Overall I likethedirection of the KIP and think the funcatinlyt is all there! best, Walker On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax <mj...@apache.org>wrote:Thanks for the updated KIP. Overall I like it. Victoria raises a very good point, and I personally tend to prefer (I believe so does Victoria, but it's not totally clear from her email) if a range query would not return any tombstones, ie, only tworecordsinVictoria's example. Thus, it seems best to include a `validTo` ts-fieldto `VersionedRecord` -- otherwise, the retrieved resultcannotbeinterpreted correctly. Not sure what others think about it. I would also be open to actually add a `includeDeletes()` (or `includeTombstones()`) method/flag (disabled by default) toallowusers to get all tombstone: this would only be helpful if there aretwoconsecutive tombstone though (if I got it right), sonotsureif wewant to add it or not -- it seems also possible to add it later ifthereisuser demand for it, so it might be a premature addition asthispoint?Nit:the public interface ValueIterator is used"is used" -> "is added" (otherwise it sounds like as if`ValueIterator` exist already) Should we also add a `.within(fromTs, toTs)` (or maybe somebettername?) to allow specifying both bounds at once? The existing `RangeQuery` does the same for specifying the key-range, so might be good to add for time-range too? -Matthias On 9/6/23 5:01 AM, Bruno Cadonna wrote:In my last e-mail I missed to finish a sentence. "I think from a KIP" should be "I think the KIP looks good!" On 9/6/23 1:59 PM, Bruno Cadonna wrote:Hi Alieh, Thanks for the KIP! I think from a KIP 1.I propose to throw an IllegalArgumentException or anIllegalStateException for meaningless combinations. In anycase,theKIP should specify what exception is thrown. 2. Why does not specifying a range return the latest version? I would expect that it returns all versions since an empty lower orupperlimit is interpreted as no limit. 3.I second Matthias comment about replacing "asOf" with"until" or"to". 4. Do we need "allVersions()"? As I said above I would returnallversions if no limits are specified. I think if we get ridofallVersions() there might not be any meaninglesscombinationsanymore. If a user applies twice the same limit like for example MultiVersionedKeyQuery.with(key).from(t1).from(t2)thelastonewins. 5. Could you add some more examples with time ranges to theexamplesection?6. The KIP misses the test plan section. 7. I propose to rename the class to "MultiVersionKeyQuery"since wearequerying multiple versions of the same key. 8. Could you also add withAscendingTimestamps()? IMO it givesusersthepossibility to make their code more readable instead of onlyrelyingon the default. Best, Bruno On 8/17/23 4:13 AM, Matthias J. Sax wrote:Thanks for splitting this part into a separate KIP!For `withKey()` we should be explicit that `null` is not allowed. (Looking into existing `KeyQuery` it seems the JavaDocsdon'tcoverthis either -- would you like to do a tiny cleanup PR for this, or fix on-the-side in one of your PRs?)The key query returns all the records that are valid inthetime range starting from the timestamp {@code fromTimestamp}.In the JavaDocs you use the phrase `are valid` -- I thinkweneed to explain what "valid" means? It might even be worth to addsomeexamples. It's annoying, but being precise if kindaimportant.With regard to KIP-962, should we allow `null` for timebounds ?The JavaDocs should also be explicit if `null` is allowed ornot andwhat the semantics are if allowed.You are using `asOf()` however, because we are doingtime-rangequeries, to me using `until()` to describe the upperboundwouldsound better (I am not a native speaker though, so maybe Iamoff?)The key query returns all the records that have timestamp<={@code asOfTimestamp}.This is only correct if not lower-bound is set, right? In your reply to KIP-960 you mentioned:the meaningless combinations are prevented by throwingexceptions.We should add corresponding JavaDocs like:@throws IllegalArgumentException if {@codefromTimestamp} is equal or largerthan{@code untilTimestamp} Or something similar. With regard to KIP-960: if we need to introduce a `VersionedKeyQuery` class for single-key-single-ts lookup, would we need tofinda newname for the query class of this KIP, given that thereturntypeisdifferent? -Matthias On 8/16/23 10:57 AM, Alieh Saeedi wrote:Hi all, I splitted KIP-960 <https://cwiki.apache.org/confluence/display/KAFKA/KIP-960%3A+Support+single-key_single-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+storesinto three separate KIPs. Therefore, pleasecontinuediscussions about single-key, multi-timestamp interactive querieshere. Youcan see all the addressed reviews on the following page. Thanks inadvance.KIP-968: Support single-key_multi-timestamp interactivequeries(IQv2) for versioned state stores <https://cwiki.apache.org/confluence/display/KAFKA/KIP-968%3A+Support+single-key_multi-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+storesI look forward to your feedback! Cheers, Alieh