Hi,
Regarding tombstones:
As far as I understand, we need to add either a validTo field to
VersionedRecord or we need to return tombstones, otherwise the result is
not complete, because users could never know a record was deleted at
some point before the second non-null value was put.
I like more adding the validTo field since it makes the result more
concise and easier interpretable.
Extending on Victoria's example, with the following puts
put(k, v1, time=0)
put(k, null, time=5)
put(k, null, time=10)
put(k, null, time=15)
put(k, v2, time=20)
the result with tombstones would be
value, timestamp
(v1, 0)
(null, 5)
(null, 10)
(null, 15)
(v2, 20)
instead of
value, timestamp, validTo
(v1, 0, 5)
(v2, 20, null)
The benefit of conciseness would already apply to one single tombstone.
On the other hand, why would somebody write consecutive tombstones into
a versioned state store? I guess if somebody does that on purpose, then
there should be a way to retrieve each of those tombstones, right?
So maybe we need both -- validTo field and the option to return
tombstones. The latter might be moved to a future KIP in case we see the
need.
Regarding .within(fromTs, toTs):
I would keep it simple with .from() and .asOfTimestamp() (or .until()).
If we go with .within(), I would opt for .withinTimeRange(fromTs, toTs),
because the query becomes more readable:
MultiVersionedKeyQuery
.withKey(1)
.withinTimeRange(Instant.parse(2023-08-03T10:37:30.00Z),
Instant.parse(2023-08-04T10:37:30.00Z))
If we stay with .from() and .until(), we should consider .fromTime() and
.untilTime() (or .toTime()):
MultiVersionedKeyQuery
.withKey(1)
.fromTime(Instant.parse(2023-08-03T10:37:30.00Z))
.untilTime(Instant.parse(2023-08-04T10:37:30.00Z))
Regarding asOf vs. until:
I think asOf() is more used in point in time queries as Walker mentioned
where this KIP specifies a time range. IMO asOf() fits very well with
KIP-960 where one version is queried, but here I think .until() fits
better. That might just be a matter of taste and in the end I am fine
with both as long as it is well documented.
Regarding getters without "get":
In the other IQv2 classes we used getters with "get". In general, we
tend to move away from using getters without "get", recently. So I would
keep the "get".
Best,
Bruno
On 10/3/23 7:49 PM, Walker Carlson wrote:
Hey Alieh thanks for the KIP,
Weighing in on the AsOf vs Until debate I think either is fine from a
natural language perspective. Personally AsOf makes more sense to me where
until gives me the idea that the query is making a change. It's totally a
connotative difference and not that important. I think as of is pretty
frequently used in point of time queries.
Also for these methods it makes sense to drop the "get" We don't
normally use that in getters
* The key that was specified for this query.
*/
public K getKey();
/**
* The starting time point of the query, if specified
*/
public Optional<Instant> getFromTimestamp();
/**
* The ending time point of the query, if specified
*/
public Optional<Instant> getAsOfTimestamp();
Other than that I didn't have too much to add. Overall I like the direction
of the KIP and think the funcatinlyt is all there!
best,
Walker
On Mon, Oct 2, 2023 at 10:46 PM Matthias J. Sax <mj...@apache.org> wrote:
Thanks for the updated KIP. Overall I like it.
Victoria raises a very good point, and I personally tend to prefer (I
believe so does Victoria, but it's not totally clear from her email) if
a range query would not return any tombstones, ie, only two records in
Victoria's example. Thus, it seems best to include a `validTo` ts-field
to `VersionedRecord` -- otherwise, the retrieved result cannot be
interpreted correctly.
Not sure what others think about it.
I would also be open to actually add a `includeDeletes()` (or
`includeTombstones()`) method/flag (disabled by default) to allow users
to get all tombstone: this would only be helpful if there are two
consecutive tombstone though (if I got it right), so not sure if we want
to add it or not -- it seems also possible to add it later if there is
user demand for it, so it might be a premature addition as this point?
Nit:
the public interface ValueIterator is used
"is used" -> "is added" (otherwise it sounds like as if `ValueIterator`
exist already)
Should we also add a `.within(fromTs, toTs)` (or maybe some better
name?) to allow specifying both bounds at once? The existing
`RangeQuery` does the same for specifying the key-range, so might be
good to add for time-range too?
-Matthias
On 9/6/23 5:01 AM, Bruno Cadonna wrote:
In my last e-mail I missed to finish a sentence.
"I think from a KIP"
should be
"I think the KIP looks good!"
On 9/6/23 1:59 PM, Bruno Cadonna wrote:
Hi Alieh,
Thanks for the KIP!
I think from a KIP
1.
I propose to throw an IllegalArgumentException or an
IllegalStateException for meaningless combinations. In any case, the
KIP should specify what exception is thrown.
2.
Why does not specifying a range return the latest version? I would
expect that it returns all versions since an empty lower or upper
limit is interpreted as no limit.
3.
I second Matthias comment about replacing "asOf" with "until" or "to".
4.
Do we need "allVersions()"? As I said above I would return all
versions if no limits are specified. I think if we get rid of
allVersions() there might not be any meaningless combinations anymore.
If a user applies twice the same limit like for example
MultiVersionedKeyQuery.with(key).from(t1).from(t2) the last one wins.
5.
Could you add some more examples with time ranges to the example
section?
6.
The KIP misses the test plan section.
7.
I propose to rename the class to "MultiVersionKeyQuery" since we are
querying multiple versions of the same key.
8.
Could you also add withAscendingTimestamps()? IMO it gives users the
possibility to make their code more readable instead of only relying
on the default.
Best,
Bruno
On 8/17/23 4:13 AM, Matthias J. Sax wrote:
Thanks for splitting this part into a separate KIP!
For `withKey()` we should be explicit that `null` is not allowed.
(Looking into existing `KeyQuery` it seems the JavaDocs don't cover
this either -- would you like to do a tiny cleanup PR for this, or
fix on-the-side in one of your PRs?)
The key query returns all the records that are valid in the time
range starting from the timestamp {@code fromTimestamp}.
In the JavaDocs you use the phrase `are valid` -- I think we need to
explain what "valid" means? It might even be worth to add some
examples. It's annoying, but being precise if kinda important.
With regard to KIP-962, should we allow `null` for time bounds ? The
JavaDocs should also be explicit if `null` is allowed or not and what
the semantics are if allowed.
You are using `asOf()` however, because we are doing time-range
queries, to me using `until()` to describe the upper bound would
sound better (I am not a native speaker though, so maybe I am off?)
The key query returns all the records that have timestamp <= {@code
asOfTimestamp}.
This is only correct if not lower-bound is set, right?
In your reply to KIP-960 you mentioned:
the meaningless combinations are prevented by throwing exceptions.
We should add corresponding JavaDocs like:
@throws IllegalArgumentException if {@code fromTimestamp} is
equal or
larger than {@code untilTimestamp}
Or something similar.
With regard to KIP-960: if we need to introduce a `VersionedKeyQuery`
class for single-key-single-ts lookup, would we need to find a new
name for the query class of this KIP, given that the return type is
different?
-Matthias
On 8/16/23 10:57 AM, Alieh Saeedi wrote:
Hi all,
I splitted KIP-960
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-960%3A+Support+single-key_single-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores
into three separate KIPs. Therefore, please continue discussions
about single-key, multi-timestamp interactive queries here. You can
see all
the addressed reviews on the following page. Thanks in advance.
KIP-968: Support single-key_multi-timestamp interactive queries
(IQv2) for
versioned state stores
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-968%3A+Support+single-key_multi-timestamp+interactive+queries+%28IQv2%29+for+versioned+state+stores
I look forward to your feedback!
Cheers,
Alieh