Thanks for the KIP! Great to see a first step towards using the new
versioned stores!
I think the described tradeoffs make sense and I like make a pragmatic
step into the right direction, and avoid boiling the ocean. Thus, I
agree to the proposed solution.
One minor thing, that I believe just need clarification in the KIP (does
not seem to be a change to the KIP itself):
For stream-table joins, I think we need to elaborate that a `get(k, ts)`
call now might return `null` if the history retention of the store is
too short. For inner-joins it would result in no output record (ie,
stream input record is dropped). Would be good to have it mentioned in
the KIP explicitly.
We should also discuss how left-joins should work for this case. I think
it's ok (better) to include the stream record in the result if the
lookup returns `null` -- either because no key exist in the exiting
history for the provided timestamp, or (the actual case in question)
because we query older than available history. If you agree, can we add
this to the KIP?
For left-table-table joins, there seems to be no special impact, but it
should be called out, too. The lookup itself does not go into the
history of the table so no change here (as we don't have the "query
older than history case") -- and for out-of-order records, we just
"drop" them anyway, so no change for left-joins either I believe.
-Matthias
On 3/15/23 2:00 PM, Guozhang Wang wrote:
Sounds good to me. Thanks!
On Wed, Mar 15, 2023 at 12:07 PM Victoria Xia
<victoria....@confluent.io.invalid> wrote:
Thanks for kicking off the discussion, John and Guozhang!
Just one thing that might be out of scope: if users want to enable the
versioned table feature across the topology, should we allow them to do it
via a single config rather than changing the materialized object at each
place?
Yes, I think this would be a great usability improvement and am in favor of
introducing such a config. As long as the config defaults to using
unversioned stores (which makes sense anyway), there will be no
compatibility concerns with introducing the config in a future release.
It's out of scope for this particular KIP as a result, but can hopefully be
introduced as part of the next release after 3.5.
Best,
Victoria
On Wed, Mar 15, 2023 at 10:49 AM Guozhang Wang <guozhang.wang...@gmail.com>
wrote:
Thanks Victoria for the great writeup, with a thorough analysis and
trade-offs. I do not have any major questions about the proposal.
Just one thing that might be out of scope: if users want to enable the
versioned table feature across the topology, should we allow them to
do it via a single config rather than changing the materialized object
at each place? Maybe we can defer that for future discussions, but
just want to hear your thoughts.
Anyways, I think this proposal is great just as-is even if we agree to
do the configuration improvement later.
Guozhang
On Thu, Mar 9, 2023 at 7:52 PM John Roesler <vvcep...@apache.org> wrote:
Thanks for the KIP, Victoria!
I had some questions/concerns, but you addressed them in the Rejected
Alternatives section. Thanks for the thorough proposal!
-John
On Thu, Mar 9, 2023, at 18:59, Victoria Xia wrote:
Hi everyone,
I have a proposal for updating Kafka Streams's stream-table join and
table-table join semantics for the new versioned key-value state stores
introduced in KIP-889
<
https://cwiki.apache.org/confluence/display/KAFKA/KIP-889%3A+Versioned+State+Stores
.
Would love to hear your thoughts and suggestions.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-914%3A+Join+Processor+Semantics+for+Versioned+Stores
Thanks,
Victoria