I think the "scenario trace" is very nice, but has one point that I found confusing:
You indicate a retraction in the join output as (k,null) and a join result as (k, leftval, rightval), but confusingly, you also write a join result as (k, JoinResult) when one side is null. Maybe just call it as (k, leftval, null) or (k, null, rightval)? That way the readers can more easily determine if the results meet their expectations for each join type. (procedural note: if you update the KIP, you might want to send a new "diff link" to this thread, since the one I posted at the beginning would not automatically show your latest changes) I was initially concerned that the proposed algorithm would wind up propagating something that looks like a left join (k, leftval, null) under the case that Joe pointed out, but after reviewing your scenario, I see that it will emit a tombstone (k, null) instead. This is appropriate, and unavoidable, since we have to retract the join result from the logical view (the join result is a logical Table). Looking closely at the proposal, can you explain more about the propagateIfNull field in SubscriptionResponseWrapper? It sort of looks like it's always going to be equal to (RHS-result != null). In other words, can we drop that field and just send back RHS-result or null, and then handle it on the left-hand side like: if (rhsOriginalValueHash doesn't match) { emit nothing, just drop the update } else if (joinType==inner && rhsValue == null) { emit tombstone } else { emit joiner(lhsValue, rhsValue) } To your concern about emitting extra tombstones, personally, I think it's fine. Clearly, we should try to avoid unnecessary tombstones, but all things considered, it's not harmful to emit some unnecessary tombstones: their payload is small, and they are trivial to handle downstream. If users want to, they can materialize the join result to suppress any extra tombstones, so there's a way out. Thanks for the awesome idea. It's better than what I was thinking. -john On Wed, Jun 26, 2019 at 11:37 AM Adam Bellemare <adam.bellem...@gmail.com> wrote: > > Thanks John. > > I'm looking forward to any feedback on this. In the meantime I will work on > the unit tests to ensure that we have well-defined and readable coverage. > > At the moment I cannot see a way around emitting (k,null) whenever we emit > an event that lacks a matching foreign key on the RHS, except in the > (k,null) -> (k,fk) case. > If this LHS oldValue=null, we know we would have emitted a deletion and so > (k,null) would be emitted out of the join. In this case we don't need to > send another null. > > Adam > > On Wed, Jun 26, 2019 at 11:53 AM John Roesler <j...@confluent.io> wrote: > > > Hi Adam, > > > > Thanks for the proposed revision to your KIP > > ( > > https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=74684836&selectedPageVersions=77&selectedPageVersions=74 > > ) > > > > in response to the concern pointed out during code review > > (https://github.com/apache/kafka/pull/5527#issuecomment-505137962) > > > > We should have a brief discussion thread (here) in the mailing list to > > make sure everyone who wants to gets a chance to consider the > > modification to the design. > > > > Thanks, > > -John > >