Hello, The KTable join semantics is not exactly the same with that of a RDBMS. You can fine detailed semantics in the web docs (search for Joining Streams):
http://docs.confluent.io/3.0.0/streams/developer-guide.html#kafka-streams-dsl In a nutshell, the joiner will be triggered only if both / left / either of the joining streams has the matching record with the key of the incoming received record (so the input values of the joiner could not be null / can be null for only the other value / can be null on either values, but not both), and otherwise a pair of {join-key, null} is output. We made this design deliberately just to make sure that "table-table joins are eventually consistent". This gives a kind of resilience to late arrival of records that a late arrival in either stream can "update" the join result. Guozhang On Mon, Jul 4, 2016 at 6:10 PM, Philippe Derome <phder...@gmail.com> wrote: > Same happens for regular join, keys that appear only in one stream will > make it to output KTable tC with a null for either input stream. I guess > it's related to Kafka-3911 Enforce ktable Materialization or umbrella JIRA > 3909, Queryable state for Kafka Streams? > > On Mon, Jul 4, 2016 at 8:45 PM, Philippe Derome <phder...@gmail.com> > wrote: > > > If we have two streams A and B for which we associate tables tA and tB, > > then create a table tC as ta.leftJoin(tB, <some value joiner>) and then > we > > have a key kB in stream B but never made it to tA nor tC, do we need to > > inject a pair (k,v) of (kB, null) into resulting change log for tC ? > > > > It sounds like it is definitely necessary if key kB is present in table > tC > > but if not, why add it? > > > > I have an example that reproduces this and would like to know if it is > > considered normal, sub-optimal, or a defect. I don't view it as normal > for > > time being, particularly considering stream A as having very few keys > and B > > as having many, which could lead to an unnecessary large change log for > C. > > > > Phil > > > -- -- Guozhang