Srikanth,

> It's not a case for join() as the keys don't match. It's more a lookup table.
Yes, the semantics of streaming joins in Kafka Streams are a bit different from joins in traditional RDBMS. See
http://docs.confluent.io/3.0.0/streams/developer-guide.html#joining-streams.

Also, a heads up: it turned out that user questions around joins in Kafka Streams are pretty common. We are currently working on improving the documentation for joins to make this clearer.

Best wishes,
Michael

On Thu, Jul 14, 2016 at 10:12 AM, Matthias J. Sax <matth...@confluent.io> wrote:
> I would recommend re-partitioning as described in Option 2.
>
> -Matthias
>
> On 07/13/2016 10:53 PM, Srikanth wrote:
> > Hello,
> >
> > I'm trying the following join using KTable. There are two changelog tables.
> >
> > Table1
> > 111 -> aaa
> > 222 -> bbb
> > 333 -> aaa
> >
> > Table2
> > aaa -> 999
> > bbb -> 888
> > ccc -> 777
> >
> > My result table should be
> > 111 -> 999
> > 222 -> 888
> > 333 -> 999
> >
> > It's not a case for join() as the keys don't match. It's more a lookup table.
> >
> > Option 1 is to use Table1.toStream().process(ProcessorSupplier, "storeName").
> > punctuate() will use a regular Kafka consumer that reads updates from Table2 and updates a private map.
> > process() will do a key-value lookup.
> > This has an advantage when Table1 is much larger than Table2.
> > Each instance of the processor will have to hold the entire Table2.
> >
> > Option 2 is to re-partition Table1 using through(StreamPartitioner) and partition by value.
> > This will ensure co-location. Then join with Table2. This part might be tricky??
> >
> > Your comments and suggestions are welcome!
> >
> > Srikanth

--
Michael G. Noll | Product Manager | Confluent | +1 650.453.5860
Download Apache Kafka and Confluent Platform: www.confluent.io/download
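
A minimal sketch of Option 2's re-key-then-join approach, written against the current Kafka Streams DSL (StreamsBuilder, which postdates the 0.10.0 API discussed in the thread). Topic names "table1-topic", "table2-topic", and "result-topic" are placeholders, and keys/values are treated as plain Strings for simplicity; this is an illustration of the idea, not the exact code from the thread.

    import org.apache.kafka.common.serialization.Serdes;
    import org.apache.kafka.streams.KafkaStreams;
    import org.apache.kafka.streams.KeyValue;
    import org.apache.kafka.streams.StreamsBuilder;
    import org.apache.kafka.streams.StreamsConfig;
    import org.apache.kafka.streams.kstream.Consumed;
    import org.apache.kafka.streams.kstream.KStream;
    import org.apache.kafka.streams.kstream.KTable;
    import org.apache.kafka.streams.kstream.Produced;

    import java.util.Properties;

    public class LookupJoinSketch {

      public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();

        // Table1 changelog, e.g. 111 -> "aaa" (placeholder topic name).
        KStream<String, String> table1Stream =
            builder.stream("table1-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Table2 changelog, e.g. "aaa" -> "999", materialized as a KTable for lookups.
        KTable<String, String> table2 =
            builder.table("table2-topic", Consumed.with(Serdes.String(), Serdes.String()));

        // Swap key and value: (111, "aaa") becomes ("aaa", "111"), so Table1 records
        // are keyed like Table2. The map() marks the stream for repartitioning and
        // Streams inserts the repartition topic automatically before the join,
        // which gives the co-location Option 2 asks for.
        KStream<String, String> rekeyed =
            table1Stream.map((table1Key, table1Value) -> KeyValue.pair(table1Value, table1Key));

        // KStream-KTable join: for each re-keyed Table1 record, look up Table2's value.
        // Duplicate join keys (111 -> aaa and 333 -> aaa) are fine, since each stream
        // record is joined independently. The original key is carried in the value.
        KStream<String, String> joined =
            rekeyed.join(table2, (table1Key, table2Value) -> table1Key + ":" + table2Value);

        // Restore Table1's original key: ("aaa", "111:999") becomes ("111", "999").
        KStream<String, String> result =
            joined.map((joinKey, pair) -> {
              String[] parts = pair.split(":", 2);
              return KeyValue.pair(parts[0], parts[1]);
            });

        result.to("result-topic", Produced.with(Serdes.String(), Serdes.String()));

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "lookup-join-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
      }
    }

Note that a KStream-KTable join looks up Table2 only at the moment each re-keyed Table1 record arrives; later updates to Table2 do not retroactively revise earlier results. Newer Kafka Streams releases (2.4 and later) also provide a KTable-KTable foreign-key join (KTable#join with a foreign-key extractor) that covers this lookup-table case directly.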