Jon,

the recently introduced GlobalKTables ("global tables") allow you to perform
non-key lookups.
See
http://docs.confluent.io/current/streams/developer-guide.html#kstream-globalktable-join
(and the javadocs link)

> So called "internal" values can't be looked up.

If I understand you correctly:  GlobalKTables allow you to do that.  You
can provide a KeyValueMapper that computes, on the fly from each stream
record's key and value, the "new key" against which the global-table lookup
should be performed.  For example, if your stream records have String keys
and JSON values, you can extract a particular field from the JSON payload
and use it as the lookup key into the GlobalKTable.
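
To make that a bit more concrete, here is a rough plain-Java model of the
lookup.  It has no Kafka dependency, so it is not the Streams API itself:
the Map stands in for the GlobalKTable, the first lambda plays the role of
the KeyValueMapper, and the second plays the role of the ValueJoiner.  The
Schedule class, its field names, and the sample data are made up for
illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiFunction;

public class GlobalTableLookupSketch {

    // Hypothetical stream value: one schedule row holding numeric foreign keys.
    static final class Schedule {
        final String student;
        final int classId;
        final int teacherId;

        Schedule(String student, int classId, int teacherId) {
            this.student = student;
            this.classId = classId;
            this.teacherId = teacherId;
        }
    }

    // Models a left join against a global table: keyMapper derives the lookup
    // key from the stream record, and joiner receives null when nothing matches.
    static <K, V, GK, GV, R> R leftJoin(K key, V value, Map<GK, GV> globalTable,
                                        BiFunction<K, V, GK> keyMapper,
                                        BiFunction<V, GV, R> joiner) {
        GK lookupKey = keyMapper.apply(key, value);
        GV match = (lookupKey == null) ? null : globalTable.get(lookupKey);
        return joiner.apply(value, match);
    }

    // Resolves both numeric ids of a schedule row into names, one lookup per table.
    static String describe(String key, Schedule s,
                           Map<Integer, String> classes,
                           Map<Integer, String> teachers) {
        String className = leftJoin(key, s, classes,
                (k, v) -> v.classId,
                (v, name) -> name == null ? "unknown class" : name);
        String teacherName = leftJoin(key, s, teachers,
                (k, v) -> v.teacherId,
                (v, name) -> name == null ? "unknown teacher" : name);
        return s.student + ": " + className + " / " + teacherName;
    }

    public static void main(String[] args) {
        Map<Integer, String> classes = new HashMap<>();
        classes.put(101, "Algebra");
        Map<Integer, String> teachers = new HashMap<>();
        teachers.put(7, "Ms. Rivera");

        System.out.println(describe("row-1", new Schedule("alice", 101, 7), classes, teachers));
        System.out.println(describe("row-2", new Schedule("bob", 999, 7), classes, teachers));
    }
}
```

In a real topology you would hand the mapper and the joiner to KStream#join
or KStream#leftJoin together with the GlobalKTable, once per lookup table;
the stream does not need to be re-keyed between lookups.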

-Michael


On Thu, Apr 20, 2017 at 2:20 PM, Jon Yeargers <jon.yearg...@cedexis.com>
wrote:

> I'd like to further my immersion in kafka-as-database by doing more
> extensive key/val joins. Specifically there are many instances in the DB
> world where one is given a numeric field and needs to look up the
> appropriate string translation / value. Imagine a record of student/class
> data where all the courses are numbered and one must determine class /
> instructor names for a hard copy.
>
> Something akin to
>
> select <some columns> from schedules
>    left join classes on schedules.classid = classes.id
>    left join teachers on schedules.teacherid = teachers.id
>    left join textbooks on schedules.textbookid = textbooks.id
>
> ... and so on.
>
> In the KTable world (AFAICT) this is only possible for the key the source
> record uses. So called "internal" values can't be looked up. I could
> imagine running each record through a 'map' cycle to rearrange the key for
> each lookup column, remap and repeat but this seems a bit onerous. Perhaps
> using a Process step one could use additional streams? Dunno.
>
> Using log-compaction these 'matching/lookup' topics could be kept
> available.
>
> Was reading this missive (
> https://cwiki.apache.org/confluence/display/KAFKA/Discussion%3A+Non-key+KTable-KTable+Joins).
> Seems like the right direction but misses this point.
>
> Any thoughts on this? Am I missing an obvious solution? (I hope so - this
> would be a cool use case)
>
