Hi,
I am very excited about all of this in general. Sadly I haven't had the
time to take a really deep look. One thing that has always been
difficult when resolving many-to-many relationships in table x table x
table joins is the repartitioning that has to happen at some point.
From the documentation I saw this:
"The *keys* of data records determine the partitioning of data in both
Kafka and Kafka Streams, i.e. how data is routed to specific partitions
within topics."
This feels unnecessarily restrictive, as I can't currently imagine how to
resolve many-to-many relationships with this. One could also emit every
record to many partitions to make up for the lack of read replicas in
Kafka, as well as to support partitioning schemes that don't work like
this (shards processing overlapping key spaces).
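To make the concern concrete, here is a rough sketch of the two routing
styles, not of Kafka's actual partitioner. Everything in it is an
assumption made for illustration: the partition count, the function
names partition_for and fan_out, and the use of CRC32 as a stand-in
hash. Key-based routing sends a record to exactly one partition, while
a many-to-many join would need the record to reach the partition of
every key it relates to.

```python
import zlib

NUM_PARTITIONS = 4  # hypothetical partition count, chosen for illustration


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Key-based routing: a given key always maps to exactly one partition.

    CRC32 is only a stand-in hash here, not the hash Kafka itself uses.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def fan_out(related_keys, num_partitions: int = NUM_PARTITIONS):
    """Fan-out routing: return every partition that needs a copy of a record.

    The record is re-keyed once per related key, so it lands wherever the
    other side of the relationship lives. Duplicates are collapsed.
    """
    return sorted({partition_for(k, num_partitions) for k in related_keys})


if __name__ == "__main__":
    # A student enrolled in three courses: one record, several target partitions.
    print(partition_for("student-7"))
    print(fan_out(["course-1", "course-2", "course-3"]))
```

With a single key per record only the first style is possible, which is
why the join has to repartition (re-key and re-emit) at some point.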
I would really love to hear your thoughts on these topics. Great work!
Google grade technologies for everyone!
I <3 logs
On 10.03.2016 22:26, Jay Kreps wrote:
Hey all,
Lots of people have probably seen the ongoing work on Kafka Streams
happening. There is no real way to design a system like this in a vacuum,
so we put up a blog, some snapshot docs, and something you can download and
use easily to get feedback:
http://www.confluent.io/blog/introducing-kafka-streams-stream-processing-made-simple
We'd love comments or thoughts from anyone...
-Jay