Thanks for the reply, Chris. I got solution 1 more or less implemented while
getting my bearings. I then started looking into solution 2 and made some
progress, but now I'm starting to wonder how well the shared state store fits
our particular use case. As I mentioned, we need to use a bootst

Hey Tommy,
Your summary sounds pretty accurate. One other way, which requires no
change to Samza, would be to repartition the input topic properly for each
task. This is kind of hacky, though.
(2) is the ideal solution. It is a bit of work, but it might not be so bad.
I think most of the changes

We have a Kafka topic containing data needed by several Samza jobs. These jobs
will essentially read the data and build up state that will be used for
processing their inputs. Ideally, we would use the topic as a bootstrap stream
to build up this state. The problem with that is the topic contai