Yi, thanks for the input. I've been out sick, so please excuse the delayed
response. I am still working out the use case with my team and will report
back next week.
Thanks!
On Tue, May 5, 2015 at 4:01 PM, Yi Pan wrote:
> Hi, Andreas,
>
> Are you describing a use case where the *same* copy of da
Hi, Andreas,
Are you describing a use case where the *same* copy of data is shared among
all tasks? That will depend on a lot of factors:
1. Is your data size huge?
2. Can your data be partitioned to work with a single partition of input
stream?
3. Do you have a means to bootstrap the data from a str
Hi Yan, thanks for the reply.
So yes, you are correct: it would not be random which partition a message
hits. We would use a partition key (sorry I missed that).
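To illustrate the partition-key point, here is a minimal sketch of how a key can map deterministically to a partition. The function name `partition_for`, the sample keys, and the partition count of 4 are made up for illustration, and CRC32 is only a stand-in hash; this is not the partitioner that Kafka or Samza actually uses.

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition index (hypothetical helper, not a Samza/Kafka API)."""
    # A stable hash of the key bytes, reduced modulo the partition count.
    # CRC32 is an assumption chosen because it is deterministic across runs.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


keys = ["user-1", "user-2", "user-1"]
assignments = [partition_for(k, 4) for k in keys]

# The first and third messages share a key, so they land on the same partition
# and would therefore be handled by the same task.
same_partition = assignments[0] == assignments[2]
```

With a scheme like this, all messages for a given key reach one task, so that task's local store only needs the data for the keys it owns.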
The "data" I was referring to is the local KV-store data for each task. Is
there a way to synchronize or replicate the data from the KV-
Hi Andreas,
I don't quite understand this part:
"Because the messages coming into the input stream are random (i.e. can hit
any partition and therefore any task), each task will need its own copy of
the data (i.e. the data needs to be duplicated across each task)."
Messages come into the input stream