Hi Paul,

To add to Hequn's answer: broadcast state is typically used for "a low-throughput stream containing a set of rules which we want to evaluate against all elements coming from another stream" [1].

So one more item for the list of differences is whether the state is "broadcast" across all keys when you are processing a keyed stream. This typically matters when it is not possible to derive the same key field for both streams with a KeySelector in a CoStream.

Another difference is performance: the contents of the BroadcastStream are "stored locally and [are] used to process all incoming elements on the other stream", so the size of the broadcast state has to be managed carefully.

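As a rough illustration of the "rules evaluated against all elements" pattern, here is a plain-Java model of a single parallel operator instance. It is not Flink code: the class and method names (`RuleEvaluatorSketch`, `onRule`, `onElement`) are made up for this sketch. In Flink the two callbacks would correspond roughly to `processBroadcastElement` and `processElement` of a `KeyedBroadcastProcessFunction`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntPredicate;

// Plain-Java model of one parallel operator instance; not the Flink API.
final class RuleEvaluatorSketch {
    // Local copy of the rule set. It is not scoped by key, so every
    // element sees every rule, whatever its key is.
    private final Map<String, IntPredicate> rules = new LinkedHashMap<>();

    // Broadcast side: the only place where the rule set is modified.
    void onRule(String ruleName, IntPredicate rule) {
        rules.put(ruleName, rule);
    }

    // Non-broadcast (keyed) side: reads the rules but never changes them.
    List<String> onElement(String key, int value) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, IntPredicate> entry : rules.entrySet()) {
            if (entry.getValue().test(value)) {
                matches.add(key + ":" + entry.getKey());
            }
        }
        return matches;
    }
}
```

Because the whole rule set lives in memory on every parallel instance, it only works well while the rule stream stays small, which is the size concern mentioned above.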
[1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/state/broadcast_state.html

On Sun, Aug 19, 2018 at 1:40 AM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Paul,
>
> There are some differences:
> 1. The BroadcastStream can broadcast data for you, i.e, data will be
> broadcasted to all downstream tasks automatically.
> 2. To guarantee that the contents in the Broadcast State are the same
> across all parallel instances of our operator, read-write access is only
> given to the broadcast side
> 3. For BroadcastState, flink guarantees that upon restoring/rescaling
> there will be no duplicates and no missing data. In case of recovery with
> the same or smaller parallelism, each task reads its checkpointed state.
> Upon scaling up, each task reads its own state, and the remaining tasks
> (p_new-p_old) read checkpoints of previous tasks in a round-robin manner.
> While MapState doesn't have such abilities.
>
> Best, Hequn
>
> On Sun, Aug 19, 2018 at 11:18 AM, Paul Lam <paullin3...@gmail.com> wrote:
>
>> Hi,
>>
>> AFAIK, the difference between a BroadcastStream and a normal DataStream
>> is that the BroadcastStream is with a BroadcastState, but it seems that the
>> functionality of BroadcastState can also be achieved by MapState in a
>> CoMapFunction or something since the control stream is still broadcasted
>> without being turned into BroadcastStream. So, I’m wondering what’s the
>> advantage of using BroadcastState? Thanks a lot!
>>
>> Best Regards,
>> Paul Lam
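
Regarding point 3 in Hequn's list quoted above, the restore rule can be sketched in a few lines of plain Java. This is only a simplified model of the behaviour as described in the thread, not Flink's actual implementation, and `BroadcastRestoreSketch` and `sourceSubtask` are hypothetical names. It assumes "round-robin" means the extra subtasks cycle through the old subtasks in index order.

```java
// Simplified model of the restore rule quoted above; not Flink's actual code.
final class BroadcastRestoreSketch {
    // Returns the index of the old subtask whose checkpointed broadcast
    // state the new subtask with index newIndex reads on restore.
    static int sourceSubtask(int newIndex, int oldParallelism) {
        if (newIndex < 0 || oldParallelism <= 0) {
            throw new IllegalArgumentException("invalid index or parallelism");
        }
        // Subtasks that already existed (newIndex < oldParallelism) map to
        // themselves; the extra p_new - p_old subtasks wrap around.
        return newIndex % oldParallelism;
    }

    public static void main(String[] args) {
        int oldParallelism = 2;
        int newParallelism = 5;
        for (int i = 0; i < newParallelism; i++) {
            System.out.println("new subtask " + i
                    + " reads state of old subtask " + sourceSubtask(i, oldParallelism));
        }
    }
}
```

Since every parallel instance holds the same broadcast state, any old copy is a valid source for a new subtask, and with the same or a smaller parallelism each remaining subtask simply keeps its own copy.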