Github user tzulitai commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5922#discussion_r185452174

    --- Diff: docs/dev/stream/state/broadcast_state.md ---
    @@ -0,0 +1,281 @@
    +---
    +title: "The Broadcast State Pattern"
    +nav-parent_id: streaming_state
    +nav-pos: 2
    +---
    +<!--
    +Licensed to the Apache Software Foundation (ASF) under one
    +or more contributor license agreements.  See the NOTICE file
    +distributed with this work for additional information
    +regarding copyright ownership.  The ASF licenses this file
    +to you under the Apache License, Version 2.0 (the
    +"License"); you may not use this file except in compliance
    +with the License.  You may obtain a copy of the License at
    +
    +  http://www.apache.org/licenses/LICENSE-2.0
    +
    +Unless required by applicable law or agreed to in writing,
    +software distributed under the License is distributed on an
    +"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
    +KIND, either express or implied.  See the License for the
    +specific language governing permissions and limitations
    +under the License.
    +-->
    +
    +* ToC
    +{:toc}
    +
    +[Working with State](state.html) described operator state which is either **evenly** distributed among the parallel
    +tasks of an operator, or state which **upon restore**, its partial (task) states are **unioned** and the whole state is
    +used to initialize the restored parallel tasks.
    +
    +A third type of supported *operator state* is the *Broadcast State*. Broadcast state was introduced to support use-cases
    +where some data coming from one stream is required to be broadcasted to all downstream tasks, where it is stored locally
    +and is used to process all incoming elements on the other stream. As an example where broadcast state can emerge as a
    +natural fit, one can imagine a low-throughput stream containing a set of rules which we want to evaluate against all
    +elements coming from another stream. Having the above type of use-cases in mind, broadcast state differs from the rest
    +of operator states in that:
    + 1. it has a map format,
    + 2. it is only available to streams whose elements are *broadcasted*,
    --- End diff --

    How about, "it is only available to operators which have a broadcasted input stream"?

    This might only be a matter of personal preference, so please take this with a grain of salt. It's just that I find it easier to understand when thinking in terms of states and operators.
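
To make the "states and operators" framing concrete, here is a minimal plain-Java sketch of the idea. It is not Flink code and does not use or mirror the Flink API: the names `BroadcastStateSketch`, `Task`, `Operator`, `broadcast` and `process` are hypothetical, and the modulo routing of non-broadcast elements is an assumption made only for illustration. It shows an operator with one broadcasted input, where each parallel task keeps its own local map of rules and evaluates incoming elements against it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java simulation, no Flink dependency. All names are hypothetical.
class BroadcastStateSketch {

    // One parallel task of an operator that has a broadcasted input stream.
    static final class Task {
        // The state has a map format and is stored locally in each task.
        // TreeMap is used only to keep the iteration order deterministic.
        private final Map<String, Integer> rules = new TreeMap<>();

        // Broadcast side: the task stores the rule it received.
        void onBroadcastElement(String ruleName, int threshold) {
            rules.put(ruleName, threshold);
        }

        // Non-broadcast side: evaluate the element against all local rules.
        List<String> onElement(int value) {
            List<String> matched = new ArrayList<>();
            for (Map.Entry<String, Integer> rule : rules.entrySet()) {
                if (value > rule.getValue()) {
                    matched.add(rule.getKey());
                }
            }
            return matched;
        }
    }

    // The operator: broadcast elements reach every task, other elements reach one.
    static final class Operator {
        private final List<Task> tasks = new ArrayList<>();

        Operator(int parallelism) {
            for (int i = 0; i < parallelism; i++) {
                tasks.add(new Task());
            }
        }

        // Deliver a rule to all parallel tasks.
        void broadcast(String ruleName, int threshold) {
            for (Task task : tasks) {
                task.onBroadcastElement(ruleName, threshold);
            }
        }

        // Route an element to a single task (assumed routing: value modulo parallelism).
        List<String> process(int value) {
            Task task = tasks.get(Math.floorMod(value, tasks.size()));
            return task.onElement(value);
        }
    }

    public static void main(String[] args) {
        Operator operator = new Operator(3);
        operator.broadcast("above-10", 10);
        System.out.println(operator.process(4));
        System.out.println(operator.process(11));
        System.out.println(operator.process(12));
    }
}
```

In this sketch the rule map belongs to the operator's tasks rather than to either stream, which is the reading the suggested wording is meant to encourage.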
---