I think the proposed changes are good, I just wanted to make sure that they don't interfere with what other people are doing.

I also proposed these steps on the GitHub PR:

Also, for actually doing the changes I suggest separate steps, i.e. separate commits, possibly with separate PRs, to make reviewing easier and to keep the changes more isolated:

 - Rename StreamConfig to StreamTaskConfig and make it serialisable, instead of relying on an underlying Configuration. This means that the StreamTaskConfig itself has fields for storing settings.
 - Introduce OperatorConfig and move only those fields that the operator should see from StreamTaskConfig to OperatorConfig. Initialize the operator with an OperatorConfig.

Regarding what to put in the OperatorConfig and what in the StreamTaskConfig: why are these still in the OperatorConfig?

 2) streamOperator
 3) input serializer
 4) output edges and serializers
 5) chain.index

I think only the StreamTask, which is responsible for building the OperatorChain, needs to have that information.

Best,
Aljoscha

> On 4. Jul 2017, at 15:56, xu <xupingyong...@163.com> wrote:
>
> Hi all,
>
> I am sorry for working on StreamConfig
> (https://github.com/apache/flink/pull/4241), which may conflict with
> others' work, before discussing it.
>
> Motivation:
> A Task contains one or more operators with chaining; however, the configs
> of the operators and of the task are all put in StreamConfig. For example,
> when an operator is set up with the StreamConfig, it can see the interface
> for physicalEdges or chained.task.configs, which is confusing. Similarly, a
> streamTask should not see the interface for chain.index.
>
> So we need to separate OperatorConfig from StreamConfig. A streamTask
> initializes with the streamConfig, then extracts the operatorConfigs from
> it and builds a streamOperator with each operatorConfig.
>
> OperatorConfig: for the streamOperator to set up with; it contains
> information that belongs only to the streamOperator:
> 1) operator information: name, id
> 2) streamOperator
> 3) input serializer
> 4) output edges and serializers
> 5) chain.index
> 6) state.key.serializer
>
> StreamConfig: for the streamTask to use:
> 1) in.physical.edges
> 2) out.physical.edges
> 3) chained OperatorConfigs
> 4) execution environment: checkpoint, state.backend and so on...
>
> Proposed Change
> I propose these overall changes:
> 1) Build jobGraph from streamGraph
> 2) StreamOperator is set up with an operatorConfig, so the setup
> interface needs to change
>
> (1) Build jobGraph from streamGraph
> When building, we first get the operatorConfig of every streamNode, and
> then put the operatorConfigs of the streamNodes into a streamConfig when
> we chain them into a jobVertex.
>
> (2) StreamOperator setup with OperatorProperties
> An OperatorConfig is provided instead of a streamConfig when the
> streamOperator sets up. Thanks to the advice of StephanEwen, OperatorConfig
> does not need to have a Map of "configKey" to values; it is just a
> serializable class with the respective fields. StreamConfig still relies
> on an underlying Configuration, because the streamConfig is passed along
> through its underlying configuration.
>
> Others have probably already thought about this, and maybe someone has
> been working on it. I need your advice.
>
> Thanks a lot for replying and Best Regards.
>
> JiPing
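
To illustrate the split discussed in this thread, here is a minimal Java sketch. It is only an assumption of what the two classes could look like, not the code from the PR: the field names, the constructors, the `roundTrip` helper and the use of plain strings in place of real edges, serializers and state backends are all hypothetical. It only shows the idea of serializable classes with fields instead of a key/value Configuration, with the task extracting one OperatorConfig per chained operator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical: the only settings an operator gets to see. Plain fields, no key/value map. */
final class OperatorConfig implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String operatorName;
    private final String operatorId;
    // Placeholder: a real version would hold the serializer itself, not its name.
    private final String stateKeySerializerName;

    OperatorConfig(String operatorName, String operatorId, String stateKeySerializerName) {
        this.operatorName = operatorName;
        this.operatorId = operatorId;
        this.stateKeySerializerName = stateKeySerializerName;
    }

    String getOperatorName() { return operatorName; }
    String getOperatorId() { return operatorId; }
    String getStateKeySerializerName() { return stateKeySerializerName; }
}

/** Hypothetical: task-level settings, also stored in fields instead of a Configuration. */
final class StreamTaskConfig implements Serializable {
    private static final long serialVersionUID = 1L;

    private final ArrayList<String> inPhysicalEdges;
    private final ArrayList<String> outPhysicalEdges;
    private final ArrayList<OperatorConfig> chainedOperatorConfigs;
    private final String stateBackendName;

    StreamTaskConfig(List<String> inPhysicalEdges, List<String> outPhysicalEdges,
                     List<OperatorConfig> chainedOperatorConfigs, String stateBackendName) {
        // Defensive copies into ArrayList, which is itself Serializable.
        this.inPhysicalEdges = new ArrayList<>(inPhysicalEdges);
        this.outPhysicalEdges = new ArrayList<>(outPhysicalEdges);
        this.chainedOperatorConfigs = new ArrayList<>(chainedOperatorConfigs);
        this.stateBackendName = stateBackendName;
    }

    List<String> getInPhysicalEdges() { return Collections.unmodifiableList(inPhysicalEdges); }
    List<String> getOutPhysicalEdges() { return Collections.unmodifiableList(outPhysicalEdges); }
    List<OperatorConfig> getChainedOperatorConfigs() {
        return Collections.unmodifiableList(chainedOperatorConfigs);
    }
    String getStateBackendName() { return stateBackendName; }
}

final class ConfigSplitSketch {
    /** Serializes and deserializes a value, standing in for shipping the config to a task. */
    static <T extends Serializable> T roundTrip(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("config could not be serialized", e);
        }
    }

    public static void main(String[] args) {
        StreamTaskConfig taskConfig = new StreamTaskConfig(
            Arrays.asList("in-0"), Arrays.asList("out-0"),
            Arrays.asList(new OperatorConfig("Map", "op-1", "StringSerializer"),
                          new OperatorConfig("Filter", "op-2", "StringSerializer")),
            "memory");

        // The task receives its config, then hands each operator only its own OperatorConfig.
        StreamTaskConfig received = roundTrip(taskConfig);
        for (OperatorConfig operatorConfig : received.getChainedOperatorConfigs()) {
            System.out.println("setting up operator " + operatorConfig.getOperatorName());
        }
    }
}
```

In this sketch the operator never sees the physical edges or the chained configs, because those live only on the task-level class.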