Hi,

Thank you for the comments, Kostas, Timo and Aljoscha. I also like the
pipeline/execution naming. I tried to apply most of your suggestions,
Aljoscha.
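
To make the naming concrete, here is a minimal sketch of how settings could
map to prefixed keys, following the pipeline / execution /
execution.checkpointing grouping from Aljoscha's mail. OptionKeySketch,
hyphenate and keyFor are hypothetical names made up for this illustration,
not Flink API, and the lower-case hyphenated form is only extrapolated from
the "pipeline.auto-type-registration" example; the actual keys on the wiki
page may differ.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Flink: derives a prefixed key from an
// ExecutionConfig-style setting name, using the grouping from the thread.
final class OptionKeySketch {

    // Setting name -> prefix. Only a handful of settings from the list.
    private static final Map<String, String> PREFIXES = new LinkedHashMap<>();
    static {
        PREFIXES.put("autoTypeRegistration", "pipeline");
        PREFIXES.put("forceKryo", "pipeline");
        PREFIXES.put("objectReuse", "pipeline");
        PREFIXES.put("latencyTrackingInterval", "execution");
        PREFIXES.put("bufferTimeout", "execution");
        PREFIXES.put("useSnapshotCompression", "execution.checkpointing");
    }

    // Converts camelCase to a lower-case hyphenated form (assumed key style).
    static String hyphenate(String camelCase) {
        StringBuilder sb = new StringBuilder();
        for (char c : camelCase.toCharArray()) {
            if (Character.isUpperCase(c)) {
                sb.append('-').append(Character.toLowerCase(c));
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds "<prefix>.<hyphenated-name>" or fails for unlisted settings.
    static String keyFor(String setting) {
        String prefix = PREFIXES.get(setting);
        if (prefix == null) {
            throw new IllegalArgumentException("Uncategorised setting: " + setting);
        }
        return prefix + "." + hyphenate(setting);
    }

    public static void main(String[] args) {
        for (String setting : PREFIXES.keySet()) {
            System.out.println(setting + " -> " + keyFor(setting));
        }
    }
}
```

With this sketch, keyFor("forceKryo") gives "pipeline.force-kryo", and a
setting that has not been categorised yet fails fast instead of silently
getting a default prefix.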

There are a few cases where I did not. You mentioned a few options that
are already present, and I planned to reuse the existing options
(latencyTrackingInterval, setParallelism, etc.).

I would still expose the two options from your "Maybe don’t expose"
section. They are currently exposed in the Table API module (the initial
motivation of this FLIP was to enable passing the config from the Table
module). Moreover, I think it is important for users to have some way to
configure the Kryo serializers.

I updated the FLIP's wiki page and will start voting on it.

Best,

Dawid

On 18/10/2019 17:19, Aljoscha Krettek wrote:
> Hi,
>
> In general, I’m also for “execution" compared to just “exec”. For some of 
> these options, though, I’m wondering whether “pipeline.<option>” or 
> “job.<option>” makes more sense. Over time, a lot of things have accumulated 
> in ExecutionConfig but a lot of them are not execution related, I think. For 
> example, auto-type-registration would make more sense as 
> “pipeline.auto-type-registration”. For some other options, I think we should 
> consider not exposing them via the configuration if we don’t think that we 
> want to have them in the long term.
>
> I’ll try to categorise what I think:
>
> Don’t expose:
>  - defaultInputDependencyConstraint (I think this is an internal flag for the 
> Blink runner)
>  - executionMode (I think this is also Blink internals)
>  - printProgressDuringExecution (I don’t know if this flag still does 
> anything)
>
> Maybe don’t expose:
>  - defaultKryoSerializerClasses
>  - setGlobalJobParameters (if we expose it it should be “pipeline”)
>
> pipeline/job:
>  - autoTypeRegistration
>  - autoWatermarkInterval
>  - closureCleaner
>  - disableGenericTypes
>  - enableAutoGeneratedUIDs
>  - forceAvro
>  - forceKryo
>  - setMaxParallelism
>  - setParallelism
>  - objectReuse (this one is hard, could be execution)
>  - registeredKryoTypes
>  - registeredPojoTypes
>  - timeCharacteristic
>  - isChainingEnabled
>  - cachedFile
>
> execution:
>  - latencyTrackingInterval
>  - setRestartStrategy
>  - taskCancellationIntervalMillis
>  - taskCancellationTimeoutMillis
>  - bufferTimeout
>
> checkpointing: (this might be “execution.checkpointing”)
>  - useSnapshotCompression
>  - <the other checkpointing settings in the doc>
>  - defaultStateBackend
>
> What do you think?
>
> Best,
> Aljoscha
>
>
>> On 17. Oct 2019, at 09:32, Timo Walther <twal...@apache.org> wrote:
>>
>> Sounds good to me.
>>
>> Thanks,
>>
>> Timo
>>
>>
>> On 17.10.19 09:30, Kostas Kloudas wrote:
>>> Hi Timo,
>>>
>>> I agree that distinguishing between "executor" and "execution" when
>>> scanning through a configuration file can be difficult. These names
>>> were mainly influenced by the fact that FLIP-73 introduced the
>>> "Executor".
>>> In addition, I agree that "deployment" or "deploy" sound like good
>>> alternatives. Between the two, I would go with "deployment" (although
>>> I like "deploy" more, as it is more imperative) for the simple
>>> reason that we do not use verbs anywhere else (I think) in config
>>> options.
>>>
>>> Now for the "exec" or "execution", personally I like the longer
>>> version as it is clearer.
>>>
>>> So, to summarise, I would vote for "deployment", "execution", and
>>> "pipeline" for job invariants, like the jars.
>>>
>>> What do you think?
>>>
>>> Cheers,
>>> Kostas
>>>
>>> On Wed, Oct 16, 2019 at 5:28 PM Timo Walther <twal...@apache.org> wrote:
>>>> Hi Kostas,
>>>>
>>>> can we still discuss the naming of the properties? For me, having
>>>> "execution" and "executor" as prefixes might be confusing in the future
>>>> and difficult to identify if you scan through a list of properties.
>>>>
>>>> How about `deployment` and `execution`? Or `deployer` and `exec`?
>>>>
>>>> Regards,
>>>> Timo
>>>>
>>>> On 16.10.19 16:31, Kostas Kloudas wrote:
>>>>> Hi all,
>>>>>
>>>>> Thanks for opening the discussion!
>>>>>
>>>>> I like the idea, so +1 from my side and actually this is aligned with
>>>>> our intentions for the FLIP-73 effort.
>>>>>
>>>>> For the naming convention of the parameters introduced in the FLIP, my
>>>>> proposal would be to have the full word "execution" instead of the
>>>>> shorter "exec".
>>>>> The reason for this is that in the context of FLIP-73, we are also
>>>>> planning to introduce some new configuration parameters and the
>>>>> convention we are currently using is the following:
>>>>>
>>>>> pipeline.***: for job parameters that will not change between
>>>>> executions of the same job, e.g. the jar location
>>>>> executor.***: for parameters relevant to the instantiation of the
>>>>> correct executor, e.g. YARN, detached, etc
>>>>> execution.***: for parameters that are relevant to a specific
>>>>> execution of a given pipeline, e.g. parallelism or savepoint settings
>>>>>
>>>>> I understand that sometimes the boundaries may not be that clear for a
>>>>> parameter but I hope this will not be relevant to most of the
>>>>> parameters.
>>>>>
>>>>> I will also open a FLIP with some additional parameters but until then,
>>>>> this is the scheme that we are planning to follow.
>>>>>
>>>>> Cheers,
>>>>> Kostas
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 2, 2019 at 9:26 AM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>>>>>> Hi Gyula,
>>>>>>
>>>>>> Yes, you are right, we were also considering the external configurer. The
>>>>>> reason we suggest the built-in method is that it is more tightly coupled
>>>>>> with the place the options are actually set. Therefore our hope is that
>>>>>> whenever somebody e.g. adds new fields to the ExecutionConfig, he/she
>>>>>> also updates the configure method. I am not entirely against your
>>>>>> suggestion though, if this is the preferred way in the community.
>>>>>>
>>>>>> Does anyone have any comments regarding the option keys?
>>>>>>
>>>>>> Best,
>>>>>>
>>>>>> Dawid
>>>>>>
>>>>>> On 30/08/2019 14:57, Gyula Fóra wrote:
>>>>>>> Hi Dawid,
>>>>>>>
>>>>>>> Sorry I misread one of the interfaces a little (Configuration instead of
>>>>>>> ConfigurationReader), you are right.
>>>>>>> I was referring to:
>>>>>>>
>>>>>>>
>>>>>>>     void StreamExecutionEnvironment.configure(ConfigurationReader)
>>>>>>>
>>>>>>> This might be slightly orthogonal to the changes that you made here but
>>>>>>> what I meant is that instead of adding methods to the
>>>>>>> StreamExecutionEnvironment we could make this an external interface:
>>>>>>>
>>>>>>> interface EnvironmentConfigurer {
>>>>>>>    void configure(StreamExecutionEnvironment env, ConfigurationReader config);
>>>>>>> }
>>>>>>>
>>>>>>> We could then have a default implementation of the EnvironmentConfigurer
>>>>>>> that would understand built-in options. We could also allow users to
>>>>>>> pass custom implementations of this, which could configure the
>>>>>>> StreamExecutionEnvironment based on user-defined config options. This is
>>>>>>> just a rough idea for extensibility and probably out of scope at first.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Gyula
>>>>>>>
>>>>>>> On Fri, Aug 30, 2019 at 12:13 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>>>>>>>
>>>>>>>> Hi Gyula,
>>>>>>>>
>>>>>>>> Thank you for the support on those changes.
>>>>>>>>
>>>>>>>> I am not sure if I understood your idea for the "reconfiguration"
>>>>>>>> logic.
>>>>>>>>
>>>>>>>> The configure method on those objects would take a ConfigurationReader,
>>>>>>>> so the user can provide a thin wrapper around Configuration for e.g.
>>>>>>>> filtering certain options, changing values based on other parameters
>>>>>>>> etc. Is that what you had in mind?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>>
>>>>>>>> Dawid
>>>>>>>>
>>>>>>>> On 29/08/2019 19:21, Gyula Fóra wrote:
>>>>>>>>> Hi!
>>>>>>>>>
>>>>>>>>> Huuuge +1 from me, this has been an operational pain for years.
>>>>>>>>> This would also introduce a nice and simple way to extend it in the
>>>>>>>>> future if we need.
>>>>>>>>>
>>>>>>>>> Ship it!
>>>>>>>>>
>>>>>>>>> Gyula
>>>>>>>>>
>>>>>>>>> On Thu, Aug 29, 2019 at 5:05 PM Dawid Wysakowicz <dwysakow...@apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> I wanted to propose a new, additional way of configuring execution
>>>>>>>>>> parameters that can currently be set only on objects such as
>>>>>>>>>> ExecutionConfig, CheckpointConfig and StreamExecutionEnvironment.
>>>>>>>>>> This poses problems such as:
>>>>>>>>>>
>>>>>>>>>>     - there is no easy way to configure those from a file
>>>>>>>>>>     - there is no easy way to pass a configuration from layers built
>>>>>>>>>>     on top of StreamExecutionEnvironment (e.g. when we want to
>>>>>>>>>>     configure those options from TableEnvironment)
>>>>>>>>>>     - they are not automatically documented
>>>>>>>>>>
>>>>>>>>>> Note that there are a few concepts from FLIP-54[1] that this FLIP is
>>>>>>>>>> based on.
>>>>>>>>>>
>>>>>>>>>> I would be really grateful to know if you think this would be a
>>>>>>>>>> valuable addition, and for any other feedback.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>>
>>>>>>>>>> Dawid
>>>>>>>>>>
>>>>>>>>>> Wiki page:
>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-59%3A+Enable+execution+configuration+from+Configuration+object
>>>>>>>>>>
>>>>>>>>>> Google doc:
>>>>>>>>>> https://docs.google.com/document/d/1l8jW2NjhwHH1mVPbLvFolnL2vNvf4buUMDZWMfN_hFM/edit?usp=sharing
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-54%3A+Evolve+ConfigOption+and+Configuration
>>
