Hi Dawid,

I just want to mention one of your response,

> What you described with
> 'format' = 'csv',
> 'csv.allow-comments' = 'true',
> 'csv.ignore-parse-errors' = 'true'
> would not work though as the `format` prefix is mandatory in the sources
as only the properties with format
>  will be passed to the format factory in majority of cases. We already
have some implicit contracts.

IIUC, in FLIP-95 and FLIP-122, the property key style are totally decided
by connectors, not the framework.
So I custom connector can define above properties, and extract the value of
'format', i.e. 'csv', to find the format factory.
And extract the properties with `csv.` prefix and remove the prefix, and
pass the properties (e.g. 'allow-comments' = 'true')
into the format factory to create format.

So there is no a strict guarantee to have a "nested JSON style" properties.
Users can still develop a custom connector with this
un-hierarchy properties and works well.

'format' = 'json',
'format.fail-on-missing-field' = 'false'

Best,
Jark


On Thu, 30 Apr 2020 at 14:29, Dawid Wysakowicz <dwysakow...@apache.org>
wrote:

> Hi all,
>
> I'd like to start with a comment that I am ok with the current state of
> the FLIP-122 if there is a strong preference for it. Nevertheless I still
> like the idea of adding `type` to the `format` to have it as `format.type`
> = `json`.
>
> I wanted to clarify a few things though:
>
> @Jingsong As far as I see it most of the users copy/paste the properties
> from the documentation to the SQL, so I don't think additional four
> characters are too cumbersome. Plus if you force the additional suffix onto
> all the options of a format you introduce way more boilerplate than if we
> added the `type/kind/name`
>
> @Kurt I agree that we cannot force it, but I think it is more of a
> question to set standards/implicit contracts on the properties. What you
> described with
> 'format' = 'csv',
> 'csv.allow-comments' = 'true',
> 'csv.ignore-parse-errors' = 'true'
>
> would not work though as the `format` prefix is mandatory in the sources
> as only the properties with format will be passed to the format factory in
> majority of cases. We already have some implicit contracts.
>
> @Forward I did not necessarily get the example. Aren't json and bson two
> separate formats? Do you mean you can have those two at the same time? Why
> do you need to differentiate the options for each? The way I see it is:
>
> ‘format(.name)' = 'json',
> ‘format.fail-on-missing-field' = 'false'
>
> or
>
> ‘format(.name)' = 'bson',
> ‘format.fail-on-missing-field' = 'false'
>
> @Benchao I'd be fine with any of name, kind, type(this we already had in
> the past)
>
> Best,
> Dawid
>
> On 30/04/2020 04:17, Forward Xu wrote:
>
> Here I have a little doubt. At present, our json only supports the
> conventional json format. If we need to implement json with bson, json with
> avro, etc., how should we express it?
> Do you need like the following:
>
> ‘format.name' = 'json',
>
> ‘format.json.fail-on-missing-field' = 'false'
>
>
> ‘format.name' = 'bson',
>
> ‘format.bson.fail-on-missing-field' = ‘false'
>
>
> Best,
>
> Forward
>
> Benchao Li <libenc...@gmail.com> <libenc...@gmail.com> 于2020年4月30日周四 上午9:58写道:
>
>
> Thanks Timo for staring the discussion.
>
> Generally I like the idea to keep the config align with a standard like
> json/yaml.
>
> From the user's perspective, I don't use table configs from a config file
> like yaml or json for now,
> And it's ok to change it to yaml like style. Actually we didn't know that
> this could be a yaml like
> configuration hierarchy. If it has a hierarchy, we maybe consider that in
> the future to load the
> config from a yaml/json file.
>
> Regarding the name,
> 'format.kind' looks fine to me. However there is another name from the top
> of my head:
> 'format.name', WDYT?
>
> Dawid Wysakowicz <dwysakow...@apache.org> <dwysakow...@apache.org> 
> 于2020年4月29日周三 下午11:56写道:
>
>
> Hi all,
>
> I also wanted to share my opinion.
>
> When talking about a ConfigOption hierarchy we use for configuring Flink
> cluster I would be a strong advocate for keeping a yaml/hocon/json/...
> compatible style. Those options are primarily read from a file and thus
> should at least try to follow common practices for nested formats if we
> ever decide to switch to one.
>
> Here the question is about the properties we use in SQL statements. The
> origin/destination of these usually will be external catalog, usually in
>
> a
>
> flattened(key/value) representation so I agree it is not as important as
>
> in
>
> the aforementioned case. Nevertheless having a yaml based catalog or
>
> being
>
> able to have e.g. yaml based snapshots of a catalog in my opinion is
> appealing. At the same time cost of being able to have a nice
> yaml/hocon/json representation is just adding a single suffix to a
> single(at most 2 key + value) property. The question is between `format`
>
> =
>
> `json` vs `format.kind` = `json`. That said I'd be slighty in favor of
> doing it.
>
> Just to have a full picture. Both cases can be represented in yaml, but
> the difference is significant:
> format: 'json'
> format.option: 'value'
>
> vs
> format:
>     kind: 'json'
>
>     option: 'value'
>
> Best,
> Dawid
>
> On 29/04/2020 17:13, Flavio Pompermaier wrote:
>
> Personally I don't have any preference here.  Compliance wih standard
>
> YAML
>
> parser is probably more important
>
> On Wed, Apr 29, 2020 at 5:10 PM Jark Wu <imj...@gmail.com> <imj...@gmail.com> 
> wrote:
>
>
> From a user's perspective, I prefer the shorter one "format=json",
>
> because
>
> it's more concise and straightforward. The "kind" is redundant for
>
> users.
>
> Is there a real case requires to represent the configuration in JSON
> style?
> As far as I can see, I don't see such requirement, and everything works
> fine by now.
>
> So I'm in favor of "format=json". But if the community insist to follow
> code style on this, I'm also fine with the longer one.
>
> Btw, I also CC user mailing list to listen more user's feedback.
>
> Because I
>
> think this is relative to usability.
>
> Best,
> Jark
>
> On Wed, 29 Apr 2020 at 22:09, Chesnay Schepler <ches...@apache.org> 
> <ches...@apache.org>
> wrote:
>
>
>  > Therefore, should we advocate instead:
>  >
>  > 'format.kind' = 'json',
>  > 'format.fail-on-missing-field' = 'false'
>
> Yes. That's pretty much it.
>
> This is reasonable important to nail down as with such violations I
> believe we could not actually switch to a standard YAML parser.
>
> On 29/04/2020 16:05, Timo Walther wrote:
>
> Hi everyone,
>
> discussions around ConfigOption seem to be very popular recently.
>
> So I
>
> would also like to get some opinions on a different topic.
>
> How do we represent hierarchies in ConfigOption? In FLIP-122, we
> agreed on the following DDL syntax:
>
> CREATE TABLE fs_table (
>  ...
> ) WITH (
>  'connector' = 'filesystem',
>  'path' = 'file:///path/to/whatever',
>  'format' = 'csv',
>  'format.allow-comments' = 'true',
>  'format.ignore-parse-errors' = 'true'
> );
>
> Of course this is slightly different from regular Flink core
> configuration but a connector still needs to be configured based on
> these options.
>
> However, I think this FLIP violates our code style guidelines
>
> because
>
> 'format' = 'json',
> 'format.fail-on-missing-field' = 'false'
>
> is an invalid hierarchy. `format` cannot be a string and a top-level
> object at the same time.
>
> We have similar problems in our runtime configuration:
>
> state.backend=
> state.backend.incremental=
> restart-strategy=
> restart-strategy.fixed-delay.delay=
> high-availability=
> high-availability.cluster-id=
>
> The code style guide states "Think of the configuration as nested
> objects (JSON style)". So such hierarchies cannot be represented in
>
> a
>
> nested JSON style.
>
> Therefore, should we advocate instead:
>
> 'format.kind' = 'json',
> 'format.fail-on-missing-field' = 'false'
>
> What do you think?
>
> Thanks,
> Timo
>
> [1]
>
>
> https://flink.apache.org/contributing/code-style-and-quality-components.html#configuration-changes
>
> --
>
> Benchao Li
> School of Electronics Engineering and Computer Science, Peking 
> UniversityTel:+86-15650713730
> Email: libenc...@gmail.com; libenc...@pku.edu.cn
>
>

Reply via email to