Regarding to version delimiter, I'm in favor of using "-", not "_". Not only because we are using "-" for config keys in other module, like CoreOptions, DeploymentOptions, but also "-" is easier to type than "_" which doesn't need to press "shift" ;-)
Best, Kurt On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <imj...@gmail.com> wrote: > Hi all, > > Thanks for the feedbacks. > > It seems that we have a conclusion to put the version into the factory > identifier. I'm also fine with this. > If we have this outcome, the interface of Factory#factoryVersion is not > needed anymore, this can simplify the learning cost of new factory. > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > <twal...@apache.org> > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > the meaning of "universal" not easy to understand. > kafka-0.11 => kafka for 0.11 version > kafka-0.10 => kafka for 0.10 version > elasticsearch-6 => elasticsearch for 6.x versions > elasticsearch-7 => elasticsearch for 7.x versions > hbase-1.4 => hbase for 1.4.x versions > jdbc > filesystem > > We use "-" as the version delimiter to make them to be more consistent. > This is not forces, users can also use other delimiters or without > delimiter. > But this can be a guilde in the Javadoc of Factory, to make the connector > ecosystem to be more consistent. > > What do you think? > > ------------------------ > > Regarding "connector.property-version": > > Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are > designed not support to read current properties. > All the current properties are routed to the old factories if they are > using "connector.type". Otherwise, properties are routed to new factories. > > If I understand correctly, the "connector.property-version" is attched > implicitly by system, not manually set by users. > For example, the framework should add "connector.property-version=1" to > properties when processing DDL statement. > I'm fine to add a "connector.property-version=1" when processing DDL > statement, but I think it's also fine if we don't, > because this can be done in the future if need and the default version can > be 1. > > Best, > Jark > > On Tue, 31 Mar 2020 at 00:36, Jark Wu <imj...@gmail.com> wrote: > > > Hi all, > > > > Thanks for the feedbacks. > > > > It seems that we have a conclusion to put the version into the factory > > identifier. I'm also fine with this. > > If we have this outcome, the interface of Factory#factoryVersion is not > > needed anymore, this can simplify the learning cost of new factory. > > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther > > <twal...@apache.org> > > > > Btw, I would like to use "_" instead of "-" as the version delimiter, > > because "-" looks like minus and may confuse users, e.g. > "elasticsearch-6". > > This is not forced, but should be a guilde in the Javadoc of Factory. > > I propose to use the following identifiers for existing connectors, > > > > kafka => kafka for 0.11+ versions, we don't suffix "-universal", because > > the meaning of "universal" not easy to understand. > > kafka-0.11 => kafka for 0.11 version > > kafka-0.10 => kafka for 0.10 version > > elasticsearch-6 => elasticsearch for 6.x versions > > elasticsearch-7 => elasticsearch for 7.x versions > > hbase-1.4 => hbase for 1.4.x versions > > jdbc > > filesystem > > > > We use "-" as the version delimiter to make them to be more consistent. > > This is not forces, users can also use other delimiters or without > > delimiter. > > But this can be a guilde in the Javadoc of Factory, to make the connector > > ecosystem to be more consistent. > > > > What do you think? > > > > ------------------------ > > > > Regarding "connector.property-version": > > > > Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are > > designed not support to read current properties. > > All the current properties are routed to the old factories if they are > > using "connector.type". Otherwise, properties are routed to new > factories. > > > > If I understand correctly, the "connector.property-version" is attched > > implicitly by system, not manually set by users. > > For example, the framework should add "connector.property-version=1" to > > properties when processing DDL statement. > > I'm fine to add a "connector.property-version=1" when processing DDL > > statement, but I think it's also fine if we don't, > > because this can be done in the future if need and the default version > can > > be 1. > > > > Best, > > Jark > > > > > > > > > > > > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <dwysakow...@apache.org> > > wrote: > > > >> Hi all, > >> > >> I like the overall design of the FLIP. > >> > >> As for the withstanding concerns. I kind of like the approach to put the > >> version into the factory identifier. I think it's the cleanest way to > >> say that this version actually applies to the connector itself and not > >> to the system it connects to. BTW, I think the outcome of this > >> discussion will affect interfaces described in FLIP-95. If we put the > >> version into the functionIdentifier, do we need Factory#factoryVersion? > >> > >> I also think it does make sense to have a versioning for the properties > >> as well. Are we able to read all the current properties with the new > >> factories? I think we could use the "connector.property-version" to > >> alternate between different Factory interfaces to support the old set of > >> properties. Otherwise the new factories need to understand both set of > >> properties, don't they? > >> > >> Best, > >> > >> Dawid > >> > >> On 30/03/2020 17:07, Timo Walther wrote: > >> > Hi Jark, > >> > > >> > thanks for the FLIP. I don't have a strong opinion on > >> > DynamicTableFactory#factoryVersion() for distiguishing connector > >> > versions. We can also include it in the factory identifier itself. For > >> > Kafka, I would then just use `kafka` because the naming "universal" > >> > was just a helper solution at that time. > >> > > >> > What we need is a good guide for how to design the options. We should > >> > include that in the JavaDoc of factory interface itself, once it is > >> > in-place. I like the difference between source and sink in properties. > >> > > >> > Regarding "connector.property-version": > >> > > >> > It was meant for backwards compatibility along the dimension of > >> > "property design" compared to "connector version". However, since the > >> > connector is now responsible for parsing all properties, it is not as > >> > necessary as it was before. > >> > > >> > However, a change of the property design as it is done in FLIP-107 > >> > becomes more difficult without a property version and versioning is > >> > always a good idea in API world. > >> > > >> > Regards, > >> > Timo > >> > > >> > > >> > On 30.03.20 16:38, Benchao Li wrote: > >> >> Hi Jark, > >> >> > >> >> Thanks for starting the discussion. The FLIP LGTM generally. > >> >> > >> >> Regarding connector version VS put version into connector's name, > >> >> I favor the latter personally, using one property to locate a > >> >> connector can make the error log more precise. > >> >> From the users' side, using one property to match a connector will > >> >> be easier. Especially we have many connectors, > >> >> and some of the need version property required, and some of them not. > >> >> > >> >> Regarding Jingsong's suggestion, > >> >> IMO, it's a very good complement to the FLIP. Distinguishing > >> >> properties for source and sink can be very useful, and > >> >> also this will make the connector property more precise. > >> >> We are also sick of this for now, we cannot know whether a DDL is a > >> >> source or sink unless we look through all queries where > >> >> the table is used. > >> >> Even more, some of the required properties are only required for > >> >> source, bug we cannot leave it blank for sink, and vice versa. > >> >> I think we can also add a type for dimension tables except source and > >> >> sink. > >> >> > >> >> Kurt Young <ykt...@gmail.com <mailto:ykt...@gmail.com>> 于2020年3月30日 > >> >> 周一 下午8:16写道: > >> >> > >> >> > It's not possible for the framework to throw such exception. > >> >> Because the > >> >> framework doesn't know what versions do the connector support. > >> >> > >> >> Not really, we can list all valid connectors framework could > >> >> found. E.g. > >> >> user mistyped 'kafka-0.x', the error message will looks like: > >> >> > >> >> we don't have any connector named "kafka-0.x", but we have: > >> >> FileSystem > >> >> Kafka-0.10 > >> >> Kafka-0.11 > >> >> ElasticSearch6 > >> >> ElasticSearch7 > >> >> > >> >> Best, > >> >> Kurt > >> >> > >> >> > >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <imj...@gmail.com > >> >> <mailto:imj...@gmail.com>> wrote: > >> >> > >> >> > Hi Kurt, > >> >> > > >> >> > > 2) Lists all available connectors seems also quite > >> >> straightforward, e.g > >> >> > user provided a wrong "kafka-0.8", we tell user we have > >> >> candidates of > >> >> > "kakfa-0.11", "kafka-universal" > >> >> > It's not possible for the framework to throw such exception. > >> >> Because the > >> >> > framework doesn't know what versions do the connector support. > >> >> All the > >> >> > version information is a blackbox in the identifier. But with > >> >> > `Factory#factoryVersion()` interface, we can know all the > >> >> supported > >> >> > versions. > >> >> > > >> >> > > 3) I don't think so. We can still treat it as the same > >> >> connector but with > >> >> > different versions. > >> >> > That's true but that's weird. Because from the plain DDL > >> >> definition, they > >> >> > look like different connectors with different "connector" > >> >> value, e.g. > >> >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. > >> >> > > >> >> > > If users don't set any version, we will use > "kafka-universal" > >> >> instead. > >> >> > The behavior is inconsistent IMO. > >> >> > That is a long term vision when there is no kafka clusters > >> >> with <0.11 > >> >> > version. > >> >> > At that point, "universal" is the only supported version in > >> Flink > >> >> and the > >> >> > "version" key can be optional. > >> >> > > >> >> > --------------------------------------------- > >> >> > > >> >> > Hi Jingsong, > >> >> > > >> >> > > "version" vs "kafka.version" > >> >> > I though about it. But if we prefix "kafka" to version, we > >> should > >> >> prefix > >> >> > "kafka" for all other property keys, because they are all > kafka > >> >> specific > >> >> > options. > >> >> > However, that will make the property set verbose, see rejected > >> >> option#2 in > >> >> > the FLIP. > >> >> > > >> >> > > explicitly separate options for source and sink > >> >> > That's a good topic. It's good to have a guideline for the new > >> >> property > >> >> > keys. > >> >> > I'm fine to prefix with a "source"/"sink" for some connector > >> >> keys. > >> >> > Actually, we already do this in some connectors, e.g. jdbc and > >> >> hbase. > >> >> > > >> >> > Best, > >> >> > Jark > >> >> > > >> >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < > >> jingsongl...@gmail.com > >> >> <mailto:jingsongl...@gmail.com>> wrote: > >> >> > > >> >> > > Thanks Jark for the proposal. > >> >> > > > >> >> > > +1 to the general idea. > >> >> > > > >> >> > > For "version", what about "kafka.version"? It is obvious to > >> >> know its > >> >> > > meaning. > >> >> > > > >> >> > > And I'd like to start a new topic: > >> >> > > Should we need to explicitly separate source from sink? > >> >> > > With the development of batch and streaming, more and more > >> >> connectors > >> >> > have > >> >> > > both source and sink. > >> >> > > > >> >> > > So should we set a rule for table properties: > >> >> > > - properties for both source and sink: without prefix, like > >> >> "topic" > >> >> > > - properties for source only: with "source." prefix, like > >> >> > > "source.startup-mode" > >> >> > > - properties for sink only: with "sink." prefix, like > >> >> "sink.partitioner" > >> >> > > > >> >> > > What do you think? > >> >> > > > >> >> > > Best, > >> >> > > Jingsong Lee > >> >> > > > >> >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <imj...@gmail.com > >> >> <mailto:imj...@gmail.com>> wrote: > >> >> > > > >> >> > > > Hi Kurt, > >> >> > > > > >> >> > > > That's good questions. > >> >> > > > > >> >> > > > > the meaning of "version" > >> >> > > > There are two versions in the old design. One is property > >> >> version > >> >> > > > "connector.property-version" which can be used for > backward > >> >> > > compatibility. > >> >> > > > The other one is "connector.version" which defines the > >> >> version of > >> >> > > external > >> >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. > >> >> > > > In this proposal, the "version" is the previous > >> >> "connector.version". > >> >> > The > >> >> > > > ""connector.property-version" is not introduced in new > >> >> design. > >> >> > > > > >> >> > > > > how to keep the old capability which can evolve > connector > >> >> properties > >> >> > > > The "connector.property-version" is an optional key in the > >> >> old design > >> >> > and > >> >> > > > is never bumped up. > >> >> > > > I'm not sure how "connector.property-version" should work > >> >> in the > >> >> > initial > >> >> > > > design. Maybe @Timo Walther <twal...@apache.org > >> >> <mailto:twal...@apache.org>> has more knowledge on > >> >> > > > this. > >> >> > > > But for the new properties, every options should be > >> >> expressed as > >> >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` > >> >> method to > >> >> > easily > >> >> > > > support evolving keys. > >> >> > > > > >> >> > > > > a single keys instead of two, e.g. "kafka-0.11", > >> >> "kafka-universal"? > >> >> > > > There are several benefit to use separate "version" key I > >> can > >> >> see: > >> >> > > > 1) it's more readable to separete them into different > keys, > >> >> because > >> >> > they > >> >> > > > are orthogonal concepts. > >> >> > > > 2) the planner can give all the availble versions in the > >> >> exception > >> >> > > message, > >> >> > > > if user uses a wrong version (this is often reported in > >> >> user ML). > >> >> > > > 3) If we use "kafka-0.11" as connector identifier, we may > >> >> have to > >> >> > write a > >> >> > > > full documentation for each version, because they are > >> >> different > >> >> > > > "connector"? > >> >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually > >> >> the same > >> >> > > connector > >> >> > > > but with different "client jar" version, > >> >> > > > they share all the same supported property keys and > >> >> should reside > >> >> > > > together. > >> >> > > > 4) IMO, the future vision is version-free. At some point > in > >> >> the future, > >> >> > > we > >> >> > > > may don't need users to specify kafka version anymore, and > >> >> make > >> >> > > > "version=universal" as optional or removed in the future. > >> >> This is can > >> >> > be > >> >> > > > done easily if they are separate keys. > >> >> > > > > >> >> > > > Best, > >> >> > > > Jark > >> >> > > > > >> >> > > > > >> >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young < > ykt...@gmail.com > >> >> <mailto:ykt...@gmail.com>> wrote: > >> >> > > > > >> >> > > > > Hi Jark, > >> >> > > > > > >> >> > > > > Thanks for the proposal, I'm +1 to the general idea. > >> >> However I have a > >> >> > > > > question about "version", > >> >> > > > > in the old design, the version seems to be aimed for > >> >> tracking > >> >> > property > >> >> > > > > version, with different > >> >> > > > > version, we could evolve these step by step without > >> >> breaking backward > >> >> > > > > compatibility. But in this > >> >> > > > > design, version is representing external system's > >> >> version, like > >> >> > "0.11" > >> >> > > > for > >> >> > > > > kafka, "6" or "7" for ES. > >> >> > > > > I'm not sure if this is necessary, what's the benefit of > >> >> using two > >> >> > keys > >> >> > > > > instead of one, like "kafka-0.11" > >> >> > > > > or "ES6" directly? And how about the old capability > which > >> >> could let > >> >> > us > >> >> > > > > evolving connector properties? > >> >> > > > > > >> >> > > > > Best, > >> >> > > > > Kurt > >> >> > > > > > >> >> > > > > > >> >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen > >> >> <shenleifight...@gmail.com <mailto:shenleifight...@gmail.com>> > >> >> > > > > wrote: > >> >> > > > > > >> >> > > > > > Hi Jark, > >> >> > > > > > > >> >> > > > > > I am really looking forward to this feature. I think > >> this > >> >> feature > >> >> > > > > > could simplify flink sql code,and at the same time , > >> >> > > > > > it could make the developer more easlier to config the > >> >> flink sql > >> >> > WITH > >> >> > > > > > options. > >> >> > > > > > > >> >> > > > > > Now when I am using flink sql to write flink task , > >> >> sometimes I > >> >> > think > >> >> > > > the > >> >> > > > > > WITH options is too long for user. > >> >> > > > > > For example,I config the kafka source connector > >> >> parameter,for > >> >> > > consumer > >> >> > > > > > group and brokers parameter: > >> >> > > > > > > >> >> > > > > > 'connector.properties.0.key' = 'group.id > >> >> <http://group.id>' > >> >> > > > > > > , 'connector.properties.0.value' = 'xxx' > >> >> > > > > > > , 'connector.properties.1.key' = > >> >> 'bootstrap.servers' > >> >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > I can understand this config , but for the flink fresh > >> >> man,maybe it > >> >> > > > > > is confused for him. > >> >> > > > > > In my thought, I am really looking forward to this > >> >> feature,thank > >> >> > you > >> >> > > to > >> >> > > > > > propose this feature. > >> >> > > > > > > >> >> > > > > > Best wishes, > >> >> > > > > > LakeShen > >> >> > > > > > > >> >> > > > > > > >> >> > > > > > Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com>> > 于 > >> >> 2020年3月30日周一 下午2:02写道: > >> >> > > > > > > >> >> > > > > > > Hi everyone, > >> >> > > > > > > > >> >> > > > > > > I want to start a discussion about further improve > and > >> >> simplify > >> >> > our > >> >> > > > > > current > >> >> > > > > > > connector porperty keys, aka WITH options. > Currently, > >> >> we have a > >> >> > > > > > > 'connector.' prefix for many properties, but they > are > >> >> verbose, > >> >> > and > >> >> > > we > >> >> > > > > > see a > >> >> > > > > > > big inconsistency between the properties when > >> designing > >> >> FLIP-107. > >> >> > > > > > > > >> >> > > > > > > So we propose to remove all the 'connector.' prefix > >> and > >> >> rename > >> >> > > > > > > 'connector.type' to 'connector', 'format.type' to > >> >> 'format'. So a > >> >> > > new > >> >> > > > > > Kafka > >> >> > > > > > > DDL may look like this: > >> >> > > > > > > > >> >> > > > > > > CREATE TABLE kafka_table ( > >> >> > > > > > > ... > >> >> > > > > > > ) WITH ( > >> >> > > > > > > 'connector' = 'kafka', > >> >> > > > > > > 'version' = '0.10', > >> >> > > > > > > 'topic' = 'test-topic', > >> >> > > > > > > 'startup-mode' = 'earliest-offset', > >> >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', > >> >> > > > > > > 'properties.group.id <http://properties.group.id>' > = > >> >> 'testGroup', > >> >> > > > > > > 'format' = 'json', > >> >> > > > > > > 'format.fail-on-missing-field' = 'false' > >> >> > > > > > > ); > >> >> > > > > > > > >> >> > > > > > > The new connector property key set will come > together > >> >> with new > >> >> > > > Factory > >> >> > > > > > > inferface which is proposed in FLIP-95. Old > properties > >> >> are still > >> >> > > > > > compatible > >> >> > > > > > > with their existing implementation. New properties > >> >> are only > >> >> > > available > >> >> > > > > in > >> >> > > > > > > new DynamicTableFactory implementations. > >> >> > > > > > > > >> >> > > > > > > You can access the detailed FLIP here: > >> >> > > > > > > > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > >> >> > >> >> > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory > >> >> > > > > > > > >> >> > > > > > > Best, > >> >> > > > > > > Jark > >> >> > > > > > > > >> >> > > > > > > >> >> > > > > > >> >> > > > > >> >> > > > >> >> > > > >> >> > > -- > >> >> > > Best, Jingsong Lee > >> >> > > > >> >> > > >> >> > >> >> > >> >> > >> >> -- > >> >> > >> >> Benchao Li > >> >> School of Electronics Engineering and Computer Science, Peking > >> >> University > >> >> Tel:+86-15650713730 > >> >> Email:libenc...@gmail.com > >> >> <mailto:libenc...@gmail.com>;libenc...@pku.edu.cn > >> >> <mailto:libenc...@pku.edu.cn> > >> >> > >> > > >> > > >