Regarding to "property-version", even this is brought up by myself, but I'm actually not a fan of this. As a user, I think this property is too much for me to learn, and I probably won't use this anyway. If framework can follow a good procedure to upgrade properties, show some error messages (but still can work) when user used some deprecated properties, and then finally drop them 2-3 versions later. I think this is good enough.
Best, Kurt On Tue, Mar 31, 2020 at 9:07 AM Kurt Young <ykt...@gmail.com> wrote: > Regarding to version delimiter, I'm in favor of using "-", not "_". Not > only because we are using > "-" for config keys in other module, like CoreOptions, DeploymentOptions, > but also "-" is easier > to type than "_" which doesn't need to press "shift" ;-) > > Best, > Kurt > > > On Tue, Mar 31, 2020 at 12:37 AM Jark Wu <imj...@gmail.com> wrote: > >> Hi all, >> >> Thanks for the feedbacks. >> >> It seems that we have a conclusion to put the version into the factory >> identifier. I'm also fine with this. >> If we have this outcome, the interface of Factory#factoryVersion is not >> needed anymore, this can simplify the learning cost of new factory. >> We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> <twal...@apache.org> >> >> kafka => kafka for 0.11+ versions, we don't suffix "-universal", because >> the meaning of "universal" not easy to understand. >> kafka-0.11 => kafka for 0.11 version >> kafka-0.10 => kafka for 0.10 version >> elasticsearch-6 => elasticsearch for 6.x versions >> elasticsearch-7 => elasticsearch for 7.x versions >> hbase-1.4 => hbase for 1.4.x versions >> jdbc >> filesystem >> >> We use "-" as the version delimiter to make them to be more consistent. >> This is not forces, users can also use other delimiters or without >> delimiter. >> But this can be a guilde in the Javadoc of Factory, to make the connector >> ecosystem to be more consistent. >> >> What do you think? >> >> ------------------------ >> >> Regarding "connector.property-version": >> >> Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are >> designed not support to read current properties. >> All the current properties are routed to the old factories if they are >> using "connector.type". Otherwise, properties are routed to new factories. >> >> If I understand correctly, the "connector.property-version" is attched >> implicitly by system, not manually set by users. >> For example, the framework should add "connector.property-version=1" to >> properties when processing DDL statement. >> I'm fine to add a "connector.property-version=1" when processing DDL >> statement, but I think it's also fine if we don't, >> because this can be done in the future if need and the default version can >> be 1. >> >> Best, >> Jark >> >> On Tue, 31 Mar 2020 at 00:36, Jark Wu <imj...@gmail.com> wrote: >> >> > Hi all, >> > >> > Thanks for the feedbacks. >> > >> > It seems that we have a conclusion to put the version into the factory >> > identifier. I'm also fine with this. >> > If we have this outcome, the interface of Factory#factoryVersion is not >> > needed anymore, this can simplify the learning cost of new factory. >> > We may need to update FLIP-95 and re-vote for it? cc @Timo Walther >> > <twal...@apache.org> >> > >> > Btw, I would like to use "_" instead of "-" as the version delimiter, >> > because "-" looks like minus and may confuse users, e.g. >> "elasticsearch-6". >> > This is not forced, but should be a guilde in the Javadoc of Factory. >> > I propose to use the following identifiers for existing connectors, >> > >> > kafka => kafka for 0.11+ versions, we don't suffix "-universal", >> because >> > the meaning of "universal" not easy to understand. >> > kafka-0.11 => kafka for 0.11 version >> > kafka-0.10 => kafka for 0.10 version >> > elasticsearch-6 => elasticsearch for 6.x versions >> > elasticsearch-7 => elasticsearch for 7.x versions >> > hbase-1.4 => hbase for 1.4.x versions >> > jdbc >> > filesystem >> > >> > We use "-" as the version delimiter to make them to be more consistent. >> > This is not forces, users can also use other delimiters or without >> > delimiter. >> > But this can be a guilde in the Javadoc of Factory, to make the >> connector >> > ecosystem to be more consistent. >> > >> > What do you think? >> > >> > ------------------------ >> > >> > Regarding "connector.property-version": >> > >> > Hi @Dawid Wysakowicz <dwysakow...@apache.org> , the new fatories are >> > designed not support to read current properties. >> > All the current properties are routed to the old factories if they are >> > using "connector.type". Otherwise, properties are routed to new >> factories. >> > >> > If I understand correctly, the "connector.property-version" is attched >> > implicitly by system, not manually set by users. >> > For example, the framework should add "connector.property-version=1" to >> > properties when processing DDL statement. >> > I'm fine to add a "connector.property-version=1" when processing DDL >> > statement, but I think it's also fine if we don't, >> > because this can be done in the future if need and the default version >> can >> > be 1. >> > >> > Best, >> > Jark >> > >> > >> > >> > >> > >> > On Mon, 30 Mar 2020 at 23:21, Dawid Wysakowicz <dwysakow...@apache.org> >> > wrote: >> > >> >> Hi all, >> >> >> >> I like the overall design of the FLIP. >> >> >> >> As for the withstanding concerns. I kind of like the approach to put >> the >> >> version into the factory identifier. I think it's the cleanest way to >> >> say that this version actually applies to the connector itself and not >> >> to the system it connects to. BTW, I think the outcome of this >> >> discussion will affect interfaces described in FLIP-95. If we put the >> >> version into the functionIdentifier, do we need Factory#factoryVersion? >> >> >> >> I also think it does make sense to have a versioning for the properties >> >> as well. Are we able to read all the current properties with the new >> >> factories? I think we could use the "connector.property-version" to >> >> alternate between different Factory interfaces to support the old set >> of >> >> properties. Otherwise the new factories need to understand both set of >> >> properties, don't they? >> >> >> >> Best, >> >> >> >> Dawid >> >> >> >> On 30/03/2020 17:07, Timo Walther wrote: >> >> > Hi Jark, >> >> > >> >> > thanks for the FLIP. I don't have a strong opinion on >> >> > DynamicTableFactory#factoryVersion() for distiguishing connector >> >> > versions. We can also include it in the factory identifier itself. >> For >> >> > Kafka, I would then just use `kafka` because the naming "universal" >> >> > was just a helper solution at that time. >> >> > >> >> > What we need is a good guide for how to design the options. We should >> >> > include that in the JavaDoc of factory interface itself, once it is >> >> > in-place. I like the difference between source and sink in >> properties. >> >> > >> >> > Regarding "connector.property-version": >> >> > >> >> > It was meant for backwards compatibility along the dimension of >> >> > "property design" compared to "connector version". However, since the >> >> > connector is now responsible for parsing all properties, it is not as >> >> > necessary as it was before. >> >> > >> >> > However, a change of the property design as it is done in FLIP-107 >> >> > becomes more difficult without a property version and versioning is >> >> > always a good idea in API world. >> >> > >> >> > Regards, >> >> > Timo >> >> > >> >> > >> >> > On 30.03.20 16:38, Benchao Li wrote: >> >> >> Hi Jark, >> >> >> >> >> >> Thanks for starting the discussion. The FLIP LGTM generally. >> >> >> >> >> >> Regarding connector version VS put version into connector's name, >> >> >> I favor the latter personally, using one property to locate a >> >> >> connector can make the error log more precise. >> >> >> From the users' side, using one property to match a connector will >> >> >> be easier. Especially we have many connectors, >> >> >> and some of the need version property required, and some of them >> not. >> >> >> >> >> >> Regarding Jingsong's suggestion, >> >> >> IMO, it's a very good complement to the FLIP. Distinguishing >> >> >> properties for source and sink can be very useful, and >> >> >> also this will make the connector property more precise. >> >> >> We are also sick of this for now, we cannot know whether a DDL is a >> >> >> source or sink unless we look through all queries where >> >> >> the table is used. >> >> >> Even more, some of the required properties are only required for >> >> >> source, bug we cannot leave it blank for sink, and vice versa. >> >> >> I think we can also add a type for dimension tables except source >> and >> >> >> sink. >> >> >> >> >> >> Kurt Young <ykt...@gmail.com <mailto:ykt...@gmail.com>> 于2020年3月30日 >> >> >> 周一 下午8:16写道: >> >> >> >> >> >> > It's not possible for the framework to throw such exception. >> >> >> Because the >> >> >> framework doesn't know what versions do the connector support. >> >> >> >> >> >> Not really, we can list all valid connectors framework could >> >> >> found. E.g. >> >> >> user mistyped 'kafka-0.x', the error message will looks like: >> >> >> >> >> >> we don't have any connector named "kafka-0.x", but we have: >> >> >> FileSystem >> >> >> Kafka-0.10 >> >> >> Kafka-0.11 >> >> >> ElasticSearch6 >> >> >> ElasticSearch7 >> >> >> >> >> >> Best, >> >> >> Kurt >> >> >> >> >> >> >> >> >> On Mon, Mar 30, 2020 at 5:11 PM Jark Wu <imj...@gmail.com >> >> >> <mailto:imj...@gmail.com>> wrote: >> >> >> >> >> >> > Hi Kurt, >> >> >> > >> >> >> > > 2) Lists all available connectors seems also quite >> >> >> straightforward, e.g >> >> >> > user provided a wrong "kafka-0.8", we tell user we have >> >> >> candidates of >> >> >> > "kakfa-0.11", "kafka-universal" >> >> >> > It's not possible for the framework to throw such exception. >> >> >> Because the >> >> >> > framework doesn't know what versions do the connector >> support. >> >> >> All the >> >> >> > version information is a blackbox in the identifier. But with >> >> >> > `Factory#factoryVersion()` interface, we can know all the >> >> >> supported >> >> >> > versions. >> >> >> > >> >> >> > > 3) I don't think so. We can still treat it as the same >> >> >> connector but with >> >> >> > different versions. >> >> >> > That's true but that's weird. Because from the plain DDL >> >> >> definition, they >> >> >> > look like different connectors with different "connector" >> >> >> value, e.g. >> >> >> > 'connector=kafka-0.8', 'connector=kafka-0.10'. >> >> >> > >> >> >> > > If users don't set any version, we will use >> "kafka-universal" >> >> >> instead. >> >> >> > The behavior is inconsistent IMO. >> >> >> > That is a long term vision when there is no kafka clusters >> >> >> with <0.11 >> >> >> > version. >> >> >> > At that point, "universal" is the only supported version in >> >> Flink >> >> >> and the >> >> >> > "version" key can be optional. >> >> >> > >> >> >> > --------------------------------------------- >> >> >> > >> >> >> > Hi Jingsong, >> >> >> > >> >> >> > > "version" vs "kafka.version" >> >> >> > I though about it. But if we prefix "kafka" to version, we >> >> should >> >> >> prefix >> >> >> > "kafka" for all other property keys, because they are all >> kafka >> >> >> specific >> >> >> > options. >> >> >> > However, that will make the property set verbose, see >> rejected >> >> >> option#2 in >> >> >> > the FLIP. >> >> >> > >> >> >> > > explicitly separate options for source and sink >> >> >> > That's a good topic. It's good to have a guideline for the >> new >> >> >> property >> >> >> > keys. >> >> >> > I'm fine to prefix with a "source"/"sink" for some connector >> >> >> keys. >> >> >> > Actually, we already do this in some connectors, e.g. jdbc >> and >> >> >> hbase. >> >> >> > >> >> >> > Best, >> >> >> > Jark >> >> >> > >> >> >> > On Mon, 30 Mar 2020 at 16:36, Jingsong Li < >> >> jingsongl...@gmail.com >> >> >> <mailto:jingsongl...@gmail.com>> wrote: >> >> >> > >> >> >> > > Thanks Jark for the proposal. >> >> >> > > >> >> >> > > +1 to the general idea. >> >> >> > > >> >> >> > > For "version", what about "kafka.version"? It is obvious to >> >> >> know its >> >> >> > > meaning. >> >> >> > > >> >> >> > > And I'd like to start a new topic: >> >> >> > > Should we need to explicitly separate source from sink? >> >> >> > > With the development of batch and streaming, more and more >> >> >> connectors >> >> >> > have >> >> >> > > both source and sink. >> >> >> > > >> >> >> > > So should we set a rule for table properties: >> >> >> > > - properties for both source and sink: without prefix, like >> >> >> "topic" >> >> >> > > - properties for source only: with "source." prefix, like >> >> >> > > "source.startup-mode" >> >> >> > > - properties for sink only: with "sink." prefix, like >> >> >> "sink.partitioner" >> >> >> > > >> >> >> > > What do you think? >> >> >> > > >> >> >> > > Best, >> >> >> > > Jingsong Lee >> >> >> > > >> >> >> > > On Mon, Mar 30, 2020 at 3:56 PM Jark Wu <imj...@gmail.com >> >> >> <mailto:imj...@gmail.com>> wrote: >> >> >> > > >> >> >> > > > Hi Kurt, >> >> >> > > > >> >> >> > > > That's good questions. >> >> >> > > > >> >> >> > > > > the meaning of "version" >> >> >> > > > There are two versions in the old design. One is property >> >> >> version >> >> >> > > > "connector.property-version" which can be used for >> backward >> >> >> > > compatibility. >> >> >> > > > The other one is "connector.version" which defines the >> >> >> version of >> >> >> > > external >> >> >> > > > system, e.g. 0.11" for kafka, "6" or "7" for ES. >> >> >> > > > In this proposal, the "version" is the previous >> >> >> "connector.version". >> >> >> > The >> >> >> > > > ""connector.property-version" is not introduced in new >> >> >> design. >> >> >> > > > >> >> >> > > > > how to keep the old capability which can evolve >> connector >> >> >> properties >> >> >> > > > The "connector.property-version" is an optional key in >> the >> >> >> old design >> >> >> > and >> >> >> > > > is never bumped up. >> >> >> > > > I'm not sure how "connector.property-version" should work >> >> >> in the >> >> >> > initial >> >> >> > > > design. Maybe @Timo Walther <twal...@apache.org >> >> >> <mailto:twal...@apache.org>> has more knowledge on >> >> >> > > > this. >> >> >> > > > But for the new properties, every options should be >> >> >> expressed as >> >> >> > > > `ConfigOption` which provides `withDeprecatedKeys(...)` >> >> >> method to >> >> >> > easily >> >> >> > > > support evolving keys. >> >> >> > > > >> >> >> > > > > a single keys instead of two, e.g. "kafka-0.11", >> >> >> "kafka-universal"? >> >> >> > > > There are several benefit to use separate "version" key I >> >> can >> >> >> see: >> >> >> > > > 1) it's more readable to separete them into different >> keys, >> >> >> because >> >> >> > they >> >> >> > > > are orthogonal concepts. >> >> >> > > > 2) the planner can give all the availble versions in the >> >> >> exception >> >> >> > > message, >> >> >> > > > if user uses a wrong version (this is often reported in >> >> >> user ML). >> >> >> > > > 3) If we use "kafka-0.11" as connector identifier, we may >> >> >> have to >> >> >> > write a >> >> >> > > > full documentation for each version, because they are >> >> >> different >> >> >> > > > "connector"? >> >> >> > > > IMO, for 0.11, 0.11, etc... kafka, they are actually >> >> >> the same >> >> >> > > connector >> >> >> > > > but with different "client jar" version, >> >> >> > > > they share all the same supported property keys and >> >> >> should reside >> >> >> > > > together. >> >> >> > > > 4) IMO, the future vision is version-free. At some point >> in >> >> >> the future, >> >> >> > > we >> >> >> > > > may don't need users to specify kafka version anymore, >> and >> >> >> make >> >> >> > > > "version=universal" as optional or removed in the future. >> >> >> This is can >> >> >> > be >> >> >> > > > done easily if they are separate keys. >> >> >> > > > >> >> >> > > > Best, >> >> >> > > > Jark >> >> >> > > > >> >> >> > > > >> >> >> > > > On Mon, 30 Mar 2020 at 14:45, Kurt Young < >> ykt...@gmail.com >> >> >> <mailto:ykt...@gmail.com>> wrote: >> >> >> > > > >> >> >> > > > > Hi Jark, >> >> >> > > > > >> >> >> > > > > Thanks for the proposal, I'm +1 to the general idea. >> >> >> However I have a >> >> >> > > > > question about "version", >> >> >> > > > > in the old design, the version seems to be aimed for >> >> >> tracking >> >> >> > property >> >> >> > > > > version, with different >> >> >> > > > > version, we could evolve these step by step without >> >> >> breaking backward >> >> >> > > > > compatibility. But in this >> >> >> > > > > design, version is representing external system's >> >> >> version, like >> >> >> > "0.11" >> >> >> > > > for >> >> >> > > > > kafka, "6" or "7" for ES. >> >> >> > > > > I'm not sure if this is necessary, what's the benefit >> of >> >> >> using two >> >> >> > keys >> >> >> > > > > instead of one, like "kafka-0.11" >> >> >> > > > > or "ES6" directly? And how about the old capability >> which >> >> >> could let >> >> >> > us >> >> >> > > > > evolving connector properties? >> >> >> > > > > >> >> >> > > > > Best, >> >> >> > > > > Kurt >> >> >> > > > > >> >> >> > > > > >> >> >> > > > > On Mon, Mar 30, 2020 at 2:36 PM LakeShen >> >> >> <shenleifight...@gmail.com <mailto:shenleifight...@gmail.com>> >> >> >> > > > > wrote: >> >> >> > > > > >> >> >> > > > > > Hi Jark, >> >> >> > > > > > >> >> >> > > > > > I am really looking forward to this feature. I think >> >> this >> >> >> feature >> >> >> > > > > > could simplify flink sql code,and at the same time , >> >> >> > > > > > it could make the developer more easlier to config >> the >> >> >> flink sql >> >> >> > WITH >> >> >> > > > > > options. >> >> >> > > > > > >> >> >> > > > > > Now when I am using flink sql to write flink task , >> >> >> sometimes I >> >> >> > think >> >> >> > > > the >> >> >> > > > > > WITH options is too long for user. >> >> >> > > > > > For example,I config the kafka source connector >> >> >> parameter,for >> >> >> > > consumer >> >> >> > > > > > group and brokers parameter: >> >> >> > > > > > >> >> >> > > > > > 'connector.properties.0.key' = 'group.id >> >> >> <http://group.id>' >> >> >> > > > > > > , 'connector.properties.0.value' = 'xxx' >> >> >> > > > > > > , 'connector.properties.1.key' = >> >> >> 'bootstrap.servers' >> >> >> > > > > > > , 'connector.properties.1.value' = 'xxxxx' >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > > I can understand this config , but for the flink >> fresh >> >> >> man,maybe it >> >> >> > > > > > is confused for him. >> >> >> > > > > > In my thought, I am really looking forward to this >> >> >> feature,thank >> >> >> > you >> >> >> > > to >> >> >> > > > > > propose this feature. >> >> >> > > > > > >> >> >> > > > > > Best wishes, >> >> >> > > > > > LakeShen >> >> >> > > > > > >> >> >> > > > > > >> >> >> > > > > > Jark Wu <imj...@gmail.com <mailto:imj...@gmail.com>> >> 于 >> >> >> 2020年3月30日周一 下午2:02写道: >> >> >> > > > > > >> >> >> > > > > > > Hi everyone, >> >> >> > > > > > > >> >> >> > > > > > > I want to start a discussion about further improve >> and >> >> >> simplify >> >> >> > our >> >> >> > > > > > current >> >> >> > > > > > > connector porperty keys, aka WITH options. >> Currently, >> >> >> we have a >> >> >> > > > > > > 'connector.' prefix for many properties, but they >> are >> >> >> verbose, >> >> >> > and >> >> >> > > we >> >> >> > > > > > see a >> >> >> > > > > > > big inconsistency between the properties when >> >> designing >> >> >> FLIP-107. >> >> >> > > > > > > >> >> >> > > > > > > So we propose to remove all the 'connector.' prefix >> >> and >> >> >> rename >> >> >> > > > > > > 'connector.type' to 'connector', 'format.type' to >> >> >> 'format'. So a >> >> >> > > new >> >> >> > > > > > Kafka >> >> >> > > > > > > DDL may look like this: >> >> >> > > > > > > >> >> >> > > > > > > CREATE TABLE kafka_table ( >> >> >> > > > > > > ... >> >> >> > > > > > > ) WITH ( >> >> >> > > > > > > 'connector' = 'kafka', >> >> >> > > > > > > 'version' = '0.10', >> >> >> > > > > > > 'topic' = 'test-topic', >> >> >> > > > > > > 'startup-mode' = 'earliest-offset', >> >> >> > > > > > > 'properties.bootstrap.servers' = 'localhost:9092', >> >> >> > > > > > > 'properties.group.id <http://properties.group.id>' >> = >> >> >> 'testGroup', >> >> >> > > > > > > 'format' = 'json', >> >> >> > > > > > > 'format.fail-on-missing-field' = 'false' >> >> >> > > > > > > ); >> >> >> > > > > > > >> >> >> > > > > > > The new connector property key set will come >> together >> >> >> with new >> >> >> > > > Factory >> >> >> > > > > > > inferface which is proposed in FLIP-95. Old >> properties >> >> >> are still >> >> >> > > > > > compatible >> >> >> > > > > > > with their existing implementation. New properties >> >> >> are only >> >> >> > > available >> >> >> > > > > in >> >> >> > > > > > > new DynamicTableFactory implementations. >> >> >> > > > > > > >> >> >> > > > > > > You can access the detailed FLIP here: >> >> >> > > > > > > >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > >> >> >> > > > >> >> >> > > >> >> >> > >> >> >> >> >> >> >> >> >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-122%3A+New+Connector+Property+Keys+for+New+Factory >> >> >> > > > > > > >> >> >> > > > > > > Best, >> >> >> > > > > > > Jark >> >> >> > > > > > > >> >> >> > > > > > >> >> >> > > > > >> >> >> > > > >> >> >> > > >> >> >> > > >> >> >> > > -- >> >> >> > > Best, Jingsong Lee >> >> >> > > >> >> >> > >> >> >> >> >> >> >> >> >> >> >> >> -- >> >> >> >> >> >> Benchao Li >> >> >> School of Electronics Engineering and Computer Science, Peking >> >> >> University >> >> >> Tel:+86-15650713730 >> >> >> Email:libenc...@gmail.com >> >> >> <mailto:libenc...@gmail.com>;libenc...@pku.edu.cn >> >> >> <mailto:libenc...@pku.edu.cn> >> >> >> >> >> > >> >> >> > >> >