Thanks Jark for the feedback ~ I actually have a discussion offline with Timo and we think the black-list options has implicit rick with the growing new table options, a black-list there means all the new introduced options are default to be configurable dynamically, if the user forget to add it into the black-list, that would be a risk, what do you think about this @Timo ?
Jark Wu <imj...@gmail.com> 于2020年3月26日周四 下午5:29写道: > Hi Danny, > > Regarding to `supportedHintOptions()` interface, I suggest to use the > inverted version, `unsupportedHintOptions()`. > Because I think the disallowed list is much smaller. > In addition, it's hard to list all the properties under > `connector.properties.*`. > But we know `connector.properties.bootstrap.servers` and > `connector.properties.zookeeper.connect` are the only security options. > > Best, > Jark > > On Thu, 26 Mar 2020 at 16:47, Kurt Young <ykt...@gmail.com> wrote: > > > Hi Danny, > > > > Thanks for the updates. I have 2 comments regarding to latest document: > > > > 1) I think we also need `*supportedHintOptions*` for > > `*TableFormatFactory*` > > 2) IMO "dynamic-table-options.enabled" should belong to ` > > *OptimizerConfigOptions*` > > > > Best, > > Kurt > > > > > > On Thu, Mar 26, 2020 at 4:40 PM Timo Walther <twal...@apache.org> wrote: > > > > > Thanks for the update Danny. +1 for this proposal. > > > > > > Regards, > > > Timo > > > > > > On 26.03.20 04:51, Danny Chan wrote: > > > > Thanks everyone who engaged in this discussion ~ > > > > > > > > Our goal is "Supports Dynamic Table Options for Flink SQL". After an > > > > offline discussion with Kurt, Timo and Dawid, we have made the final > > > > conclusion, here is the summary: > > > > > > > > > > > > - Use comment style syntax to specify the dynamic table options: > > "/*+ > > > > *OPTIONS*(k1='v1', k2='v2') */" > > > > - Have constraint on the options keys: the options that may bring > > in > > > > security problems should not be allowed, i.e. Kafka connector > > > zookeeper > > > > endpoint URL and topic name > > > > - Use white-list to control the allowed options for each > connector, > > > > which is more safe for future extention > > > > - We allow to enable/disable this feature globally > > > > - Implement based on the current code base first, and when > FLIP-95 > > is > > > > checked in, implement this feature based on new interface > > > > > > > > Any suggestions are appreciated ~ > > > > > > > > [1] > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL > > > > > > > > Best, > > > > Danny Chan > > > > > > > > Jark Wu <imj...@gmail.com> 于2020年3月18日周三 下午10:38写道: > > > > > > > >> Hi everyone, > > > >> > > > >> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid > > it > > > >> doesn't solve the problems but increases some development and > learning > > > >> burdens. > > > >> > > > >> # increase development and learning burden > > > >> > > > >> According to the discussion so far, we want to support overriding a > > > subset > > > >> of options in hints which doesn't affect semantics. > > > >> With the `supportedHintOptions`, it's up to the connector developers > > to > > > >> decide which options will not affect semantics, and to be hint > > options. > > > >> However, the question is how to distinguish whether an option will > > > *affect > > > >> semantics*? What happens if an option will affect semantics but > > > provided as > > > >> hint options? > > > >> From my point of view, it's not easy to distinguish. For example, > the > > > >> "format.ignore-parse-error" can be a very useful dynamic option but > > that > > > >> will affect semantic, because the result is different (null vs > > > exception). > > > >> Another example, the "connector.lookup.cache.*" options are also > very > > > >> useful to tune jobs, however, it will also affect the job results. I > > can > > > >> come up many more useful options but may affect semantics. > > > >> > > > >> I can see that the community will under endless discussion around > "can > > > this > > > >> option to be a hint option?", "wether this option will affect > > > semantics?". > > > >> You can also find that we already have different opinions on > > > >> "ignore-parse-error". Those discussion is a waste of time! That's > not > > > what > > > >> users want! > > > >> The problem is user need this, this, this options and HOW to expose > > > them? > > > >> We should focus on that. > > > >> > > > >> Then there could be two endings in the future: > > > >> 1) compromise on the usability, we drop the rule that hints don't > > affect > > > >> semantics, allow all the useful options in the hints list. > > > >> 2) stick on the rule, users will find this is a stumbling feature > > which > > > >> doesn't solve their problems. > > > >> And they will be surprised why this option can't be set, but > the > > > other > > > >> could. *semantic* is hard to be understood by users. > > > >> > > > >> # doesn't solve the problems > > > >> > > > >> I think the purpose of this FLIP is to allow users to quickly > override > > > some > > > >> connectors' properties to tune their jobs. > > > >> However, `supportedHintOptions` is off track. It only allows a > subset > > > >> options and for the users it's not *clear* which subset is allowed. > > > >> > > > >> Besides, I'm not sure `supportedHintOptions` can work well for all > > > cases. > > > >> How could you support kafka properties (`connector.properties.*`) as > > > hint > > > >> options? Some kafka properties may affect semantics > > (bootstrap.servers), > > > >> some may not (max.poll.records). Besides, I think it's not possible > to > > > list > > > >> all the possible kafka properties [1]. > > > >> > > > >> In summary, IMO, `supportedHintOptions` > > > >> (1) it increase the complexity to develop a connector > > > >> (2) it confuses users which options can be used in hint, which are > > not, > > > >> they have to check the docs again and again. > > > >> (3) it doesn't solve the problems which we want to solve by this > FLIP. > > > >> > > > >> I think we should avoid introducing some partial solutions. > Otherwise, > > > we > > > >> will be stuck in a loop that introduce new API -> deprecate API -> > > > >> introduce new API.... > > > >> > > > >> I personally in favor of an explicit WITH syntax after the table as > a > > > part > > > >> of the query which is mentioned by Kurt before, e.g. SELECT * from T > > > >> WITH('key' = 'value') . > > > >> It allows users to dynamically set options which can affect > semantics. > > > It > > > >> will be very flexible to solve users' problems so far. > > > >> > > > >> Best, > > > >> Jark > > > >> > > > >> [1]: https://kafka.apache.org/documentation/#consumerconfigs > > > >> > > > >> On Wed, 18 Mar 2020 at 21:44, Danny Chan <yuzhao....@gmail.com> > > wrote: > > > >> > > > >>> My POC is here for the hints options merge [1]. > > > >>> > > > >>> Personally, I have no strong objections for splitting hints with > the > > > >>> CatalogTable, the only cons is a more complex implementation but > the > > > >>> concept is more clear, and I have updated the WIKI. > > > >>> > > > >>> I think it would be nice if we can support the format “ignore-parse > > > >> error” > > > >>> option key, the CSV source already has a key [2] and we can use > that > > in > > > >> the > > > >>> supportedHIntOptions, for the common CSV and JSON formats, we cal > > also > > > >> give > > > >>> a support. This is the only kind of key in formats that “do not > > change > > > >> the > > > >>> semantics” (somehow), what do you think about this ~ > > > >>> > > > >>> [1] > > > >>> > > > >> > > > > > > https://github.com/danny0405/flink/commit/5d925fa16c3c553423c4b7d93001521b8e6e6bee#diff-6e569a6dd124fd2091c18e2790fb49c5 > > > >>> [2] > > > >>> > > > >> > > > > > > https://github.com/apache/flink/blob/b83060dff6d403b6994b6646b3f29a374f599530/flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/table/sources/CsvTableSourceFactoryBase.java#L92 > > > >>> > > > >>> Best, > > > >>> Danny Chan > > > >>> 在 2020年3月18日 +0800 PM9:10,Timo Walther <twal...@apache.org>,写道: > > > >>>> Hi everyone, > > > >>>> > > > >>>> +1 to Kurt's suggestion. Let's just have it in source and sink > > > >> factories > > > >>>> for now. We can still move this method up in the future. > Currently, > > I > > > >>>> don't see a need for catalogs or formats. Because how would you > > target > > > >> a > > > >>>> format in the query? > > > >>>> > > > >>>> @Danny: Can you send a link to your PoC? I'm very skeptical about > > > >>>> creating a new CatalogTable in planner. Actually CatalogTable > should > > > be > > > >>>> immutable between Catalog and Factory. Because a catalog can > return > > > its > > > >>>> own factory and fully control the instantiation. Depending on the > > > >>>> implementation, that means it can be possible that the catalog has > > > >>>> encoded more information in a concrete subclass implementing the > > > >>>> interface. I vote for separating the concerns of catalog > information > > > >> and > > > >>>> hints in the factory explicitly. > > > >>>> > > > >>>> Regards, > > > >>>> Timo > > > >>>> > > > >>>> > > > >>>> On 18.03.20 05:41, Jingsong Li wrote: > > > >>>>> Hi, > > > >>>>> > > > >>>>> I am thinking we can provide hints to *table* related instances. > > > >>>>> - TableFormatFactory: of cause we need hints support, there are > > many > > > >>> format > > > >>>>> options in DDL too. > > > >>>>> - catalog and module: I don't know, maybe in future we can > provide > > > >> some > > > >>>>> hints for them. > > > >>>>> > > > >>>>> Best, > > > >>>>> Jingsong Lee > > > >>>>> > > > >>>>> On Wed, Mar 18, 2020 at 12:28 PM Danny Chan < > yuzhao....@gmail.com> > > > >>> wrote: > > > >>>>> > > > >>>>>> Yes, I think we should move the `supportedHintOptions` from > > > >>> TableFactory > > > >>>>>> to TableSourceFactory, and we also need to add the interface to > > > >>>>>> TableSinkFactory though because sink target table may also have > > > >> hints > > > >>>>>> attached. > > > >>>>>> > > > >>>>>> Best, > > > >>>>>> Danny Chan > > > >>>>>> 在 2020年3月18日 +0800 AM11:08,Kurt Young <ykt...@gmail.com>,写道: > > > >>>>>>> Have one question for adding `supportedHintOptions` method to > > > >>>>>>> `TableFactory`. It seems > > > >>>>>>> `TableFactory` is a base factory interface for all *table > module* > > > >>> related > > > >>>>>>> instances, such as > > > >>>>>>> catalog, module, format and so on. It's not created only for > > > >>> *table*. Is > > > >>>>>> it > > > >>>>>>> possible to move it > > > >>>>>>> to `TableSourceFactory`? > > > >>>>>>> > > > >>>>>>> Best, > > > >>>>>>> Kurt > > > >>>>>>> > > > >>>>>>> > > > >>>>>>> On Wed, Mar 18, 2020 at 10:59 AM Danny Chan < > > > >> yuzhao....@gmail.com> > > > >>>>>> wrote: > > > >>>>>>> > > > >>>>>>>> Thanks Timo ~ > > > >>>>>>>> > > > >>>>>>>> For the naming itself, I also think the PROPERTIES is not that > > > >>>>>> concise, so > > > >>>>>>>> +1 for OPTIONS (I had thought about that, but there are many > > > >>> codes in > > > >>>>>>>> current Flink called it properties, i.e. the > > > >>> DescriptorProperties, > > > >>>>>>>> #getSupportedProperties), let’s use OPTIONS if this is our new > > > >>>>>> preference. > > > >>>>>>>> > > > >>>>>>>> +1 to `Set<ConfigOption> supportedHintOptions()` because the > > > >>>>>> ConfigOption > > > >>>>>>>> can take more info. AFAIK, Spark also call their table options > > > >>> instead > > > >>>>>> of > > > >>>>>>>> properties. [1] > > > >>>>>>>> > > > >>>>>>>> In my local POC, I did create a new CatalogTable, and it works > > > >>> for > > > >>>>>> current > > > >>>>>>>> connectors well, all the DDL tables would finally yield a > > > >>> CatalogTable > > > >>>>>>>> instance and we can apply the options to that(in the > > > >>> CatalogSourceTable > > > >>>>>>>> when we generating the TableSource), the pros is that we do > not > > > >>> need to > > > >>>>>>>> modify the codes of connectors itself. If we split the options > > > >>> from > > > >>>>>>>> CatalogTable, we may need to add some additional logic in each > > > >>>>>> connector > > > >>>>>>>> factories in order to merge these properties (and the logic > are > > > >>> almost > > > >>>>>> the > > > >>>>>>>> same), what do you think about this? > > > >>>>>>>> > > > >>>>>>>> [1] > > > >>>>>>>> > > > >>>>>> > > > >>> > > > >> > > > > > > https://docs.databricks.com/spark/latest/spark-sql/language-manual/create-table.html > > > >>>>>>>> > > > >>>>>>>> Best, > > > >>>>>>>> Danny Chan > > > >>>>>>>> 在 2020年3月17日 +0800 PM10:10,Timo Walther <twal...@apache.org > > > >>> ,写道: > > > >>>>>>>>> Hi Danny, > > > >>>>>>>>> > > > >>>>>>>>> thanks for updating the FLIP. I think your current design is > > > >>>>>> sufficient > > > >>>>>>>>> to separate hints from result-related properties. > > > >>>>>>>>> > > > >>>>>>>>> One remark to the naming itself: I would vote for calling the > > > >>> hints > > > >>>>>>>>> around table scan `OPTIONS('k'='v')`. We used the term > > > >>> "properties" > > > >>>>>> in > > > >>>>>>>>> the past but since we want to unify the Flink configuration > > > >>>>>> experience, > > > >>>>>>>>> we should use consistent naming and classes around > > > >>> `ConfigOptions`. > > > >>>>>>>>> > > > >>>>>>>>> It would be nice to use `Set<ConfigOption> > > > >>> supportedHintOptions();` > > > >>>>>> to > > > >>>>>>>>> start using config options instead of pure string properties. > > > >>> This > > > >>>>>> will > > > >>>>>>>>> also allow us to generate documentation in the future around > > > >>>>>> supported > > > >>>>>>>>> data types, ranges, etc. for options. At some point we would > > > >>> also > > > >>>>>> like > > > >>>>>>>>> to drop `DescriptorProperties` class. "Options" is also used > > > >>> in the > > > >>>>>>>>> documentation [1] and in the SQL/MED standard [2]. > > > >>>>>>>>> > > > >>>>>>>>> Furthermore, I would still vote for separating CatalogTable > > > >>> and hint > > > >>>>>>>>> options. Otherwise the planner would need to create a new > > > >>>>>> CatalogTable > > > >>>>>>>>> instance which might not always be easy. We should offer them > > > >>> via: > > > >>>>>>>>> > > > >>>>>>>>> > > > >>> > org.apache.flink.table.factories.TableSourceFactory.Context#getHints: > > > >>>>>>>>> ReadableConfig > > > >>>>>>>>> > > > >>>>>>>>> What do you think? > > > >>>>>>>>> > > > >>>>>>>>> Regards, > > > >>>>>>>>> Timo > > > >>>>>>>>> > > > >>>>>>>>> [1] > > > >>>>>>>>> > > > >>>>>>>> > > > >>>>>> > > > >>> > > > >> > > > > > > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/create.html#create-table > > > >>>>>>>>> [2] https://wiki.postgresql.org/wiki/SQL/MED > > > >>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>>> On 12.03.20 15:06, Stephan Ewen wrote: > > > >>>>>>>>>> @Danny sounds good. > > > >>>>>>>>>> > > > >>>>>>>>>> Maybe it is worth listing all the classes of problems that > > > >>> you > > > >>>>>> want to > > > >>>>>>>>>> address and then look at each class and see if hints are a > > > >>> good > > > >>>>>> default > > > >>>>>>>>>> solution or a good optional way of simplifying things? > > > >>>>>>>>>> The discussion has grown a lot and it is starting to be > > > >> hard > > > >>> to > > > >>>>>>>> distinguish > > > >>>>>>>>>> the parts where everyone agrees from the parts were there > > > >> are > > > >>>>>> concerns. > > > >>>>>>>>>> > > > >>>>>>>>>> On Thu, Mar 12, 2020 at 2:31 PM Danny Chan < > > > >>> danny0...@apache.org> > > > >>>>>>>> wrote: > > > >>>>>>>>>> > > > >>>>>>>>>>> Thanks Stephan ~ > > > >>>>>>>>>>> > > > >>>>>>>>>>> We can remove the support for properties that may change > > > >>> the > > > >>>>>>>> semantics of > > > >>>>>>>>>>> query if you think that is a trouble. > > > >>>>>>>>>>> > > > >>>>>>>>>>> How about we support the /*+ properties() */ hint only > > > >> for > > > >>> those > > > >>>>>>>> optimize > > > >>>>>>>>>>> parameters, such as the fetch size of source or something > > > >>> like > > > >>>>>> that, > > > >>>>>>>> does > > > >>>>>>>>>>> that make sense? > > > >>>>>>>>>>> > > > >>>>>>>>>>> Stephan Ewen <se...@apache.org>于2020年3月12日 周四下午7:45写道: > > > >>>>>>>>>>> > > > >>>>>>>>>>>> I think Bowen has actually put it very well. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> (1) Hints that change semantics looks like trouble > > > >>> waiting to > > > >>>>>>>> happen. For > > > >>>>>>>>>>>> example Kafka offset handling should be in filters. The > > > >>> Kafka > > > >>>>>>>> source > > > >>>>>>>>>>> should > > > >>>>>>>>>>>> support predicate pushdown. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> (2) Hints should not be a workaround for current > > > >>> shortcomings. > > > >>>>>> A > > > >>>>>>>> lot of > > > >>>>>>>>>>> the > > > >>>>>>>>>>>> suggested above sounds exactly like that. Working > > > >> around > > > >>>>>>>> catalog/DDL > > > >>>>>>>>>>>> shortcomings, missing exposure of metadata (offsets), > > > >>> missing > > > >>>>>>>> predicate > > > >>>>>>>>>>>> pushdown in Kafka. Abusing a feature like hints now as > > > >> a > > > >>> quick > > > >>>>>> fix > > > >>>>>>>> for > > > >>>>>>>>>>>> these issues, rather than fixing the root causes, will > > > >>> much > > > >>>>>> likely > > > >>>>>>>> bite > > > >>>>>>>>>>> us > > > >>>>>>>>>>>> back badly in the future. > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> Best, > > > >>>>>>>>>>>> Stephan > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>>> On Thu, Mar 12, 2020 at 10:43 AM Kurt Young < > > > >>> ykt...@gmail.com> > > > >>>>>>>> wrote: > > > >>>>>>>>>>>> > > > >>>>>>>>>>>>> It seems this FLIP's name is somewhat misleading. > > > >> From > > > >>> my > > > >>>>>>>>>>> understanding, > > > >>>>>>>>>>>>> this FLIP is trying to > > > >>>>>>>>>>>>> address the dynamic parameter issue, and table hints > > > >>> is the > > > >>>>>> way > > > >>>>>>>> we wan > > > >>>>>>>>>>> to > > > >>>>>>>>>>>>> choose. I think we should > > > >>>>>>>>>>>>> be focus on "what's the right way to solve dynamic > > > >>> property" > > > >>>>>>>> instead of > > > >>>>>>>>>>>>> discussing "whether table > > > >>>>>>>>>>>>> hints can affect query semantics". > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> For now, there are two proposed ways to achieve > > > >> dynamic > > > >>>>>> property: > > > >>>>>>>>>>>>> 1. FLIP-110: create temporary table xx like xx with > > > >>> (xxx) > > > >>>>>>>>>>>>> 2. use custom "from t with (xxx)" syntax > > > >>>>>>>>>>>>> 3. "Borrow" the table hints to have a special > > > >>> PROPERTIES > > > >>>>>> hint. > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> The first one didn't break anything, but the only > > > >>> problem i > > > >>>>>> see > > > >>>>>>>> is a > > > >>>>>>>>>>>> little > > > >>>>>>>>>>>>> more verbose than the table hint > > > >>>>>>>>>>>>> approach. I can imagine when someone using SQL CLI to > > > >>> have a > > > >>>>>> sql > > > >>>>>>>>>>>>> experience, it's quite often that > > > >>>>>>>>>>>>> he will modify the table property, some use cases i > > > >> can > > > >>>>>> think of: > > > >>>>>>>>>>>>> 1. the source contains some corrupted data, i want to > > > >>> turn > > > >>>>>> on the > > > >>>>>>>>>>>>> "ignore-error" flag for certain formats. > > > >>>>>>>>>>>>> 2. I have a kafka table and want to see some sample > > > >>> data > > > >>>>>> from the > > > >>>>>>>>>>>>> beginning, so i change the offset > > > >>>>>>>>>>>>> to "earliest", and then I want to observe the latest > > > >>> data > > > >>>>>> which > > > >>>>>>>> keeps > > > >>>>>>>>>>>>> coming in. I would write another query > > > >>>>>>>>>>>>> to select from the latest table. > > > >>>>>>>>>>>>> 3. I want to my jdbc sink flush data more eagerly > > > >> then > > > >>> i can > > > >>>>>>>> observe > > > >>>>>>>>>>> the > > > >>>>>>>>>>>>> data from database side. > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> Most of such use cases are quite ad-hoc. If every > > > >> time > > > >>> I > > > >>>>>> want to > > > >>>>>>>> have a > > > >>>>>>>>>>>>> different experience, i need to create > > > >>>>>>>>>>>>> a temporary table and then also modify my query, it > > > >>> doesn't > > > >>>>>> feel > > > >>>>>>>>>>> smooth. > > > >>>>>>>>>>>>> Embed such dynamic property into > > > >>>>>>>>>>>>> query would have better user experience. > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> Both 2 & 3 can make this happen. The cons of #2 is > > > >>> breaking > > > >>>>>> SQL > > > >>>>>>>>>>>> compliant, > > > >>>>>>>>>>>>> and for #3, it only breaks some > > > >>>>>>>>>>>>> unwritten rules, but we can have an explanation on > > > >>> that. And > > > >>>>>> I > > > >>>>>>>> really > > > >>>>>>>>>>>> doubt > > > >>>>>>>>>>>>> whether user would complain about > > > >>>>>>>>>>>>> this when they actually have flexible and good > > > >>> experience > > > >>>>>> using > > > >>>>>>>> this. > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> My tendency would be #3 > #1 > #2, what do you think? > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>> Kurt > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>> On Thu, Mar 12, 2020 at 1:11 PM Danny Chan < > > > >>>>>> yuzhao....@gmail.com > > > >>>>>>>>> > > > >>>>>>>>>>> wrote: > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Thanks Aljoscha ~ > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> I agree for most of the query hints, they are > > > >>> optional as > > > >>>>>> an > > > >>>>>>>>>>> optimizer > > > >>>>>>>>>>>>>> instruction, especially for the traditional RDBMS. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> But, just like BenChao said, Flink as a computation > > > >>> engine > > > >>>>>> has > > > >>>>>>>> many > > > >>>>>>>>>>>>>> different kind of data sources, thus, dynamic > > > >>> parameters > > > >>>>>> like > > > >>>>>>>>>>>>> start_offest > > > >>>>>>>>>>>>>> can only bind to each table scope, we can not set a > > > >>> session > > > >>>>>>>> config > > > >>>>>>>>>>> like > > > >>>>>>>>>>>>>> KSQL because they are all about Kafka: > > > >>>>>>>>>>>>>>> SET ‘auto.offset.reset’=‘earliest’; > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Thus the most flexible way to set up these dynamic > > > >>> params > > > >>>>>> is > > > >>>>>>>> to bind > > > >>>>>>>>>>> to > > > >>>>>>>>>>>>>> the table scope in the query when we want to > > > >> override > > > >>>>>>>> something, so > > > >>>>>>>>>>> we > > > >>>>>>>>>>>>> have > > > >>>>>>>>>>>>>> these solutions above (with pros and cons from my > > > >>> side): > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> • 1. Select * from t(offset=123) (from Timo) > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Pros: > > > >>>>>>>>>>>>>> - Easy to add > > > >>>>>>>>>>>>>> - Parameters are part of the main query > > > >>>>>>>>>>>>>> Cons: > > > >>>>>>>>>>>>>> - Not SQL compliant > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> • 2. Select * from t /*+ PROPERTIES(offset=123) */ > > > >>> (from > > > >>>>>> me) > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Pros: > > > >>>>>>>>>>>>>> - Easy to add > > > >>>>>>>>>>>>>> - SQL compliant because it is nested in the > > > >> comments > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Cons: > > > >>>>>>>>>>>>>> - Parameters are not part of the main query > > > >>>>>>>>>>>>>> - Cryptic syntax for new users > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> The biggest problem for hints way may be the “if > > > >>> hints > > > >>>>>> must be > > > >>>>>>>>>>>> optional”, > > > >>>>>>>>>>>>>> actually we have though about 1 for a while but > > > >>> aborted > > > >>>>>>>> because it > > > >>>>>>>>>>>> breaks > > > >>>>>>>>>>>>>> the SQL standard too much. And we replace it with > > > >> 2, > > > >>>>>> because > > > >>>>>>>> the > > > >>>>>>>>>>> hints > > > >>>>>>>>>>>>>> syntax do not break SQL standard(nested in > > > >> comments). > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> What if we have the special /*+ PROPERTIES */ hint > > > >>> that > > > >>>>>> allows > > > >>>>>>>>>>> override > > > >>>>>>>>>>>>>> some properties of table dynamically, it does not > > > >>> break > > > >>>>>>>> anything, at > > > >>>>>>>>>>>>> lease > > > >>>>>>>>>>>>>> for current Flink use cases. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Planner hints are optional just because they are > > > >>> naturally > > > >>>>>>>> enforcers > > > >>>>>>>>>>> of > > > >>>>>>>>>>>>>> the planner, most of them aim to instruct the > > > >>> optimizer, > > > >>>>>> but, > > > >>>>>>>> the > > > >>>>>>>>>>> table > > > >>>>>>>>>>>>>> hints is a little different, table hints can > > > >> specify > > > >>> the > > > >>>>>> table > > > >>>>>>>> meta > > > >>>>>>>>>>>> like > > > >>>>>>>>>>>>>> index column, and it is very convenient to specify > > > >>> table > > > >>>>>>>> properties. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Or shall we not call /*+ PROPERTIES(offset=123) */ > > > >>> table > > > >>>>>> hint, > > > >>>>>>>> we > > > >>>>>>>>>>> can > > > >>>>>>>>>>>>>> call it table dynamic parameters. > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM9:20,Aljoscha Krettek < > > > >>>>>>>> aljos...@apache.org>,写道: > > > >>>>>>>>>>>>>>> Hi, > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> I don't understand this discussion. Hints, as I > > > >>>>>> understand > > > >>>>>>>> them, > > > >>>>>>>>>>>> should > > > >>>>>>>>>>>>>>> work like this: > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> - hints are *optional* advice for the optimizer > > > >> to > > > >>> try > > > >>>>>> and > > > >>>>>>>> help it > > > >>>>>>>>>>> to > > > >>>>>>>>>>>>>>> find a good execution strategy > > > >>>>>>>>>>>>>>> - hints should not change query semantics, i.e. > > > >>> they > > > >>>>>> should > > > >>>>>>>> not > > > >>>>>>>>>>>> change > > > >>>>>>>>>>>>>>> connector properties executing a query with > > > >> taking > > > >>> into > > > >>>>>>>> account the > > > >>>>>>>>>>>>>>> hints *must* produce the same result as executing > > > >>> the > > > >>>>>> query > > > >>>>>>>> without > > > >>>>>>>>>>>>>>> taking into account the hints > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> From these simple requirements you can derive a > > > >>> solution > > > >>>>>>>> that makes > > > >>>>>>>>>>>>>>> sense. I don't have a strong preference for the > > > >>> syntax > > > >>>>>> but we > > > >>>>>>>>>>> should > > > >>>>>>>>>>>>>>> strive to be in line with prior work. > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>> Aljoscha > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> On 11.03.20 11:53, Danny Chan wrote: > > > >>>>>>>>>>>>>>>> Thanks Timo for summarize the 3 options ~ > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> I agree with Kurt that option2 is too > > > >>> complicated to > > > >>>>>> use > > > >>>>>>>> because: > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> • As a Kafka topic consumer, the user must > > > >>> define both > > > >>>>>> the > > > >>>>>>>>>>> virtual > > > >>>>>>>>>>>>>> column for start offset and he must apply a special > > > >>> filter > > > >>>>>>>> predicate > > > >>>>>>>>>>>>> after > > > >>>>>>>>>>>>>> each query > > > >>>>>>>>>>>>>>>> • And for the internal implementation, the > > > >>> metadata > > > >>>>>> column > > > >>>>>>>> push > > > >>>>>>>>>>>> down > > > >>>>>>>>>>>>>> is another hard topic, each kind of message queue > > > >>> may have > > > >>>>>> its > > > >>>>>>>> offset > > > >>>>>>>>>>>>>> attribute, we need to consider the expression type > > > >>> for > > > >>>>>>>> different > > > >>>>>>>>>>> kind; > > > >>>>>>>>>>>>> the > > > >>>>>>>>>>>>>> source also need to recognize the constant column > > > >> as > > > >>> a > > > >>>>>> config > > > >>>>>>>>>>>>> option(which > > > >>>>>>>>>>>>>> is weird because usually what we pushed down is a > > > >>> table > > > >>>>>> column) > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> For option 1 and option3, I think there is no > > > >>>>>> difference, > > > >>>>>>>> option1 > > > >>>>>>>>>>>> is > > > >>>>>>>>>>>>>> also a hint syntax which is introduced in Sybase > > > >> and > > > >>>>>>>> referenced then > > > >>>>>>>>>>>>>> deprecated by MS-SQL in 199X years because of the > > > >>>>>>>> ambitiousness. > > > >>>>>>>>>>>>> Personally > > > >>>>>>>>>>>>>> I prefer /*+ */ style table hint than WITH keyword > > > >>> for > > > >>>>>> these > > > >>>>>>>> reasons: > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> • We do not break the standard SQL, the hints > > > >> are > > > >>>>>> nested > > > >>>>>>>> in SQL > > > >>>>>>>>>>>>>> comments > > > >>>>>>>>>>>>>>>> • We do not need to introduce additional WITH > > > >>> keyword > > > >>>>>>>> which may > > > >>>>>>>>>>>>> appear > > > >>>>>>>>>>>>>> in a query if we use that because a table can be > > > >>>>>> referenced in > > > >>>>>>>> all > > > >>>>>>>>>>>> kinds > > > >>>>>>>>>>>>> of > > > >>>>>>>>>>>>>> SQL contexts: INSERT/DELETE/FROM/JOIN …. That would > > > >>> make > > > >>>>>> our > > > >>>>>>>> sql > > > >>>>>>>>>>> query > > > >>>>>>>>>>>>>> break too much of the SQL from standard > > > >>>>>>>>>>>>>>>> • We would have uniform syntax for hints as > > > >> query > > > >>>>>> hint, one > > > >>>>>>>>>>> syntax > > > >>>>>>>>>>>>>> fits all and more easy to use > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> And here is the reason why we choose a uniform > > > >>> Oracle > > > >>>>>>>> style query > > > >>>>>>>>>>>>>> hint syntax which is addressed by Julian Hyde when > > > >> we > > > >>>>>> design > > > >>>>>>>> the > > > >>>>>>>>>>> syntax > > > >>>>>>>>>>>>>> from the Calcite community: > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> I don’t much like the MSSQL-style syntax for > > > >>> table > > > >>>>>> hints. > > > >>>>>>>> It > > > >>>>>>>>>>> adds a > > > >>>>>>>>>>>>>> new use of the WITH keyword that is unrelated to > > > >> the > > > >>> use of > > > >>>>>>>> WITH for > > > >>>>>>>>>>>>>> common-table expressions. > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> A historical note. Microsoft SQL Server > > > >>> inherited its > > > >>>>>> hint > > > >>>>>>>> syntax > > > >>>>>>>>>>>>> from > > > >>>>>>>>>>>>>> Sybase a very long time ago. (See “Transact SQL > > > >>>>>>>> Programming”[1], page > > > >>>>>>>>>>>>> 632, > > > >>>>>>>>>>>>>> “Optimizer hints”. The book was written in 1999, > > > >> and > > > >>> covers > > > >>>>>>>> Microsoft > > > >>>>>>>>>>>> SQL > > > >>>>>>>>>>>>>> Server 6.5 / 7.0 and Sybase Adaptive Server 11.5, > > > >>> but the > > > >>>>>>>> syntax very > > > >>>>>>>>>>>>>> likely predates Sybase 4.3, from which Microsoft > > > >> SQL > > > >>>>>> Server was > > > >>>>>>>>>>> forked > > > >>>>>>>>>>>> in > > > >>>>>>>>>>>>>> 1993.) > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> Microsoft later added the WITH keyword to make > > > >>> it less > > > >>>>>>>> ambiguous, > > > >>>>>>>>>>>> and > > > >>>>>>>>>>>>>> has now deprecated the syntax that does not use > > > >> WITH. > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> They are forced to keep the syntax for > > > >> backwards > > > >>>>>>>> compatibility > > > >>>>>>>>>>> but > > > >>>>>>>>>>>>>> that doesn’t mean that we should shoulder their > > > >>> burden. > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> I think formatted comments are the right > > > >>> container for > > > >>>>>>>> hints > > > >>>>>>>>>>>> because > > > >>>>>>>>>>>>>> it allows us to change the hint syntax without > > > >>> changing > > > >>>>>> the SQL > > > >>>>>>>>>>> parser, > > > >>>>>>>>>>>>> and > > > >>>>>>>>>>>>>> makes clear that we are at liberty to ignore hints > > > >>>>>> entirely. > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> Julian > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> [1] https://www.amazon.com/s?k=9781565924017 < > > > >>>>>>>>>>>>>> https://www.amazon.com/s?k=9781565924017> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM4:03,Timo Walther < > > > >>>>>> twal...@apache.org > > > >>>>>>>>> ,写道: > > > >>>>>>>>>>>>>>>>> Hi Danny, > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> it is true that our DDL is not standard > > > >>> compliant by > > > >>>>>>>> using the > > > >>>>>>>>>>>> WITH > > > >>>>>>>>>>>>>>>>> clause. Nevertheless, we aim for not > > > >> diverging > > > >>> too > > > >>>>>> much > > > >>>>>>>> and the > > > >>>>>>>>>>>>> LIKE > > > >>>>>>>>>>>>>>>>> clause is an example of that. It will solve > > > >>> things > > > >>>>>> like > > > >>>>>>>>>>>> overwriting > > > >>>>>>>>>>>>>>>>> WATERMARKs, add additional/modifying > > > >>> properties and > > > >>>>>>>> inherit > > > >>>>>>>>>>>> schema. > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Bowen is right that Flink's DDL is mixing 3 > > > >>> types > > > >>>>>>>> definition > > > >>>>>>>>>>>>>> together. > > > >>>>>>>>>>>>>>>>> We are not the first ones that try to solve > > > >>> this. > > > >>>>>> There > > > >>>>>>>> is also > > > >>>>>>>>>>>> the > > > >>>>>>>>>>>>>> SQL > > > >>>>>>>>>>>>>>>>> MED standard [1] that tried to tackle this > > > >>> problem. I > > > >>>>>>>> think it > > > >>>>>>>>>>>> was > > > >>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>> considered when designing the current DDL. > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Currently, I see 3 options for handling Kafka > > > >>>>>> offsets. I > > > >>>>>>>> will > > > >>>>>>>>>>>> give > > > >>>>>>>>>>>>>> some > > > >>>>>>>>>>>>>>>>> examples and look forward to feedback here: > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> *Option 1* Runtime and semantic parms as part > > > >>> of the > > > >>>>>>>> query > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> `SELECT * FROM MyTable('offset'=123)` > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Pros: > > > >>>>>>>>>>>>>>>>> - Easy to add > > > >>>>>>>>>>>>>>>>> - Parameters are part of the main query > > > >>>>>>>>>>>>>>>>> - No complicated hinting syntax > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Cons: > > > >>>>>>>>>>>>>>>>> - Not SQL compliant > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> *Option 2* Use metadata in query > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> `CREATE TABLE MyTable (id INT, offset AS > > > >>>>>>>>>>>>> SYSTEM_METADATA('offset'))` > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> `SELECT * FROM MyTable WHERE offset > > > > >> TIMESTAMP > > > >>>>>>>> '2012-12-12 > > > >>>>>>>>>>>>>> 12:34:22'` > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Pros: > > > >>>>>>>>>>>>>>>>> - SQL compliant in the query > > > >>>>>>>>>>>>>>>>> - Access of metadata in the DDL which is > > > >>> required > > > >>>>>> anyway > > > >>>>>>>>>>>>>>>>> - Regular pushdown rules apply > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Cons: > > > >>>>>>>>>>>>>>>>> - Users need to add an additional comlumn in > > > >>> the DDL > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> *Option 3*: Use hints for properties > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> ` > > > >>>>>>>>>>>>>>>>> SELECT * > > > >>>>>>>>>>>>>>>>> FROM MyTable /*+ PROPERTIES('offset'=123) */ > > > >>>>>>>>>>>>>>>>> ` > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Pros: > > > >>>>>>>>>>>>>>>>> - Easy to add > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Cons: > > > >>>>>>>>>>>>>>>>> - Parameters are not part of the main query > > > >>>>>>>>>>>>>>>>> - Cryptic syntax for new users > > > >>>>>>>>>>>>>>>>> - Not standard compliant. > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> If we go with this option, I would suggest to > > > >>> make it > > > >>>>>>>> available > > > >>>>>>>>>>>> in > > > >>>>>>>>>>>>> a > > > >>>>>>>>>>>>>>>>> separate map and don't mix it with statically > > > >>> defined > > > >>>>>>>>>>> properties. > > > >>>>>>>>>>>>>> Such > > > >>>>>>>>>>>>>>>>> that the factory can decide which properties > > > >>> have the > > > >>>>>>>> right to > > > >>>>>>>>>>> be > > > >>>>>>>>>>>>>>>>> overwritten by the hints: > > > >>>>>>>>>>>>>>>>> TableSourceFactory.Context.getQueryHints(): > > > >>>>>>>> ReadableConfig > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Regards, > > > >>>>>>>>>>>>>>>>> Timo > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> [1] https://en.wikipedia.org/wiki/SQL/MED > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> Currently I see 3 options as a > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> On 11.03.20 07:21, Danny Chan wrote: > > > >>>>>>>>>>>>>>>>>> Thanks Bowen ~ > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> I agree we should somehow categorize our > > > >>> connector > > > >>>>>>>>>>> parameters. > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> For type1, I’m already preparing a solution > > > >>> like > > > >>>>>> the > > > >>>>>>>>>>> Confluent > > > >>>>>>>>>>>>>> schema registry + Avro schema inference thing, so > > > >>> this may > > > >>>>>> not > > > >>>>>>>> be a > > > >>>>>>>>>>>>> problem > > > >>>>>>>>>>>>>> in the near future. > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> For type3, I have some questions: > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> "SELECT * FROM mykafka WHERE offset > > > > >> 12pm > > > >>>>>> yesterday” > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> Where does the offset column come from, a > > > >>> virtual > > > >>>>>>>> column from > > > >>>>>>>>>>>> the > > > >>>>>>>>>>>>>> table schema, you said that > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> They change > > > >>>>>>>>>>>>>>>>>> almost every time a query starts and have > > > >>> nothing > > > >>>>>> to > > > >>>>>>>> do with > > > >>>>>>>>>>>>>> metadata, thus > > > >>>>>>>>>>>>>>>>>> should not be part of table definition/DDL > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> But why you can reference it in the query, > > > >>> I’m > > > >>>>>>>> confused for > > > >>>>>>>>>>>> that, > > > >>>>>>>>>>>>>> can you elaborate a little ? > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM12:52,Bowen Li < > > > >>>>>>>> bowenl...@gmail.com > > > >>>>>>>>>>>> ,写道: > > > >>>>>>>>>>>>>>>>>>> Thanks Danny for kicking off the effort > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> The root cause of too much manual work is > > > >>> Flink > > > >>>>>> DDL > > > >>>>>>>> has > > > >>>>>>>>>>>> mixed 3 > > > >>>>>>>>>>>>>> types of > > > >>>>>>>>>>>>>>>>>>> params together and doesn't handle each > > > >> of > > > >>> them > > > >>>>>> very > > > >>>>>>>> well. > > > >>>>>>>>>>>>> Below > > > >>>>>>>>>>>>>> are how I > > > >>>>>>>>>>>>>>>>>>> categorize them and corresponding > > > >>> solutions in my > > > >>>>>>>> mind: > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> - type 1: Metadata of external data, like > > > >>>>>> external > > > >>>>>>>>>>>>> endpoint/url, > > > >>>>>>>>>>>>>>>>>>> username/pwd, schemas, formats. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Such metadata are mostly already > > > >>> accessible in > > > >>>>>>>> external > > > >>>>>>>>>>>> system > > > >>>>>>>>>>>>>> as long as > > > >>>>>>>>>>>>>>>>>>> endpoints and credentials are provided. > > > >>> Flink can > > > >>>>>>>> get it > > > >>>>>>>>>>> thru > > > >>>>>>>>>>>>>> catalogs, but > > > >>>>>>>>>>>>>>>>>>> we haven't had many catalogs yet and thus > > > >>> Flink > > > >>>>>> just > > > >>>>>>>> hasn't > > > >>>>>>>>>>>>> been > > > >>>>>>>>>>>>>> able to > > > >>>>>>>>>>>>>>>>>>> leverage that. So the solution should be > > > >>> building > > > >>>>>>>> more > > > >>>>>>>>>>>>> catalogs. > > > >>>>>>>>>>>>>> Such > > > >>>>>>>>>>>>>>>>>>> params should be part of a Flink table > > > >>>>>>>> DDL/definition, and > > > >>>>>>>>>>>> not > > > >>>>>>>>>>>>>> overridable > > > >>>>>>>>>>>>>>>>>>> in any means. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> - type 2: Runtime params, like jdbc > > > >>> connector's > > > >>>>>>>> fetch size, > > > >>>>>>>>>>>>>> elasticsearch > > > >>>>>>>>>>>>>>>>>>> connector's bulk flush size. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Such params don't affect query results, > > > >> but > > > >>>>>> affect > > > >>>>>>>> how > > > >>>>>>>>>>>> results > > > >>>>>>>>>>>>>> are produced > > > >>>>>>>>>>>>>>>>>>> (eg. fast or slow, aka performance) - > > > >> they > > > >>> are > > > >>>>>>>> essentially > > > >>>>>>>>>>>>>> execution and > > > >>>>>>>>>>>>>>>>>>> implementation details. They change often > > > >>> in > > > >>>>>>>> exploration or > > > >>>>>>>>>>>>>> development > > > >>>>>>>>>>>>>>>>>>> stages, but not quite frequently in > > > >>> well-defined > > > >>>>>>>>>>> long-running > > > >>>>>>>>>>>>>> pipelines. > > > >>>>>>>>>>>>>>>>>>> They should always have default values > > > >> and > > > >>> can be > > > >>>>>>>> missing > > > >>>>>>>>>>> in > > > >>>>>>>>>>>>>> query. They > > > >>>>>>>>>>>>>>>>>>> can be part of a table DDL/definition, > > > >> but > > > >>> should > > > >>>>>>>> also be > > > >>>>>>>>>>>>>> replaceable in a > > > >>>>>>>>>>>>>>>>>>> query - *this is what table "hints" in > > > >>> FLIP-113 > > > >>>>>>>> should > > > >>>>>>>>>>>> cover*. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> - type 3: Semantic params, like kafka > > > >>> connector's > > > >>>>>>>> start > > > >>>>>>>>>>>> offset. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Such params affect query results - the > > > >>> semantics. > > > >>>>>>>> They'd > > > >>>>>>>>>>>> better > > > >>>>>>>>>>>>>> be as > > > >>>>>>>>>>>>>>>>>>> filter conditions in WHERE clause that > > > >> can > > > >>> be > > > >>>>>> pushed > > > >>>>>>>> down. > > > >>>>>>>>>>>> They > > > >>>>>>>>>>>>>> change > > > >>>>>>>>>>>>>>>>>>> almost every time a query starts and have > > > >>>>>> nothing to > > > >>>>>>>> do > > > >>>>>>>>>>> with > > > >>>>>>>>>>>>>> metadata, thus > > > >>>>>>>>>>>>>>>>>>> should not be part of table > > > >>> definition/DDL, nor > > > >>>>>> be > > > >>>>>>>>>>> persisted > > > >>>>>>>>>>>> in > > > >>>>>>>>>>>>>> catalogs. > > > >>>>>>>>>>>>>>>>>>> If they will, users should create views > > > >> to > > > >>> keep > > > >>>>>> such > > > >>>>>>>> params > > > >>>>>>>>>>>>>> around (note > > > >>>>>>>>>>>>>>>>>>> this is different from variable > > > >>> substitution). > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Take Flink-Kafka as an example. Once we > > > >>> get these > > > >>>>>>>> params > > > >>>>>>>>>>>> right, > > > >>>>>>>>>>>>>> here're the > > > >>>>>>>>>>>>>>>>>>> steps users need to do to develop and run > > > >>> a Flink > > > >>>>>>>> job: > > > >>>>>>>>>>>>>>>>>>> - configure a Flink > > > >>> ConfluentSchemaRegistry with > > > >>>>>> url, > > > >>>>>>>>>>>> username, > > > >>>>>>>>>>>>>> and password > > > >>>>>>>>>>>>>>>>>>> - run "SELECT * FROM mykafka WHERE offset > > > >>>> 12pm > > > >>>>>>>> yesterday" > > > >>>>>>>>>>>>>> (simplified > > > >>>>>>>>>>>>>>>>>>> timestamp) in SQL CLI, Flink > > > >> automatically > > > >>>>>> retrieves > > > >>>>>>>> all > > > >>>>>>>>>>>>>> metadata of > > > >>>>>>>>>>>>>>>>>>> schema, file format, etc and start the > > > >> job > > > >>>>>>>>>>>>>>>>>>> - users want to make the job read Kafka > > > >>> topic > > > >>>>>>>> faster, so it > > > >>>>>>>>>>>>> goes > > > >>>>>>>>>>>>>> as "SELECT > > > >>>>>>>>>>>>>>>>>>> * FROM mykafka /* faster_read_key=value*/ > > > >>> WHERE > > > >>>>>>>> offset > > > > >>>>>>>>>>> 12pm > > > >>>>>>>>>>>>>> yesterday" > > > >>>>>>>>>>>>>>>>>>> - done and satisfied, users submit it to > > > >>>>>> production > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Regarding "CREATE TABLE t LIKE with > > > >> (k1=v1, > > > >>>>>> k2=v2), > > > >>>>>>>> I think > > > >>>>>>>>>>>>> it's > > > >>>>>>>>>>>>>> a > > > >>>>>>>>>>>>>>>>>>> nice-to-have feature, but not a > > > >>> strategically > > > >>>>>>>> critical, > > > >>>>>>>>>>>>>> long-term solution, > > > >>>>>>>>>>>>>>>>>>> because > > > >>>>>>>>>>>>>>>>>>> 1) It may seem promising at the current > > > >>> stage to > > > >>>>>>>> solve the > > > >>>>>>>>>>>>>>>>>>> too-much-manual-work problem, but that's > > > >>> only > > > >>>>>>>> because Flink > > > >>>>>>>>>>>>>> hasn't > > > >>>>>>>>>>>>>>>>>>> leveraged catalogs well and handled the 3 > > > >>> types > > > >>>>>> of > > > >>>>>>>> params > > > >>>>>>>>>>>> above > > > >>>>>>>>>>>>>> properly. > > > >>>>>>>>>>>>>>>>>>> Once we get the params types right, the > > > >>> LIKE > > > >>>>>> syntax > > > >>>>>>>> won't > > > >>>>>>>>>>> be > > > >>>>>>>>>>>>> that > > > >>>>>>>>>>>>>>>>>>> important, and will be just an easier way > > > >>> to > > > >>>>>> create > > > >>>>>>>> tables > > > >>>>>>>>>>>>>> without retyping > > > >>>>>>>>>>>>>>>>>>> long fields like username and pwd. > > > >>>>>>>>>>>>>>>>>>> 2) Note that only some rare type of > > > >>> catalog can > > > >>>>>>>> store k-v > > > >>>>>>>>>>>>>> property pair, so > > > >>>>>>>>>>>>>>>>>>> table created this way often cannot be > > > >>>>>> persisted. In > > > >>>>>>>> the > > > >>>>>>>>>>>>>> foreseeable > > > >>>>>>>>>>>>>>>>>>> future, such catalog will only be > > > >>> HiveCatalog, > > > >>>>>> and > > > >>>>>>>> not > > > >>>>>>>>>>>> everyone > > > >>>>>>>>>>>>>> has a Hive > > > >>>>>>>>>>>>>>>>>>> metastore. To be honest, without > > > >>> persistence, > > > >>>>>>>> recreating > > > >>>>>>>>>>>> tables > > > >>>>>>>>>>>>>> every time > > > >>>>>>>>>>>>>>>>>>> this way is still a lot of keyboard > > > >> typing. > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> Cheers, > > > >>>>>>>>>>>>>>>>>>> Bowen > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>> On Tue, Mar 10, 2020 at 8:07 PM Kurt > > > >> Young > > > >>> < > > > >>>>>>>>>>> ykt...@gmail.com > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> If a specific connector want to have > > > >> such > > > >>>>>>>> parameter and > > > >>>>>>>>>>>> read > > > >>>>>>>>>>>>>> if out of > > > >>>>>>>>>>>>>>>>>>>> configuration, then that's fine. > > > >>>>>>>>>>>>>>>>>>>> If we are talking about a configuration > > > >>> for all > > > >>>>>>>> kinds of > > > >>>>>>>>>>>>>> sources, I would > > > >>>>>>>>>>>>>>>>>>>> be super careful about that. > > > >>>>>>>>>>>>>>>>>>>> It's true it can solve maybe 80% cases, > > > >>> but it > > > >>>>>>>> will also > > > >>>>>>>>>>>> make > > > >>>>>>>>>>>>>> the left 20% > > > >>>>>>>>>>>>>>>>>>>> feels weird. > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>> Kurt > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> On Wed, Mar 11, 2020 at 11:00 AM Jark > > > >> Wu > > > >>> < > > > >>>>>>>>>>> imj...@gmail.com > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> Hi Kurt, > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> #3 Regarding to global offset: > > > >>>>>>>>>>>>>>>>>>>>> I'm not saying to use the global > > > >>>>>> configuration to > > > >>>>>>>>>>>> override > > > >>>>>>>>>>>>>> connector > > > >>>>>>>>>>>>>>>>>>>>> properties by the planner. > > > >>>>>>>>>>>>>>>>>>>>> But the connector should take this > > > >>>>>> configuration > > > >>>>>>>> and > > > >>>>>>>>>>>>>> translate into their > > > >>>>>>>>>>>>>>>>>>>>> client API. > > > >>>>>>>>>>>>>>>>>>>>> AFAIK, almost all the message queues > > > >>> support > > > >>>>>>>> eariliest > > > >>>>>>>>>>>> and > > > >>>>>>>>>>>>>> latest and a > > > >>>>>>>>>>>>>>>>>>>>> timestamp value as start point. > > > >>>>>>>>>>>>>>>>>>>>> So we can support 3 options for this > > > >>>>>>>> configuration: > > > >>>>>>>>>>>>>> "eariliest", "latest" > > > >>>>>>>>>>>>>>>>>>>>> and a timestamp string value. > > > >>>>>>>>>>>>>>>>>>>>> Of course, this can't solve 100% > > > >>> cases, but I > > > >>>>>>>> guess can > > > >>>>>>>>>>>>>> sovle 80% or 90% > > > >>>>>>>>>>>>>>>>>>>>> cases. > > > >>>>>>>>>>>>>>>>>>>>> And the remaining cases can be > > > >>> resolved by > > > >>>>>> LIKE > > > >>>>>>>> syntax > > > >>>>>>>>>>>>> which > > > >>>>>>>>>>>>>> I guess is > > > >>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>> very common cases. > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>> Jark > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> On Wed, 11 Mar 2020 at 10:33, Kurt > > > >>> Young < > > > >>>>>>>>>>>> ykt...@gmail.com > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> Good to have such lovely > > > >>> discussions. I > > > >>>>>> also > > > >>>>>>>> want to > > > >>>>>>>>>>>>> share > > > >>>>>>>>>>>>>> some of my > > > >>>>>>>>>>>>>>>>>>>>>> opinions. > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> #1 Regarding to error handling: I > > > >>> also > > > >>>>>> think > > > >>>>>>>> ignore > > > >>>>>>>>>>>>>> invalid hints would > > > >>>>>>>>>>>>>>>>>>>>> be > > > >>>>>>>>>>>>>>>>>>>>>> dangerous, maybe > > > >>>>>>>>>>>>>>>>>>>>>> the simplest solution is just throw > > > >>> an > > > >>>>>>>> exception. > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> #2 Regarding to property > > > >>> replacement: I > > > >>>>>> don't > > > >>>>>>>> think > > > >>>>>>>>>>> we > > > >>>>>>>>>>>>>> should > > > >>>>>>>>>>>>>>>>>>>> constraint > > > >>>>>>>>>>>>>>>>>>>>>> ourself to > > > >>>>>>>>>>>>>>>>>>>>>> the meaning of the word "hint", and > > > >>>>>> forbidden > > > >>>>>>>> it > > > >>>>>>>>>>>>> modifying > > > >>>>>>>>>>>>>> any > > > >>>>>>>>>>>>>>>>>>>> properties > > > >>>>>>>>>>>>>>>>>>>>>> which can effect > > > >>>>>>>>>>>>>>>>>>>>>> query results. IMO `PROPERTIES` is > > > >>> one of > > > >>>>>> the > > > >>>>>>>> table > > > >>>>>>>>>>>>> hints, > > > >>>>>>>>>>>>>> and a > > > >>>>>>>>>>>>>>>>>>>> powerful > > > >>>>>>>>>>>>>>>>>>>>>> one. It can > > > >>>>>>>>>>>>>>>>>>>>>> modify properties located in DDL's > > > >>> WITH > > > >>>>>> block. > > > >>>>>>>> But I > > > >>>>>>>>>>>> also > > > >>>>>>>>>>>>>> see the harm > > > >>>>>>>>>>>>>>>>>>>>> that > > > >>>>>>>>>>>>>>>>>>>>>> if we make it > > > >>>>>>>>>>>>>>>>>>>>>> too flexible like change the kafka > > > >>> topic > > > >>>>>> name > > > >>>>>>>> with a > > > >>>>>>>>>>>>> hint. > > > >>>>>>>>>>>>>> Such use > > > >>>>>>>>>>>>>>>>>>>> case > > > >>>>>>>>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>>>>>>> not common and > > > >>>>>>>>>>>>>>>>>>>>>> sounds very dangerous to me. I > > > >> would > > > >>>>>> propose > > > >>>>>>>> we have > > > >>>>>>>>>>> a > > > >>>>>>>>>>>>> map > > > >>>>>>>>>>>>>> of hintable > > > >>>>>>>>>>>>>>>>>>>>>> properties for each > > > >>>>>>>>>>>>>>>>>>>>>> connector, and should validate all > > > >>> passed > > > >>>>>> in > > > >>>>>>>>>>> properties > > > >>>>>>>>>>>>>> are actually > > > >>>>>>>>>>>>>>>>>>>>>> hintable. And combining with > > > >>>>>>>>>>>>>>>>>>>>>> #1 error handling, we can throw an > > > >>>>>> exception > > > >>>>>>>> once > > > >>>>>>>>>>>>> received > > > >>>>>>>>>>>>>> invalid > > > >>>>>>>>>>>>>>>>>>>>>> property. > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> #3 Regarding to global offset: I'm > > > >>> not sure > > > >>>>>>>> it's > > > >>>>>>>>>>>>> feasible. > > > >>>>>>>>>>>>>> Different > > > >>>>>>>>>>>>>>>>>>>>>> connectors will have totally > > > >>>>>>>>>>>>>>>>>>>>>> different properties to represent > > > >>> offset, > > > >>>>>> some > > > >>>>>>>> might > > > >>>>>>>>>>> be > > > >>>>>>>>>>>>>> timestamps, > > > >>>>>>>>>>>>>>>>>>>> some > > > >>>>>>>>>>>>>>>>>>>>>> might be string literals > > > >>>>>>>>>>>>>>>>>>>>>> like "earliest", and others might > > > >> be > > > >>> just > > > >>>>>>>> integers. > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>> Kurt > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> On Tue, Mar 10, 2020 at 11:46 PM > > > >>> Jark Wu < > > > >>>>>>>>>>>>> imj...@gmail.com> > > > >>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> Hi everyone, > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> I want to jump in the discussion > > > >>> about > > > >>>>>> the > > > >>>>>>>> "dynamic > > > >>>>>>>>>>>>>> start offset" > > > >>>>>>>>>>>>>>>>>>>>>> problem. > > > >>>>>>>>>>>>>>>>>>>>>>> First of all, I share the same > > > >>> concern > > > >>>>>> with > > > >>>>>>>> Timo > > > >>>>>>>>>>> and > > > >>>>>>>>>>>>>> Fabian, that the > > > >>>>>>>>>>>>>>>>>>>>>>> "start offset" affects the query > > > >>>>>> semantics, > > > >>>>>>>> i.e. > > > >>>>>>>>>>> the > > > >>>>>>>>>>>>>> query result. > > > >>>>>>>>>>>>>>>>>>>>>>> But "hints" is just used for > > > >>> optimization > > > >>>>>>>> which > > > >>>>>>>>>>>> should > > > >>>>>>>>>>>>>> affect the > > > >>>>>>>>>>>>>>>>>>>>> result? > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> I think the "dynamic start > > > >> offset" > > > >>> is an > > > >>>>>> very > > > >>>>>>>>>>>> important > > > >>>>>>>>>>>>>> usability > > > >>>>>>>>>>>>>>>>>>>>> problem > > > >>>>>>>>>>>>>>>>>>>>>>> which will be faced by many > > > >>> streaming > > > >>>>>>>> platforms. > > > >>>>>>>>>>>>>>>>>>>>>>> I also agree "CREATE TEMPORARY > > > >>> TABLE Temp > > > >>>>>>>> (LIKE t) > > > >>>>>>>>>>>> WITH > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>> ('connector.startup-timestamp-millis' = > > > >>>>>>>>>>>>>> '1578538374471')" is verbose, > > > >>>>>>>>>>>>>>>>>>>>>> what > > > >>>>>>>>>>>>>>>>>>>>>>> if we have 10 tables to join? > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> However, what I want to propose > > > >>> (should > > > >>>>>> be > > > >>>>>>>> another > > > >>>>>>>>>>>>>> thread) is a > > > >>>>>>>>>>>>>>>>>>>> global > > > >>>>>>>>>>>>>>>>>>>>>>> configuration to reset start > > > >>> offsets of > > > >>>>>> all > > > >>>>>>>> the > > > >>>>>>>>>>>> source > > > >>>>>>>>>>>>>> connectors > > > >>>>>>>>>>>>>>>>>>>>>>> in the query session, e.g. > > > >>>>>>>>>>>>> "table.sources.start-offset". > > > >>>>>>>>>>>>>> This is > > > >>>>>>>>>>>>>>>>>>>>> possible > > > >>>>>>>>>>>>>>>>>>>>>>> now because > > > >>> `TableSourceFactory.Context` > > > >>>>>> has > > > >>>>>>>>>>>>>> `getConfiguration` > > > >>>>>>>>>>>>>>>>>>>>>>> method to get the session > > > >>> configuration, > > > >>>>>> and > > > >>>>>>>> use it > > > >>>>>>>>>>>> to > > > >>>>>>>>>>>>>> create an > > > >>>>>>>>>>>>>>>>>>>>> adapted > > > >>>>>>>>>>>>>>>>>>>>>>> TableSource. > > > >>>>>>>>>>>>>>>>>>>>>>> Then we can also expose to SQL > > > >> CLI > > > >>> via > > > >>>>>> SET > > > >>>>>>>> command, > > > >>>>>>>>>>>>> e.g. > > > >>>>>>>>>>>>>> `SET > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>> 'table.sources.start-offset'='earliest';`, > > > >>>>>>>> which is > > > >>>>>>>>>>>>>> pretty simple and > > > >>>>>>>>>>>>>>>>>>>>>>> straightforward. > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> This is very similar to KSQL's > > > >> `SET > > > >>>>>>>>>>>>>> 'auto.offset.reset'='earliest'` > > > >>>>>>>>>>>>>>>>>>>>> which > > > >>>>>>>>>>>>>>>>>>>>>>> is very helpful IMO. > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>> Jark > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> On Tue, 10 Mar 2020 at 22:29, > > > >> Timo > > > >>>>>> Walther < > > > >>>>>>>>>>>>>> twal...@apache.org> > > > >>>>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> compared to the hints, FLIP-110 > > > >>> is > > > >>>>>> fully > > > >>>>>>>>>>> compliant > > > >>>>>>>>>>>> to > > > >>>>>>>>>>>>>> the SQL > > > >>>>>>>>>>>>>>>>>>>>> standard. > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> I don't think that `CREATE > > > >>> TEMPORARY > > > >>>>>> TABLE > > > >>>>>>>> Temp > > > >>>>>>>>>>>> (LIKE > > > >>>>>>>>>>>>>> t) WITH > > > >>>>>>>>>>>>>>>>>>>> (k=v)` > > > >>>>>>>>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>>>>>>>>> too verbose or awkward for the > > > >>> power of > > > >>>>>>>> basically > > > >>>>>>>>>>>>>> changing the > > > >>>>>>>>>>>>>>>>>>>> entire > > > >>>>>>>>>>>>>>>>>>>>>>>> connector. Usually, this > > > >>> statement > > > >>>>>> would > > > >>>>>>>> just > > > >>>>>>>>>>>> precede > > > >>>>>>>>>>>>>> the query in > > > >>>>>>>>>>>>>>>>>>>> a > > > >>>>>>>>>>>>>>>>>>>>>>>> multiline file. So it can be > > > >>> change > > > >>>>>>>> "in-place" > > > >>>>>>>>>>> like > > > >>>>>>>>>>>>>> the hints you > > > >>>>>>>>>>>>>>>>>>>>>>> proposed. > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> Many companies have a > > > >>> well-defined set > > > >>>>>> of > > > >>>>>>>> tables > > > >>>>>>>>>>>> that > > > >>>>>>>>>>>>>> should be > > > >>>>>>>>>>>>>>>>>>>> used. > > > >>>>>>>>>>>>>>>>>>>>>> It > > > >>>>>>>>>>>>>>>>>>>>>>>> would be dangerous if users can > > > >>> change > > > >>>>>> the > > > >>>>>>>> path > > > >>>>>>>>>>> or > > > >>>>>>>>>>>>>> topic in a hint. > > > >>>>>>>>>>>>>>>>>>>>> The > > > >>>>>>>>>>>>>>>>>>>>>>>> catalog/catalog manager should > > > >>> be the > > > >>>>>>>> entity that > > > >>>>>>>>>>>>>> controls which > > > >>>>>>>>>>>>>>>>>>>>> tables > > > >>>>>>>>>>>>>>>>>>>>>>>> exist and how they can be > > > >>> accessed. > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> what’s the problem there if > > > >> we > > > >>> user > > > >>>>>> the > > > >>>>>>>> table > > > >>>>>>>>>>>> hints > > > >>>>>>>>>>>>>> to support > > > >>>>>>>>>>>>>>>>>>>>>> “start > > > >>>>>>>>>>>>>>>>>>>>>>>> offset”? > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> IMHO it violates the meaning of > > > >>> a hint. > > > >>>>>>>> According > > > >>>>>>>>>>>> to > > > >>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>> dictionary, > > > >>>>>>>>>>>>>>>>>>>>> a > > > >>>>>>>>>>>>>>>>>>>>>>>> hint is "a statement that > > > >>> expresses > > > >>>>>>>> indirectly > > > >>>>>>>>>>> what > > > >>>>>>>>>>>>>> one prefers not > > > >>>>>>>>>>>>>>>>>>>>> to > > > >>>>>>>>>>>>>>>>>>>>>>>> say explicitly". But offsets > > > >> are > > > >>> a > > > >>>>>>>> property that > > > >>>>>>>>>>>> are > > > >>>>>>>>>>>>>> very explicit. > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> If we go with the hint > > > >> approach, > > > >>> it > > > >>>>>> should > > > >>>>>>>> be > > > >>>>>>>>>>>>>> expressible in the > > > >>>>>>>>>>>>>>>>>>>>>>>> TableSourceFactory which > > > >>> properties are > > > >>>>>>>> supported > > > >>>>>>>>>>>> for > > > >>>>>>>>>>>>>> hinting. Or > > > >>>>>>>>>>>>>>>>>>>> do > > > >>>>>>>>>>>>>>>>>>>>>> you > > > >>>>>>>>>>>>>>>>>>>>>>>> plan to offer those hints in a > > > >>> separate > > > >>>>>>>>>>> Map<String, > > > >>>>>>>>>>>>>> String> that > > > >>>>>>>>>>>>>>>>>>>>> cannot > > > >>>>>>>>>>>>>>>>>>>>>>>> overwrite existing properties? > > > >> I > > > >>> think > > > >>>>>>>> this would > > > >>>>>>>>>>>> be > > > >>>>>>>>>>>>> a > > > >>>>>>>>>>>>>> different > > > >>>>>>>>>>>>>>>>>>>>>> story... > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> Regards, > > > >>>>>>>>>>>>>>>>>>>>>>>> Timo > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> On 10.03.20 10:34, Danny Chan > > > >>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks Timo ~ > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> Personally I would say that > > > >>> offset > > > > >>>>>> 0 > > > >>>>>>>> and > > > >>>>>>>>>>> start > > > >>>>>>>>>>>>>> offset = 10 does > > > >>>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>>>>> have the same semantic, so from > > > >>> the SQL > > > >>>>>>>> aspect, > > > >>>>>>>>>>> we > > > >>>>>>>>>>>>> can > > > >>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>> implement > > > >>>>>>>>>>>>>>>>>>>>> a > > > >>>>>>>>>>>>>>>>>>>>>>>> “starting offset” hint for > > > >> query > > > >>> with > > > >>>>>> such > > > >>>>>>>> a > > > >>>>>>>>>>>> syntax. > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> And the CREATE TABLE LIKE > > > >>> syntax is a > > > >>>>>>>> DDL which > > > >>>>>>>>>>>> is > > > >>>>>>>>>>>>>> just verbose > > > >>>>>>>>>>>>>>>>>>>> for > > > >>>>>>>>>>>>>>>>>>>>>>>> defining such dynamic > > > >> parameters > > > >>> even > > > >>>>>> if > > > >>>>>>>> it could > > > >>>>>>>>>>>> do > > > >>>>>>>>>>>>>> that, shall we > > > >>>>>>>>>>>>>>>>>>>>>> force > > > >>>>>>>>>>>>>>>>>>>>>>>> users to define a temporal > > > >> table > > > >>> for > > > >>>>>> each > > > >>>>>>>> query > > > >>>>>>>>>>>> with > > > >>>>>>>>>>>>>> dynamic > > > >>>>>>>>>>>>>>>>>>>> params, > > > >>>>>>>>>>>>>>>>>>>>> I > > > >>>>>>>>>>>>>>>>>>>>>>>> would say it’s an awkward > > > >>> solution. > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> "Hints should give "hints" > > > >> but > > > >>> not > > > >>>>>>>> affect the > > > >>>>>>>>>>>>> actual > > > >>>>>>>>>>>>>> produced > > > >>>>>>>>>>>>>>>>>>>>>> result.” > > > >>>>>>>>>>>>>>>>>>>>>>>> You mentioned that multiple > > > >>> times and > > > >>>>>>>> could we > > > >>>>>>>>>>>> give a > > > >>>>>>>>>>>>>> reason, > > > >>>>>>>>>>>>>>>>>>>> what’s > > > >>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>> problem there if we user the > > > >>> table > > > >>>>>> hints to > > > >>>>>>>>>>> support > > > >>>>>>>>>>>>>> “start offset” > > > >>>>>>>>>>>>>>>>>>>> ? > > > >>>>>>>>>>>>>>>>>>>>>> From > > > >>>>>>>>>>>>>>>>>>>>>>>> my side I saw some benefits for > > > >>> that: > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> • It’s very convent to set up > > > >>> these > > > >>>>>>>> parameters, > > > >>>>>>>>>>>> the > > > >>>>>>>>>>>>>> syntax is > > > >>>>>>>>>>>>>>>>>>>> very > > > >>>>>>>>>>>>>>>>>>>>>> much > > > >>>>>>>>>>>>>>>>>>>>>>>> like the DDL definition > > > >>>>>>>>>>>>>>>>>>>>>>>>> • It’s scope is very clear, > > > >>> right on > > > >>>>>> the > > > >>>>>>>> table > > > >>>>>>>>>>> it > > > >>>>>>>>>>>>>> attathed > > > >>>>>>>>>>>>>>>>>>>>>>>>> • It does not affect the > > > >> table > > > >>>>>> schema, > > > >>>>>>>> which > > > >>>>>>>>>>>> means > > > >>>>>>>>>>>>>> in order to > > > >>>>>>>>>>>>>>>>>>>>>> specify > > > >>>>>>>>>>>>>>>>>>>>>>>> the offset, there is no need to > > > >>> define > > > >>>>>> an > > > >>>>>>>> offset > > > >>>>>>>>>>>>>> column which is > > > >>>>>>>>>>>>>>>>>>>>> weird > > > >>>>>>>>>>>>>>>>>>>>>>>> actually, offset should never > > > >> be > > > >>> a > > > >>>>>> column, > > > >>>>>>>> it’s > > > >>>>>>>>>>>> more > > > >>>>>>>>>>>>>> like a > > > >>>>>>>>>>>>>>>>>>>> metadata > > > >>>>>>>>>>>>>>>>>>>>>> or a > > > >>>>>>>>>>>>>>>>>>>>>>>> start option. > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> So in total, FLIP-110 uses > > > >> the > > > >>> offset > > > >>>>>>>> more > > > >>>>>>>>>>> like a > > > >>>>>>>>>>>>>> Hive partition > > > >>>>>>>>>>>>>>>>>>>>>> prune, > > > >>>>>>>>>>>>>>>>>>>>>>>> we can do that if we have an > > > >>> offset > > > >>>>>>>> column, but > > > >>>>>>>>>>>> most > > > >>>>>>>>>>>>>> of the case we > > > >>>>>>>>>>>>>>>>>>>>> do > > > >>>>>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>>>>> define that, so there is > > > >>> actually no > > > >>>>>>>> conflict or > > > >>>>>>>>>>>>>> overlap. > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月10日 +0800 > > > >> PM4:28,Timo > > > >>>>>> Walther < > > > >>>>>>>>>>>>>> twal...@apache.org>,写道: > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> shouldn't FLIP-110[1] solve > > > >>> most > > > >>>>>> of the > > > >>>>>>>>>>>> problems > > > >>>>>>>>>>>>>> we have around > > > >>>>>>>>>>>>>>>>>>>>>>> defining > > > >>>>>>>>>>>>>>>>>>>>>>>>>> table properties more > > > >>> dynamically > > > >>>>>>>> without > > > >>>>>>>>>>>> manual > > > >>>>>>>>>>>>>> schema work? > > > >>>>>>>>>>>>>>>>>>>> Also > > > >>>>>>>>>>>>>>>>>>>>>>>>>> offset definition is easier > > > >>> with > > > >>>>>> such a > > > >>>>>>>>>>> syntax. > > > >>>>>>>>>>>>>> They must not be > > > >>>>>>>>>>>>>>>>>>>>>>> defined > > > >>>>>>>>>>>>>>>>>>>>>>>>>> in catalog but could be > > > >>> temporary > > > >>>>>>>> tables that > > > >>>>>>>>>>>>>> extend from the > > > >>>>>>>>>>>>>>>>>>>>>> original > > > >>>>>>>>>>>>>>>>>>>>>>>>>> table. > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> In general, we should aim > > > >> to > > > >>> keep > > > >>>>>> the > > > >>>>>>>> syntax > > > >>>>>>>>>>>>>> concise and don't > > > >>>>>>>>>>>>>>>>>>>>>> provide > > > >>>>>>>>>>>>>>>>>>>>>>>>>> too many ways of doing the > > > >>> same > > > >>>>>> thing. > > > >>>>>>>> Hints > > > >>>>>>>>>>>>>> should give "hints" > > > >>>>>>>>>>>>>>>>>>>>> but > > > >>>>>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>>>>>>> affect the actual produced > > > >>> result. > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Some connector properties > > > >>> might > > > >>>>>> also > > > >>>>>>>> change > > > >>>>>>>>>>> the > > > >>>>>>>>>>>>>> plan or schema > > > >>>>>>>>>>>>>>>>>>>> in > > > >>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>>>> future. E.g. they might > > > >> also > > > >>> define > > > >>>>>>>> whether a > > > >>>>>>>>>>>>>> table source > > > >>>>>>>>>>>>>>>>>>>>> supports > > > >>>>>>>>>>>>>>>>>>>>>>>>>> certain push-downs (e.g. > > > >>> predicate > > > >>>>>>>>>>> push-down). > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Dawid is currently working > > > >> a > > > >>> draft > > > >>>>>>>> that might > > > >>>>>>>>>>>>>> makes it possible > > > >>>>>>>>>>>>>>>>>>>> to > > > >>>>>>>>>>>>>>>>>>>>>>>>>> expose a Kafka offset via > > > >> the > > > >>>>>> schema > > > >>>>>>>> such > > > >>>>>>>>>>> that > > > >>>>>>>>>>>>>> `SELECT * FROM > > > >>>>>>>>>>>>>>>>>>>>> Topic > > > >>>>>>>>>>>>>>>>>>>>>>>>>> WHERE offset > 10` would > > > >>> become > > > >>>>>>>> possible and > > > >>>>>>>>>>>>> could > > > >>>>>>>>>>>>>> be pushed > > > >>>>>>>>>>>>>>>>>>>> down. > > > >>>>>>>>>>>>>>>>>>>>>> But > > > >>>>>>>>>>>>>>>>>>>>>>>>>> this is of course, not > > > >>> planned > > > >>>>>>>> initially. > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Regards, > > > >>>>>>>>>>>>>>>>>>>>>>>>>> Timo > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> [1] > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>> > > > >>>>>> > > > >>> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> On 10.03.20 08:34, Danny > > > >> Chan > > > >>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Wenlong ~ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> For PROPERTIES Hint Error > > > >>>>>> handling > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Actually we have no way > > > >> to > > > >>>>>> figure out > > > >>>>>>>>>>>> whether a > > > >>>>>>>>>>>>>> error prone > > > >>>>>>>>>>>>>>>>>>>> hint > > > >>>>>>>>>>>>>>>>>>>>>> is a > > > >>>>>>>>>>>>>>>>>>>>>>>> PROPERTIES hint, for example, > > > >> if > > > >>> use > > > >>>>>>>> writes a > > > >>>>>>>>>>> hint > > > >>>>>>>>>>>>> like > > > >>>>>>>>>>>>>>>>>>>> ‘PROPERTIAS’, > > > >>>>>>>>>>>>>>>>>>>>>> we > > > >>>>>>>>>>>>>>>>>>>>>>> do > > > >>>>>>>>>>>>>>>>>>>>>>>> not know if this hint is a > > > >>> PROPERTIES > > > >>>>>>>> hint, what > > > >>>>>>>>>>> we > > > >>>>>>>>>>>>>> know is that > > > >>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>> hint > > > >>>>>>>>>>>>>>>>>>>>>>>> name was not registered in our > > > >>> Flink. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> If the user writes the > > > >>> hint name > > > >>>>>>>> correctly > > > >>>>>>>>>>>>> (i.e. > > > >>>>>>>>>>>>>> PROPERTIES), > > > >>>>>>>>>>>>>>>>>>>> we > > > >>>>>>>>>>>>>>>>>>>>>> did > > > >>>>>>>>>>>>>>>>>>>>>>>> can enforce the validation of > > > >>> the hint > > > >>>>>>>> options > > > >>>>>>>>>>>> though > > > >>>>>>>>>>>>>> the pluggable > > > >>>>>>>>>>>>>>>>>>>>>>>> HintOptionChecker. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> For PROPERTIES Hint > > > >> Option > > > >>> Format > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> For a key value style > > > >> hint > > > >>>>>> option, > > > >>>>>>>> the key > > > >>>>>>>>>>>> can > > > >>>>>>>>>>>>>> be either a > > > >>>>>>>>>>>>>>>>>>>> simple > > > >>>>>>>>>>>>>>>>>>>>>>>> identifier or a string literal, > > > >>> which > > > >>>>>>>> means that > > > >>>>>>>>>>>> it’s > > > >>>>>>>>>>>>>> compatible > > > >>>>>>>>>>>>>>>>>>>> with > > > >>>>>>>>>>>>>>>>>>>>>> our > > > >>>>>>>>>>>>>>>>>>>>>>>> DDL syntax. We support simple > > > >>>>>> identifier > > > >>>>>>>> because > > > >>>>>>>>>>>> many > > > >>>>>>>>>>>>>> other hints > > > >>>>>>>>>>>>>>>>>>>> do > > > >>>>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>>>>> have the component complex keys > > > >>> like > > > >>>>>> the > > > >>>>>>>> table > > > >>>>>>>>>>>>>> properties, and we > > > >>>>>>>>>>>>>>>>>>>>> want > > > >>>>>>>>>>>>>>>>>>>>>> to > > > >>>>>>>>>>>>>>>>>>>>>>>> unify the parse block. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月10日 +0800 > > > >>>>>>>> PM3:19,wenlong.lwl < > > > >>>>>>>>>>>>>> wenlong88....@gmail.com > > > >>>>>>>>>>>>>>>>>>>>>>> ,写道: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, thanks for > > > >> the > > > >>>>>> proposal. > > > >>>>>>>> +1 for > > > >>>>>>>>>>>>>> adding table hints, > > > >>>>>>>>>>>>>>>>>>>> it > > > >>>>>>>>>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>>>>>>>>> really > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> a necessary feature for > > > >>> flink > > > >>>>>> sql > > > >>>>>>>> to > > > >>>>>>>>>>>>> integrate > > > >>>>>>>>>>>>>> with a catalog. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> For error handling, I > > > >>> think it > > > >>>>>>>> would be > > > >>>>>>>>>>>> more > > > >>>>>>>>>>>>>> natural to throw > > > >>>>>>>>>>>>>>>>>>>> an > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> exception when error > > > >>> table hint > > > >>>>>>>> provided, > > > >>>>>>>>>>>>>> because the > > > >>>>>>>>>>>>>>>>>>>> properties > > > >>>>>>>>>>>>>>>>>>>>>> in > > > >>>>>>>>>>>>>>>>>>>>>>>> hint > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> will be merged and used > > > >>> to find > > > >>>>>>>> the table > > > >>>>>>>>>>>>>> factory which would > > > >>>>>>>>>>>>>>>>>>>>>> cause > > > >>>>>>>>>>>>>>>>>>>>>>> an > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> exception when error > > > >>> properties > > > >>>>>>>> provided, > > > >>>>>>>>>>>>>> right? On the other > > > >>>>>>>>>>>>>>>>>>>>>> hand, > > > >>>>>>>>>>>>>>>>>>>>>>>> unlike > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> other hints which just > > > >>> affect > > > >>>>>> the > > > >>>>>>>> way to > > > >>>>>>>>>>>>>> execute the query, > > > >>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>> property > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> table hint actually > > > >>> affects the > > > >>>>>>>> result of > > > >>>>>>>>>>>> the > > > >>>>>>>>>>>>>> query, we should > > > >>>>>>>>>>>>>>>>>>>>>> never > > > >>>>>>>>>>>>>>>>>>>>>>>> ignore > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> the given property > > > >> hints. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> For the format of > > > >>> property > > > >>>>>> hints, > > > >>>>>>>>>>>> currently, > > > >>>>>>>>>>>>>> in sql client, we > > > >>>>>>>>>>>>>>>>>>>>>>> accept > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> properties in format of > > > >>> string > > > >>>>>>>> only in > > > >>>>>>>>>>> DDL: > > > >>>>>>>>>>>>>>>>>>>>>>> 'connector.type'='kafka', > > > >>>>>>>>>>>>>>>>>>>>>>>> I > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> think the format of > > > >>> properties > > > >>>>>> in > > > >>>>>>>> hint > > > >>>>>>>>>>>> should > > > >>>>>>>>>>>>>> be the same as > > > >>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>> format we > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> defined in ddl. What do > > > >>> you > > > >>>>>> think? > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Bests, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Wenlong Lyu > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 10 Mar 2020 at > > > >>> 14:22, > > > >>>>>>>> Danny Chan > > > >>>>>>>>>>> < > > > >>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com> > > > >>>>>>>>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Weike: About the > > > >>> Error > > > >>>>>> Handing > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To be consistent with > > > >>> other > > > >>>>>> SQL > > > >>>>>>>>>>> vendors, > > > >>>>>>>>>>>>> the > > > >>>>>>>>>>>>>> default is to > > > >>>>>>>>>>>>>>>>>>>> log > > > >>>>>>>>>>>>>>>>>>>>>>>> warnings > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> and if there is any > > > >>> error > > > >>>>>>>> (invalid hint > > > >>>>>>>>>>>>> name > > > >>>>>>>>>>>>>> or options), the > > > >>>>>>>>>>>>>>>>>>>>>> hint > > > >>>>>>>>>>>>>>>>>>>>>>>> is just > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ignored. I have > > > >> already > > > >>>>>>>> addressed in > > > >>>>>>>>>>> the > > > >>>>>>>>>>>>>> wiki. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Timo: About the > > > >>> PROPERTIES > > > >>>>>>>> Table > > > >>>>>>>>>>> Hint > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> • The properties > > > >> hints > > > >>> is > > > >>>>>> also > > > >>>>>>>>>>> optional, > > > >>>>>>>>>>>>>> user can pass in an > > > >>>>>>>>>>>>>>>>>>>>>> option > > > >>>>>>>>>>>>>>>>>>>>>>>> to > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> override the table > > > >>> properties > > > >>>>>>>> but this > > > >>>>>>>>>>>> does > > > >>>>>>>>>>>>>> not mean it is > > > >>>>>>>>>>>>>>>>>>>>>>> required. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> • They should not > > > >>> include > > > >>>>>>>> semantics: > > > >>>>>>>>>>> does > > > >>>>>>>>>>>>>> the properties > > > >>>>>>>>>>>>>>>>>>>> belong > > > >>>>>>>>>>>>>>>>>>>>>> to > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> semantic ? I don't > > > >>> think so, > > > >>>>>> the > > > >>>>>>>> plan > > > >>>>>>>>>>>> does > > > >>>>>>>>>>>>>> not change right ? > > > >>>>>>>>>>>>>>>>>>>>> The > > > >>>>>>>>>>>>>>>>>>>>>>>> result > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> set may be affected, > > > >>> but > > > >>>>>> there > > > >>>>>>>> are > > > >>>>>>>>>>>> already > > > >>>>>>>>>>>>>> some hints do so, > > > >>>>>>>>>>>>>>>>>>>>> for > > > >>>>>>>>>>>>>>>>>>>>>>>> example, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> MS-SQL MAXRECURSION > > > >> and > > > >>>>>> SNAPSHOT > > > >>>>>>>> hint > > > >>>>>>>>>>> [1] > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> • `SELECT * FROM > > > >> t(k=v, > > > >>>>>> k=v)`: > > > >>>>>>>> this > > > >>>>>>>>>>>> grammar > > > >>>>>>>>>>>>>> breaks the SQL > > > >>>>>>>>>>>>>>>>>>>>>> standard > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> compared to the hints > > > >>>>>> way(which > > > >>>>>>>> is > > > >>>>>>>>>>>> included > > > >>>>>>>>>>>>>> in comments) > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> • I actually didn't > > > >>> found any > > > >>>>>>>> vendors > > > >>>>>>>>>>> to > > > >>>>>>>>>>>>>> support such > > > >>>>>>>>>>>>>>>>>>>> grammar, > > > >>>>>>>>>>>>>>>>>>>>>> and > > > >>>>>>>>>>>>>>>>>>>>>>>> there > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> is no way to override > > > >>> table > > > >>>>>> level > > > >>>>>>>>>>>>> properties > > > >>>>>>>>>>>>>> dynamically. For > > > >>>>>>>>>>>>>>>>>>>>>>> normal > > > >>>>>>>>>>>>>>>>>>>>>>>> RDBMS, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are no > > > >>> requests > > > >>>>>>>> for such > > > >>>>>>>>>>>>>> dynamic parameters > > > >>>>>>>>>>>>>>>>>>>>> because > > > >>>>>>>>>>>>>>>>>>>>>>>> all the > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> table have the same > > > >>> storage > > > >>>>>> and > > > >>>>>>>>>>>> computation > > > >>>>>>>>>>>>>> and they are > > > >>>>>>>>>>>>>>>>>>>> almost > > > >>>>>>>>>>>>>>>>>>>>>> all > > > >>>>>>>>>>>>>>>>>>>>>>>> batch > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> tables. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> • While Flink as a > > > >>>>>> computation > > > >>>>>>>> engine > > > >>>>>>>>>>> has > > > >>>>>>>>>>>>>> many connectors, > > > >>>>>>>>>>>>>>>>>>>>>>>> especially for > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> some message queue > > > >> like > > > >>>>>> Kafka, > > > >>>>>>>> we would > > > >>>>>>>>>>>>> have > > > >>>>>>>>>>>>>> a start_offset > > > >>>>>>>>>>>>>>>>>>>>> which > > > >>>>>>>>>>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> different each time > > > >> we > > > >>> start > > > >>>>>> the > > > >>>>>>>> query, > > > >>>>>>>>>>>>> such > > > >>>>>>>>>>>>>> parameters can > > > >>>>>>>>>>>>>>>>>>>> not > > > >>>>>>>>>>>>>>>>>>>>>> be > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> persisted to catalog, > > > >>> because > > > >>>>>>>> it’s not > > > >>>>>>>>>>>>>> static, this is > > > >>>>>>>>>>>>>>>>>>>> actually > > > >>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> background we propose > > > >>> the > > > >>>>>> table > > > >>>>>>>> hints > > > >>>>>>>>>>> to > > > >>>>>>>>>>>>>> indicate such > > > >>>>>>>>>>>>>>>>>>>>> properties > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> dynamically. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Jark and Jinsong: > > > >> I > > > >>> have > > > >>>>>>>> removed the > > > >>>>>>>>>>>>>> query hints part and > > > >>>>>>>>>>>>>>>>>>>>>> change > > > >>>>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> title. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>> > > > >>>>>> > > > >>> > > > >> > > > > > > https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15 > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月9日 +0800 > > > >>> PM5:46,Timo > > > >>>>>>>> Walther < > > > >>>>>>>>>>>>>> twal...@apache.org > > > >>>>>>>>>>>>>>>>>>>>> ,写道: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for the > > > >>> proposal. I > > > >>>>>>>> agree with > > > >>>>>>>>>>>>> Jark > > > >>>>>>>>>>>>>> and Jingsong. > > > >>>>>>>>>>>>>>>>>>>>> Planner > > > >>>>>>>>>>>>>>>>>>>>>>>> hints > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and table hints are > > > >>>>>> orthogonal > > > >>>>>>>> topics > > > >>>>>>>>>>>>> that > > > >>>>>>>>>>>>>> should be > > > >>>>>>>>>>>>>>>>>>>> discussed > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> separately. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I share Jingsong's > > > >>> opinion > > > >>>>>>>> that we > > > >>>>>>>>>>>> should > > > >>>>>>>>>>>>>> not use planner > > > >>>>>>>>>>>>>>>>>>>>> hints > > > >>>>>>>>>>>>>>>>>>>>>>> for > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passing connector > > > >>>>>> properties. > > > >>>>>>>> Planner > > > >>>>>>>>>>>>>> hints should be > > > >>>>>>>>>>>>>>>>>>>> optional > > > >>>>>>>>>>>>>>>>>>>>>> at > > > >>>>>>>>>>>>>>>>>>>>>>>> any > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> time. They should > > > >> not > > > >>>>>> include > > > >>>>>>>>>>> semantics > > > >>>>>>>>>>>>>> but only affect > > > >>>>>>>>>>>>>>>>>>>>>> execution > > > >>>>>>>>>>>>>>>>>>>>>>>> time. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Connector > > > >> properties > > > >>> are an > > > >>>>>>>> important > > > >>>>>>>>>>>>> part > > > >>>>>>>>>>>>>> of the query > > > >>>>>>>>>>>>>>>>>>>>> itself. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Have you thought > > > >>> about > > > >>>>>> options > > > >>>>>>>> such > > > >>>>>>>>>>> as > > > >>>>>>>>>>>>>> `SELECT * FROM t(k=v, > > > >>>>>>>>>>>>>>>>>>>>>>> k=v)`? > > > >>>>>>>>>>>>>>>>>>>>>>>> How > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are other vendors > > > >>> deal with > > > >>>>>>>> this > > > >>>>>>>>>>>> problem? > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Timo > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 09.03.20 10:37, > > > >>>>>> Jingsong Li > > > >>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, +1 for > > > >>> table > > > >>>>>> hints, > > > >>>>>>>>>>> thanks > > > >>>>>>>>>>>>> for > > > >>>>>>>>>>>>>> driving. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I took a look to > > > >>> FLIP, > > > >>>>>> most > > > >>>>>>>> of > > > >>>>>>>>>>>> content > > > >>>>>>>>>>>>>> are talking about > > > >>>>>>>>>>>>>>>>>>>>> query > > > >>>>>>>>>>>>>>>>>>>>>>>> hints. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> It is > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hard to > > > >> discussion > > > >>> and > > > >>>>>>>> voting. So > > > >>>>>>>>>>> +1 > > > >>>>>>>>>>>> to > > > >>>>>>>>>>>>>> split it as Jark > > > >>>>>>>>>>>>>>>>>>>>> said. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Another thing is > > > >>>>>>>> configuration that > > > >>>>>>>>>>>>>> suitable to config with > > > >>>>>>>>>>>>>>>>>>>>>> table > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "connector.path" > > > >>> and > > > >>>>>>>>>>>> "connector.topic", > > > >>>>>>>>>>>>>> Are they really > > > >>>>>>>>>>>>>>>>>>>>>> suitable > > > >>>>>>>>>>>>>>>>>>>>>>>> for > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> table > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints? Looks > > > >> weird > > > >>> to me. > > > >>>>>>>> Because I > > > >>>>>>>>>>>>>> think these properties > > > >>>>>>>>>>>>>>>>>>>>> are > > > >>>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> core of > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jingsong Lee > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 9, > > > >>> 2020 at > > > >>>>>> 5:30 > > > >>>>>>>> PM Jark > > > >>>>>>>>>>>> Wu > > > >>>>>>>>>>>>> < > > > >>>>>>>>>>>>>> imj...@gmail.com> > > > >>>>>>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Danny > > > >> for > > > >>>>>> starting > > > >>>>>>>> the > > > >>>>>>>>>>>>>> discussion. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1 for this > > > >>> feature. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we just > > > >> focus > > > >>> on the > > > >>>>>>>> table > > > >>>>>>>>>>> hints > > > >>>>>>>>>>>>>> not the query hints in > > > >>>>>>>>>>>>>>>>>>>>>> this > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> release, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could you split > > > >>> the > > > >>>>>> FLIP > > > >>>>>>>> into two > > > >>>>>>>>>>>>>> FLIPs? > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because it's > > > >>> hard to > > > >>>>>> vote > > > >>>>>>>> on > > > >>>>>>>>>>>> partial > > > >>>>>>>>>>>>>> part of a FLIP. You > > > >>>>>>>>>>>>>>>>>>>> can > > > >>>>>>>>>>>>>>>>>>>>>>> keep > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the table > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints proposal > > > >> in > > > >>>>>> FLIP-113 > > > >>>>>>>> and > > > >>>>>>>>>>> move > > > >>>>>>>>>>>>>> query hints into > > > >>>>>>>>>>>>>>>>>>>> another > > > >>>>>>>>>>>>>>>>>>>>>>> FLIP. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So that we can > > > >>> focuse > > > >>>>>> on > > > >>>>>>>> the > > > >>>>>>>>>>> table > > > >>>>>>>>>>>>>> hints in the FLIP. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, 9 Mar > > > >>> 2020 at > > > >>>>>>>> 17:14, > > > >>>>>>>>>>> DONG, > > > >>>>>>>>>>>>>> Weike < > > > >>>>>>>>>>>>>>>>>>>>>>> kyled...@connect.hku.hk > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is a > > > >> nice > > > >>>>>> feature, > > > >>>>>>>> +1. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> One thing I > > > >> am > > > >>>>>>>> interested in > > > >>>>>>>>>>> but > > > >>>>>>>>>>>>> not > > > >>>>>>>>>>>>>> mentioned in the > > > >>>>>>>>>>>>>>>>>>>>>> proposal > > > >>>>>>>>>>>>>>>>>>>>>>> is > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> handling, as > > > >>> it is > > > >>>>>> quite > > > >>>>>>>> common > > > >>>>>>>>>>>> for > > > >>>>>>>>>>>>>> users to write > > > >>>>>>>>>>>>>>>>>>>>>>> inappropriate > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints in > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SQL code, if > > > >>> illegal > > > >>>>>> or > > > >>>>>>>> "bad" > > > >>>>>>>>>>>> hints > > > >>>>>>>>>>>>>> are given, would the > > > >>>>>>>>>>>>>>>>>>>>>> system > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> simply > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ignore them > > > >> or > > > >>> throw > > > >>>>>>>>>>> exceptions? > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks : ) > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Weike > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar > > > >> 9, > > > >>> 2020 > > > >>>>>> at > > > >>>>>>>> 5:02 PM > > > >>>>>>>>>>>>> Danny > > > >>>>>>>>>>>>>> Chan < > > > >>>>>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Note: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we only > > > >> plan > > > >>> to > > > >>>>>>>> support table > > > >>>>>>>>>>>>>> hints in Flink release > > > >>>>>>>>>>>>>>>>>>>> 1.11, > > > >>>>>>>>>>>>>>>>>>>>>> so > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> please > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> focus > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mainly on > > > >>> the table > > > >>>>>>>> hints > > > >>>>>>>>>>> part > > > >>>>>>>>>>>>> and > > > >>>>>>>>>>>>>> just ignore the > > > >>>>>>>>>>>>>>>>>>>> planner > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints, sorry > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that > > > >> mistake > > > >>> ~ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月9日 > > > >>> +0800 > > > >>>>>>>>>>> PM4:36,Danny > > > >>>>>>>>>>>>>> Chan < > > > >>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com > > > >>>>>>>>>>>>>>>>>>>>>>>> ,写道: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi, > > > >>> fellows ~ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would > > > >>> like to > > > >>>>>>>> propose the > > > >>>>>>>>>>>>>> supports for SQL hints for > > > >>>>>>>>>>>>>>>>>>>>> our > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Flink SQL. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We would > > > >>> support > > > >>>>>>>> hints > > > >>>>>>>>>>> syntax > > > >>>>>>>>>>>>> as > > > >>>>>>>>>>>>>> following: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> select > > > >> /*+ > > > >>>>>>>> NO_HASH_JOIN, > > > >>>>>>>>>>>>>> RESOURCE(mem='128mb', > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> parallelism='24') */ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> emp /*+ > > > >>>>>> INDEX(idx1, > > > >>>>>>>> idx2) > > > >>>>>>>>>>> */ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> join > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dept /*+ > > > >>>>>>>>>>> PROPERTIES(k1='v1', > > > >>>>>>>>>>>>>> k2='v2') */ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >> emp.deptno > > > >>> = > > > >>>>>>>> dept.deptno > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Basically > > > >>> we > > > >>>>>> would > > > >>>>>>>> support > > > >>>>>>>>>>>> both > > > >>>>>>>>>>>>>> query hints(after the > > > >>>>>>>>>>>>>>>>>>>>>> SELECT > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> keyword) > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and table > > > >>>>>> hints(after > > > >>>>>>>> the > > > >>>>>>>>>>>>>> referenced table name), for > > > >>>>>>>>>>>>>>>>>>>>> 1.11, > > > >>>>>>>>>>>>>>>>>>>>>> we > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> plan to > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> only > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> support > > > >>> table hints > > > >>>>>>>> with a > > > >>>>>>>>>>> hint > > > >>>>>>>>>>>>>> probably named > > > >>>>>>>>>>>>>>>>>>>> PROPERTIES: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >> table_name > > > >>> /*+ > > > >>>>>>>>>>>>>> PROPERTIES(k1='v1', k2='v2') *+/ > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am > > > >>> looking > > > >>>>>> forward > > > >>>>>>>> to > > > >>>>>>>>>>> your > > > >>>>>>>>>>>>>> comments. > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> You can > > > >>> access > > > >>>>>> the > > > >>>>>>>> FLIP > > > >>>>>>>>>>> here: > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>> > > > >>>>>> > > > >>> > > > >> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+SQL+and+Planner+Hints > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny > > > >> Chan > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>>> > > > >>>>>>>>>>>>>> > > > >>>>>>>>>>>>> > > > >>>>>>>>>>>> > > > >>>>>>>>>>> > > > >>>>>>>>>> > > > >>>>>>>>> > > > >>>>>>>> > > > >>>>>> > > > >>>>> > > > >>>>> > > > >>>> > > > >>> > > > >> > > > > > > > > > > > > >