Thanks everyone for the feedback ~

- For the global config option belongs to `ExecutionConfigOptions` or
`OptimizerConfigOptions`, i have to strong objections, switch
to `OptimizerConfigOptions` is okey to me and i have updated the WIKI
- For use while-list or black-list, i have opinion with Timo, so black-list

I would fire a Vote if there are no other objections soon, thanks ~

Timo Walther <twal...@apache.org> 于2020年3月26日周四 下午6:31写道:

> Hi everyone,
>
> it is not only about security concerns. Hint options should be
> well-defined. We had a couple of people that were concerned about
> changing the semantics with a concept that is called "hint". These
> options are more like "debugging options" while someone is developing a
> connector or using a notebook to quickly produce some rows.
>
> The final pipeline should use a temporary table instead. I suggest to
> use a whitelist and force people to think about what should be exposed
> as a hint. By default, no option should be exposed. It is better to be
> conservative here.
>
> Regards,
> Timo
>
>
> On 26.03.20 10:31, Danny Chan wrote:
> > Thanks Kurt for the suggestion ~
> >
> > In my opinion:
> > - There is no need for TableFormatFactory#supportedHintOptions because
> all
> > the format options can be configured dynamically, they have no security
> > issues
> > - Dynamic table options is not an optimization, it is more like an
> > execution behavior from my side
> >
> > Kurt Young <ykt...@gmail.com> 于2020年3月26日周四 下午4:47写道:
> >
> >> Hi Danny,
> >>
> >> Thanks for the updates. I have 2 comments regarding to latest document:
> >>
> >> 1) I think we also need `*supportedHintOptions*` for
> >> `*TableFormatFactory*`
> >> 2) IMO "dynamic-table-options.enabled" should belong to `
> >> *OptimizerConfigOptions*`
> >>
> >> Best,
> >> Kurt
> >>
> >>
> >> On Thu, Mar 26, 2020 at 4:40 PM Timo Walther <twal...@apache.org>
> wrote:
> >>
> >>> Thanks for the update Danny. +1 for this proposal.
> >>>
> >>> Regards,
> >>> Timo
> >>>
> >>> On 26.03.20 04:51, Danny Chan wrote:
> >>>> Thanks everyone who engaged in this discussion ~
> >>>>
> >>>> Our goal is "Supports Dynamic Table Options for Flink SQL". After an
> >>>> offline discussion with Kurt, Timo and Dawid, we have made the final
> >>>> conclusion, here is the summary:
> >>>>
> >>>>
> >>>>      - Use comment style syntax to specify the dynamic table options:
> >> "/*+
> >>>>      *OPTIONS*(k1='v1', k2='v2') */"
> >>>>      - Have constraint on the options keys: the options that may bring
> >> in
> >>>>      security problems should not be allowed, i.e. Kafka connector
> >>> zookeeper
> >>>>      endpoint URL and topic name
> >>>>      - Use white-list to control the allowed options for each
> connector,
> >>>>      which is more safe for future extention
> >>>>      - We allow to enable/disable this feature globally
> >>>>      - Implement based on the current code base first, and when
> FLIP-95
> >> is
> >>>>      checked in, implement this feature based on new interface
> >>>>
> >>>> Any suggestions are appreciated ~
> >>>>
> >>>> [1]
> >>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+Supports+Dynamic+Table+Options+for+Flink+SQL
> >>>>
> >>>> Best,
> >>>> Danny Chan
> >>>>
> >>>> Jark Wu <imj...@gmail.com> 于2020年3月18日周三 下午10:38写道:
> >>>>
> >>>>> Hi everyone,
> >>>>>
> >>>>> Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid
> >> it
> >>>>> doesn't solve the problems but increases some development and
> learning
> >>>>> burdens.
> >>>>>
> >>>>> # increase development and learning burden
> >>>>>
> >>>>> According to the discussion so far, we want to support overriding a
> >>> subset
> >>>>> of options in hints which doesn't affect semantics.
> >>>>> With the `supportedHintOptions`, it's up to the connector developers
> >> to
> >>>>> decide which options will not affect semantics, and to be hint
> >> options.
> >>>>> However, the question is how to distinguish whether an option will
> >>> *affect
> >>>>> semantics*? What happens if an option will affect semantics but
> >>> provided as
> >>>>> hint options?
> >>>>>   From my point of view, it's not easy to distinguish. For example,
> the
> >>>>> "format.ignore-parse-error" can be a very useful dynamic option but
> >> that
> >>>>> will affect semantic, because the result is different (null vs
> >>> exception).
> >>>>> Another example, the "connector.lookup.cache.*" options are also very
> >>>>> useful to tune jobs, however, it will also affect the job results. I
> >> can
> >>>>> come up many more useful options but may affect semantics.
> >>>>>
> >>>>> I can see that the community will under endless discussion around
> "can
> >>> this
> >>>>> option to be a hint option?",  "wether this option will affect
> >>> semantics?".
> >>>>> You can also find that we already have different opinions on
> >>>>> "ignore-parse-error". Those discussion is a waste of time! That's not
> >>> what
> >>>>> users want!
> >>>>> The problem is user need this, this, this options and HOW to expose
> >>> them?
> >>>>> We should focus on that.
> >>>>>
> >>>>> Then there could be two endings in the future:
> >>>>> 1) compromise on the usability, we drop the rule that hints don't
> >> affect
> >>>>> semantics, allow all the useful options in the hints list.
> >>>>> 2) stick on the rule, users will find this is a stumbling feature
> >> which
> >>>>> doesn't solve their problems.
> >>>>>       And they will be surprised why this option can't be set, but
> the
> >>> other
> >>>>> could. *semantic* is hard to be understood by users.
> >>>>>
> >>>>> # doesn't solve the problems
> >>>>>
> >>>>> I think the purpose of this FLIP is to allow users to quickly
> override
> >>> some
> >>>>> connectors' properties to tune their jobs.
> >>>>> However, `supportedHintOptions` is off track. It only allows a subset
> >>>>> options and for the users it's not *clear* which subset is allowed.
> >>>>>
> >>>>> Besides, I'm not sure `supportedHintOptions` can work well for all
> >>> cases.
> >>>>> How could you support kafka properties (`connector.properties.*`) as
> >>> hint
> >>>>> options? Some kafka properties may affect semantics
> >> (bootstrap.servers),
> >>>>> some may not (max.poll.records). Besides, I think it's not possible
> to
> >>> list
> >>>>> all the possible kafka properties [1].
> >>>>>
> >>>>> In summary, IMO, `supportedHintOptions`
> >>>>> (1) it increase the complexity to develop a connector
> >>>>> (2) it confuses users which options can be used in hint, which are
> >> not,
> >>>>> they have to check the docs again and again.
> >>>>> (3) it doesn't solve the problems which we want to solve by this
> FLIP.
> >>>>>
> >>>>> I think we should avoid introducing some partial solutions.
> Otherwise,
> >>> we
> >>>>> will be stuck in a loop that introduce new API -> deprecate API ->
> >>>>> introduce new API....
> >>>>>
> >>>>> I personally in favor of an explicit WITH syntax after the table as a
> >>> part
> >>>>> of the query which is mentioned by Kurt before, e.g. SELECT * from T
> >>>>> WITH('key' = 'value') .
> >>>>> It allows users to dynamically set options which can affect
> semantics.
> >>> It
> >>>>> will be very flexible to solve users' problems so far.
> >>>>>
> >>>>> Best,
> >>>>> Jark
> >>>>>
> >>>>> [1]: https://kafka.apache.org/documentation/#consumerconfigs
> >>>>>
> >>>>> On Wed, 18 Mar 2020 at 21:44, Danny Chan <yuzhao....@gmail.com>
> >> wrote:
> >>>>>
> >>>>>> My POC is here for the hints options merge [1].
> >>>>>>
> >>>>>> Personally, I have no strong objections for splitting hints with the
> >>>>>> CatalogTable, the only cons is a more complex implementation but the
> >>>>>> concept is more clear, and I have updated the WIKI.
> >>>>>>
> >>>>>> I think it would be nice if we can support the format “ignore-parse
> >>>>> error”
> >>>>>> option key, the CSV source already has a key [2] and we can use that
> >> in
> >>>>> the
> >>>>>> supportedHIntOptions, for the common CSV and JSON formats, we cal
> >> also
> >>>>> give
> >>>>>> a support. This is the only kind of key in formats that “do not
> >> change
> >>>>> the
> >>>>>> semantics” (somehow), what do you think about this ~
> >>>>>>
> >>>>>> [1]
> >>>>>>
> >>>>>
> >>>
> >>
> https://github.com/danny0405/flink/commit/5d925fa16c3c553423c4b7d93001521b8e6e6bee#diff-6e569a6dd124fd2091c18e2790fb49c5
> >>>>>> [2]
> >>>>>>
> >>>>>
> >>>
> >>
> https://github.com/apache/flink/blob/b83060dff6d403b6994b6646b3f29a374f599530/flink-table/flink-table-api-java-bridge/src/main/java/org/apache/flink/table/sources/CsvTableSourceFactoryBase.java#L92
> >>>>>>
> >>>>>> Best,
> >>>>>> Danny Chan
> >>>>>> 在 2020年3月18日 +0800 PM9:10,Timo Walther <twal...@apache.org>,写道:
> >>>>>>> Hi everyone,
> >>>>>>>
> >>>>>>> +1 to Kurt's suggestion. Let's just have it in source and sink
> >>>>> factories
> >>>>>>> for now. We can still move this method up in the future. Currently,
> >> I
> >>>>>>> don't see a need for catalogs or formats. Because how would you
> >> target
> >>>>> a
> >>>>>>> format in the query?
> >>>>>>>
> >>>>>>> @Danny: Can you send a link to your PoC? I'm very skeptical about
> >>>>>>> creating a new CatalogTable in planner. Actually CatalogTable
> should
> >>> be
> >>>>>>> immutable between Catalog and Factory. Because a catalog can return
> >>> its
> >>>>>>> own factory and fully control the instantiation. Depending on the
> >>>>>>> implementation, that means it can be possible that the catalog has
> >>>>>>> encoded more information in a concrete subclass implementing the
> >>>>>>> interface. I vote for separating the concerns of catalog
> information
> >>>>> and
> >>>>>>> hints in the factory explicitly.
> >>>>>>>
> >>>>>>> Regards,
> >>>>>>> Timo
> >>>>>>>
> >>>>>>>
> >>>>>>> On 18.03.20 05:41, Jingsong Li wrote:
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> I am thinking we can provide hints to *table* related instances.
> >>>>>>>> - TableFormatFactory: of cause we need hints support, there are
> >> many
> >>>>>> format
> >>>>>>>> options in DDL too.
> >>>>>>>> - catalog and module: I don't know, maybe in future we can provide
> >>>>> some
> >>>>>>>> hints for them.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Jingsong Lee
> >>>>>>>>
> >>>>>>>> On Wed, Mar 18, 2020 at 12:28 PM Danny Chan <yuzhao....@gmail.com
> >
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Yes, I think we should move the `supportedHintOptions` from
> >>>>>> TableFactory
> >>>>>>>>> to TableSourceFactory, and we also need to add the interface to
> >>>>>>>>> TableSinkFactory though because sink target table may also have
> >>>>> hints
> >>>>>>>>> attached.
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Danny Chan
> >>>>>>>>> 在 2020年3月18日 +0800 AM11:08,Kurt Young <ykt...@gmail.com>,写道:
> >>>>>>>>>> Have one question for adding `supportedHintOptions` method to
> >>>>>>>>>> `TableFactory`. It seems
> >>>>>>>>>> `TableFactory` is a base factory interface for all *table
> module*
> >>>>>> related
> >>>>>>>>>> instances, such as
> >>>>>>>>>> catalog, module, format and so on. It's not created only for
> >>>>>> *table*. Is
> >>>>>>>>> it
> >>>>>>>>>> possible to move it
> >>>>>>>>>> to `TableSourceFactory`?
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Kurt
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Mar 18, 2020 at 10:59 AM Danny Chan <
> >>>>> yuzhao....@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Thanks Timo ~
> >>>>>>>>>>>
> >>>>>>>>>>> For the naming itself, I also think the PROPERTIES is not that
> >>>>>>>>> concise, so
> >>>>>>>>>>> +1 for OPTIONS (I had thought about that, but there are many
> >>>>>> codes in
> >>>>>>>>>>> current Flink called it properties, i.e. the
> >>>>>> DescriptorProperties,
> >>>>>>>>>>> #getSupportedProperties), let’s use OPTIONS if this is our new
> >>>>>>>>> preference.
> >>>>>>>>>>>
> >>>>>>>>>>> +1 to `Set<ConfigOption> supportedHintOptions()` because the
> >>>>>>>>> ConfigOption
> >>>>>>>>>>> can take more info. AFAIK, Spark also call their table options
> >>>>>> instead
> >>>>>>>>> of
> >>>>>>>>>>> properties. [1]
> >>>>>>>>>>>
> >>>>>>>>>>> In my local POC, I did create a new CatalogTable, and it works
> >>>>>> for
> >>>>>>>>> current
> >>>>>>>>>>> connectors well, all the DDL tables would finally yield a
> >>>>>> CatalogTable
> >>>>>>>>>>> instance and we can apply the options to that(in the
> >>>>>> CatalogSourceTable
> >>>>>>>>>>> when we generating the TableSource), the pros is that we do not
> >>>>>> need to
> >>>>>>>>>>> modify the codes of connectors itself. If we split the options
> >>>>>> from
> >>>>>>>>>>> CatalogTable, we may need to add some additional logic in each
> >>>>>>>>> connector
> >>>>>>>>>>> factories in order to merge these properties (and the logic are
> >>>>>> almost
> >>>>>>>>> the
> >>>>>>>>>>> same), what do you think about this?
> >>>>>>>>>>>
> >>>>>>>>>>> [1]
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://docs.databricks.com/spark/latest/spark-sql/language-manual/create-table.html
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Danny Chan
> >>>>>>>>>>> 在 2020年3月17日 +0800 PM10:10,Timo Walther <twal...@apache.org
> >>>>>> ,写道:
> >>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>
> >>>>>>>>>>>> thanks for updating the FLIP. I think your current design is
> >>>>>>>>> sufficient
> >>>>>>>>>>>> to separate hints from result-related properties.
> >>>>>>>>>>>>
> >>>>>>>>>>>> One remark to the naming itself: I would vote for calling the
> >>>>>> hints
> >>>>>>>>>>>> around table scan `OPTIONS('k'='v')`. We used the term
> >>>>>> "properties"
> >>>>>>>>> in
> >>>>>>>>>>>> the past but since we want to unify the Flink configuration
> >>>>>>>>> experience,
> >>>>>>>>>>>> we should use consistent naming and classes around
> >>>>>> `ConfigOptions`.
> >>>>>>>>>>>>
> >>>>>>>>>>>> It would be nice to use `Set<ConfigOption>
> >>>>>> supportedHintOptions();`
> >>>>>>>>> to
> >>>>>>>>>>>> start using config options instead of pure string properties.
> >>>>>> This
> >>>>>>>>> will
> >>>>>>>>>>>> also allow us to generate documentation in the future around
> >>>>>>>>> supported
> >>>>>>>>>>>> data types, ranges, etc. for options. At some point we would
> >>>>>> also
> >>>>>>>>> like
> >>>>>>>>>>>> to drop `DescriptorProperties` class. "Options" is also used
> >>>>>> in the
> >>>>>>>>>>>> documentation [1] and in the SQL/MED standard [2].
> >>>>>>>>>>>>
> >>>>>>>>>>>> Furthermore, I would still vote for separating CatalogTable
> >>>>>> and hint
> >>>>>>>>>>>> options. Otherwise the planner would need to create a new
> >>>>>>>>> CatalogTable
> >>>>>>>>>>>> instance which might not always be easy. We should offer them
> >>>>>> via:
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>
> org.apache.flink.table.factories.TableSourceFactory.Context#getHints:
> >>>>>>>>>>>> ReadableConfig
> >>>>>>>>>>>>
> >>>>>>>>>>>> What do you think?
> >>>>>>>>>>>>
> >>>>>>>>>>>> Regards,
> >>>>>>>>>>>> Timo
> >>>>>>>>>>>>
> >>>>>>>>>>>> [1]
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql/create.html#create-table
> >>>>>>>>>>>> [2] https://wiki.postgresql.org/wiki/SQL/MED
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On 12.03.20 15:06, Stephan Ewen wrote:
> >>>>>>>>>>>>> @Danny sounds good.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Maybe it is worth listing all the classes of problems that
> >>>>>> you
> >>>>>>>>> want to
> >>>>>>>>>>>>> address and then look at each class and see if hints are a
> >>>>>> good
> >>>>>>>>> default
> >>>>>>>>>>>>> solution or a good optional way of simplifying things?
> >>>>>>>>>>>>> The discussion has grown a lot and it is starting to be
> >>>>> hard
> >>>>>> to
> >>>>>>>>>>> distinguish
> >>>>>>>>>>>>> the parts where everyone agrees from the parts were there
> >>>>> are
> >>>>>>>>> concerns.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Thu, Mar 12, 2020 at 2:31 PM Danny Chan <
> >>>>>> danny0...@apache.org>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Stephan ~
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> We can remove the support for properties that may change
> >>>>>> the
> >>>>>>>>>>> semantics of
> >>>>>>>>>>>>>> query if you think that is a trouble.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> How about we support the /*+ properties() */ hint only
> >>>>> for
> >>>>>> those
> >>>>>>>>>>> optimize
> >>>>>>>>>>>>>> parameters, such as the fetch size of source or something
> >>>>>> like
> >>>>>>>>> that,
> >>>>>>>>>>> does
> >>>>>>>>>>>>>> that make sense?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Stephan Ewen <se...@apache.org>于2020年3月12日 周四下午7:45写道:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I think Bowen has actually put it very well.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> (1) Hints that change semantics looks like trouble
> >>>>>> waiting to
> >>>>>>>>>>> happen. For
> >>>>>>>>>>>>>>> example Kafka offset handling should be in filters. The
> >>>>>> Kafka
> >>>>>>>>>>> source
> >>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>> support predicate pushdown.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> (2) Hints should not be a workaround for current
> >>>>>> shortcomings.
> >>>>>>>>> A
> >>>>>>>>>>> lot of
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> suggested above sounds exactly like that. Working
> >>>>> around
> >>>>>>>>>>> catalog/DDL
> >>>>>>>>>>>>>>> shortcomings, missing exposure of metadata (offsets),
> >>>>>> missing
> >>>>>>>>>>> predicate
> >>>>>>>>>>>>>>> pushdown in Kafka. Abusing a feature like hints now as
> >>>>> a
> >>>>>> quick
> >>>>>>>>> fix
> >>>>>>>>>>> for
> >>>>>>>>>>>>>>> these issues, rather than fixing the root causes, will
> >>>>>> much
> >>>>>>>>> likely
> >>>>>>>>>>> bite
> >>>>>>>>>>>>>> us
> >>>>>>>>>>>>>>> back badly in the future.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Stephan
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Thu, Mar 12, 2020 at 10:43 AM Kurt Young <
> >>>>>> ykt...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> It seems this FLIP's name is somewhat misleading.
> >>>>> From
> >>>>>> my
> >>>>>>>>>>>>>> understanding,
> >>>>>>>>>>>>>>>> this FLIP is trying to
> >>>>>>>>>>>>>>>> address the dynamic parameter issue, and table hints
> >>>>>> is the
> >>>>>>>>> way
> >>>>>>>>>>> we wan
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>> choose. I think we should
> >>>>>>>>>>>>>>>> be focus on "what's the right way to solve dynamic
> >>>>>> property"
> >>>>>>>>>>> instead of
> >>>>>>>>>>>>>>>> discussing "whether table
> >>>>>>>>>>>>>>>> hints can affect query semantics".
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> For now, there are two proposed ways to achieve
> >>>>> dynamic
> >>>>>>>>> property:
> >>>>>>>>>>>>>>>> 1. FLIP-110: create temporary table xx like xx with
> >>>>>> (xxx)
> >>>>>>>>>>>>>>>> 2. use custom "from t with (xxx)" syntax
> >>>>>>>>>>>>>>>> 3. "Borrow" the table hints to have a special
> >>>>>> PROPERTIES
> >>>>>>>>> hint.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> The first one didn't break anything, but the only
> >>>>>> problem i
> >>>>>>>>> see
> >>>>>>>>>>> is a
> >>>>>>>>>>>>>>> little
> >>>>>>>>>>>>>>>> more verbose than the table hint
> >>>>>>>>>>>>>>>> approach. I can imagine when someone using SQL CLI to
> >>>>>> have a
> >>>>>>>>> sql
> >>>>>>>>>>>>>>>> experience, it's quite often that
> >>>>>>>>>>>>>>>> he will modify the table property, some use cases i
> >>>>> can
> >>>>>>>>> think of:
> >>>>>>>>>>>>>>>> 1. the source contains some corrupted data, i want to
> >>>>>> turn
> >>>>>>>>> on the
> >>>>>>>>>>>>>>>> "ignore-error" flag for certain formats.
> >>>>>>>>>>>>>>>> 2. I have a kafka table and want to see some sample
> >>>>>> data
> >>>>>>>>> from the
> >>>>>>>>>>>>>>>> beginning, so i change the offset
> >>>>>>>>>>>>>>>> to "earliest", and then I want to observe the latest
> >>>>>> data
> >>>>>>>>> which
> >>>>>>>>>>> keeps
> >>>>>>>>>>>>>>>> coming in. I would write another query
> >>>>>>>>>>>>>>>> to select from the latest table.
> >>>>>>>>>>>>>>>> 3. I want to my jdbc sink flush data more eagerly
> >>>>> then
> >>>>>> i can
> >>>>>>>>>>> observe
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>> data from database side.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Most of such use cases are quite ad-hoc. If every
> >>>>> time
> >>>>>> I
> >>>>>>>>> want to
> >>>>>>>>>>> have a
> >>>>>>>>>>>>>>>> different experience, i need to create
> >>>>>>>>>>>>>>>> a temporary table and then also modify my query, it
> >>>>>> doesn't
> >>>>>>>>> feel
> >>>>>>>>>>>>>> smooth.
> >>>>>>>>>>>>>>>> Embed such dynamic property into
> >>>>>>>>>>>>>>>> query would have better user experience.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Both 2 & 3 can make this happen. The cons of #2 is
> >>>>>> breaking
> >>>>>>>>> SQL
> >>>>>>>>>>>>>>> compliant,
> >>>>>>>>>>>>>>>> and for #3, it only breaks some
> >>>>>>>>>>>>>>>> unwritten rules, but we can have an explanation on
> >>>>>> that. And
> >>>>>>>>> I
> >>>>>>>>>>> really
> >>>>>>>>>>>>>>> doubt
> >>>>>>>>>>>>>>>> whether user would complain about
> >>>>>>>>>>>>>>>> this when they actually have flexible and good
> >>>>>> experience
> >>>>>>>>> using
> >>>>>>>>>>> this.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> My tendency would be #3 > #1 > #2, what do you think?
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Thu, Mar 12, 2020 at 1:11 PM Danny Chan <
> >>>>>>>>> yuzhao....@gmail.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks Aljoscha ~
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I agree for most of the query hints, they are
> >>>>>> optional as
> >>>>>>>>> an
> >>>>>>>>>>>>>> optimizer
> >>>>>>>>>>>>>>>>> instruction, especially for the traditional RDBMS.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> But, just like BenChao said, Flink as a computation
> >>>>>> engine
> >>>>>>>>> has
> >>>>>>>>>>> many
> >>>>>>>>>>>>>>>>> different kind of data sources, thus, dynamic
> >>>>>> parameters
> >>>>>>>>> like
> >>>>>>>>>>>>>>>> start_offest
> >>>>>>>>>>>>>>>>> can only bind to each table scope, we can not set a
> >>>>>> session
> >>>>>>>>>>> config
> >>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>> KSQL because they are all about Kafka:
> >>>>>>>>>>>>>>>>>> SET ‘auto.offset.reset’=‘earliest’;
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thus the most flexible way to set up these dynamic
> >>>>>> params
> >>>>>>>>> is
> >>>>>>>>>>> to bind
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> the table scope in the query when we want to
> >>>>> override
> >>>>>>>>>>> something, so
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>>> these solutions above (with pros and cons from my
> >>>>>> side):
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> • 1. Select * from t(offset=123) (from Timo)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Pros:
> >>>>>>>>>>>>>>>>> - Easy to add
> >>>>>>>>>>>>>>>>> - Parameters are part of the main query
> >>>>>>>>>>>>>>>>> Cons:
> >>>>>>>>>>>>>>>>> - Not SQL compliant
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> • 2. Select * from t /*+ PROPERTIES(offset=123) */
> >>>>>> (from
> >>>>>>>>> me)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Pros:
> >>>>>>>>>>>>>>>>> - Easy to add
> >>>>>>>>>>>>>>>>> - SQL compliant because it is nested in the
> >>>>> comments
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cons:
> >>>>>>>>>>>>>>>>> - Parameters are not part of the main query
> >>>>>>>>>>>>>>>>> - Cryptic syntax for new users
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> The biggest problem for hints way may be the “if
> >>>>>> hints
> >>>>>>>>> must be
> >>>>>>>>>>>>>>> optional”,
> >>>>>>>>>>>>>>>>> actually we have though about 1 for a while but
> >>>>>> aborted
> >>>>>>>>>>> because it
> >>>>>>>>>>>>>>> breaks
> >>>>>>>>>>>>>>>>> the SQL standard too much. And we replace it with
> >>>>> 2,
> >>>>>>>>> because
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>> syntax do not break SQL standard(nested in
> >>>>> comments).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> What if we have the special /*+ PROPERTIES */ hint
> >>>>>> that
> >>>>>>>>> allows
> >>>>>>>>>>>>>> override
> >>>>>>>>>>>>>>>>> some properties of table dynamically, it does not
> >>>>>> break
> >>>>>>>>>>> anything, at
> >>>>>>>>>>>>>>>> lease
> >>>>>>>>>>>>>>>>> for current Flink use cases.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Planner hints are optional just because they are
> >>>>>> naturally
> >>>>>>>>>>> enforcers
> >>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> the planner, most of them aim to instruct the
> >>>>>> optimizer,
> >>>>>>>>> but,
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> table
> >>>>>>>>>>>>>>>>> hints is a little different, table hints can
> >>>>> specify
> >>>>>> the
> >>>>>>>>> table
> >>>>>>>>>>> meta
> >>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>> index column, and it is very convenient to specify
> >>>>>> table
> >>>>>>>>>>> properties.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Or shall we not call /*+ PROPERTIES(offset=123) */
> >>>>>> table
> >>>>>>>>> hint,
> >>>>>>>>>>> we
> >>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>> call it table dynamic parameters.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM9:20,Aljoscha Krettek <
> >>>>>>>>>>> aljos...@apache.org>,写道:
> >>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I don't understand this discussion. Hints, as I
> >>>>>>>>> understand
> >>>>>>>>>>> them,
> >>>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>>> work like this:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> - hints are *optional* advice for the optimizer
> >>>>> to
> >>>>>> try
> >>>>>>>>> and
> >>>>>>>>>>> help it
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> find a good execution strategy
> >>>>>>>>>>>>>>>>>> - hints should not change query semantics, i.e.
> >>>>>> they
> >>>>>>>>> should
> >>>>>>>>>>> not
> >>>>>>>>>>>>>>> change
> >>>>>>>>>>>>>>>>>> connector properties executing a query with
> >>>>> taking
> >>>>>> into
> >>>>>>>>>>> account the
> >>>>>>>>>>>>>>>>>> hints *must* produce the same result as executing
> >>>>>> the
> >>>>>>>>> query
> >>>>>>>>>>> without
> >>>>>>>>>>>>>>>>>> taking into account the hints
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>   From these simple requirements you can derive a
> >>>>>> solution
> >>>>>>>>>>> that makes
> >>>>>>>>>>>>>>>>>> sense. I don't have a strong preference for the
> >>>>>> syntax
> >>>>>>>>> but we
> >>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>>> strive to be in line with prior work.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>> Aljoscha
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 11.03.20 11:53, Danny Chan wrote:
> >>>>>>>>>>>>>>>>>>> Thanks Timo for summarize the 3 options ~
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I agree with Kurt that option2 is too
> >>>>>> complicated to
> >>>>>>>>> use
> >>>>>>>>>>> because:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> • As a Kafka topic consumer, the user must
> >>>>>> define both
> >>>>>>>>> the
> >>>>>>>>>>>>>> virtual
> >>>>>>>>>>>>>>>>> column for start offset and he must apply a special
> >>>>>> filter
> >>>>>>>>>>> predicate
> >>>>>>>>>>>>>>>> after
> >>>>>>>>>>>>>>>>> each query
> >>>>>>>>>>>>>>>>>>> • And for the internal implementation, the
> >>>>>> metadata
> >>>>>>>>> column
> >>>>>>>>>>> push
> >>>>>>>>>>>>>>> down
> >>>>>>>>>>>>>>>>> is another hard topic, each kind of message queue
> >>>>>> may have
> >>>>>>>>> its
> >>>>>>>>>>> offset
> >>>>>>>>>>>>>>>>> attribute, we need to consider the expression type
> >>>>>> for
> >>>>>>>>>>> different
> >>>>>>>>>>>>>> kind;
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> source also need to recognize the constant column
> >>>>> as
> >>>>>> a
> >>>>>>>>> config
> >>>>>>>>>>>>>>>> option(which
> >>>>>>>>>>>>>>>>> is weird because usually what we pushed down is a
> >>>>>> table
> >>>>>>>>> column)
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> For option 1 and option3, I think there is no
> >>>>>>>>> difference,
> >>>>>>>>>>> option1
> >>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>> also a hint syntax which is introduced in Sybase
> >>>>> and
> >>>>>>>>>>> referenced then
> >>>>>>>>>>>>>>>>> deprecated by MS-SQL in 199X years because of the
> >>>>>>>>>>> ambitiousness.
> >>>>>>>>>>>>>>>> Personally
> >>>>>>>>>>>>>>>>> I prefer /*+ */ style table hint than WITH keyword
> >>>>>> for
> >>>>>>>>> these
> >>>>>>>>>>> reasons:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> • We do not break the standard SQL, the hints
> >>>>> are
> >>>>>>>>> nested
> >>>>>>>>>>> in SQL
> >>>>>>>>>>>>>>>>> comments
> >>>>>>>>>>>>>>>>>>> • We do not need to introduce additional WITH
> >>>>>> keyword
> >>>>>>>>>>> which may
> >>>>>>>>>>>>>>>> appear
> >>>>>>>>>>>>>>>>> in a query if we use that because a table can be
> >>>>>>>>> referenced in
> >>>>>>>>>>> all
> >>>>>>>>>>>>>>> kinds
> >>>>>>>>>>>>>>>> of
> >>>>>>>>>>>>>>>>> SQL contexts: INSERT/DELETE/FROM/JOIN …. That would
> >>>>>> make
> >>>>>>>>> our
> >>>>>>>>>>> sql
> >>>>>>>>>>>>>> query
> >>>>>>>>>>>>>>>>> break too much of the SQL from standard
> >>>>>>>>>>>>>>>>>>> • We would have uniform syntax for hints as
> >>>>> query
> >>>>>>>>> hint, one
> >>>>>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>>> fits all and more easy to use
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> And here is the reason why we choose a uniform
> >>>>>> Oracle
> >>>>>>>>>>> style query
> >>>>>>>>>>>>>>>>> hint syntax which is addressed by Julian Hyde when
> >>>>> we
> >>>>>>>>> design
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>>> from the Calcite community:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I don’t much like the MSSQL-style syntax for
> >>>>>> table
> >>>>>>>>> hints.
> >>>>>>>>>>> It
> >>>>>>>>>>>>>> adds a
> >>>>>>>>>>>>>>>>> new use of the WITH keyword that is unrelated to
> >>>>> the
> >>>>>> use of
> >>>>>>>>>>> WITH for
> >>>>>>>>>>>>>>>>> common-table expressions.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> A historical note. Microsoft SQL Server
> >>>>>> inherited its
> >>>>>>>>> hint
> >>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>> from
> >>>>>>>>>>>>>>>>> Sybase a very long time ago. (See “Transact SQL
> >>>>>>>>>>> Programming”[1], page
> >>>>>>>>>>>>>>>> 632,
> >>>>>>>>>>>>>>>>> “Optimizer hints”. The book was written in 1999,
> >>>>> and
> >>>>>> covers
> >>>>>>>>>>> Microsoft
> >>>>>>>>>>>>>>> SQL
> >>>>>>>>>>>>>>>>> Server 6.5 / 7.0 and Sybase Adaptive Server 11.5,
> >>>>>> but the
> >>>>>>>>>>> syntax very
> >>>>>>>>>>>>>>>>> likely predates Sybase 4.3, from which Microsoft
> >>>>> SQL
> >>>>>>>>> Server was
> >>>>>>>>>>>>>> forked
> >>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> 1993.)
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Microsoft later added the WITH keyword to make
> >>>>>> it less
> >>>>>>>>>>> ambiguous,
> >>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> has now deprecated the syntax that does not use
> >>>>> WITH.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> They are forced to keep the syntax for
> >>>>> backwards
> >>>>>>>>>>> compatibility
> >>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>> that doesn’t mean that we should shoulder their
> >>>>>> burden.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think formatted comments are the right
> >>>>>> container for
> >>>>>>>>>>> hints
> >>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>> it allows us to change the hint syntax without
> >>>>>> changing
> >>>>>>>>> the SQL
> >>>>>>>>>>>>>> parser,
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> makes clear that we are at liberty to ignore hints
> >>>>>>>>> entirely.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Julian
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> [1] https://www.amazon.com/s?k=9781565924017 <
> >>>>>>>>>>>>>>>>> https://www.amazon.com/s?k=9781565924017>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM4:03,Timo Walther <
> >>>>>>>>> twal...@apache.org
> >>>>>>>>>>>> ,写道:
> >>>>>>>>>>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> it is true that our DDL is not standard
> >>>>>> compliant by
> >>>>>>>>>>> using the
> >>>>>>>>>>>>>>> WITH
> >>>>>>>>>>>>>>>>>>>> clause. Nevertheless, we aim for not
> >>>>> diverging
> >>>>>> too
> >>>>>>>>> much
> >>>>>>>>>>> and the
> >>>>>>>>>>>>>>>> LIKE
> >>>>>>>>>>>>>>>>>>>> clause is an example of that. It will solve
> >>>>>> things
> >>>>>>>>> like
> >>>>>>>>>>>>>>> overwriting
> >>>>>>>>>>>>>>>>>>>> WATERMARKs, add additional/modifying
> >>>>>> properties and
> >>>>>>>>>>> inherit
> >>>>>>>>>>>>>>> schema.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Bowen is right that Flink's DDL is mixing 3
> >>>>>> types
> >>>>>>>>>>> definition
> >>>>>>>>>>>>>>>>> together.
> >>>>>>>>>>>>>>>>>>>> We are not the first ones that try to solve
> >>>>>> this.
> >>>>>>>>> There
> >>>>>>>>>>> is also
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> SQL
> >>>>>>>>>>>>>>>>>>>> MED standard [1] that tried to tackle this
> >>>>>> problem. I
> >>>>>>>>>>> think it
> >>>>>>>>>>>>>>> was
> >>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>> considered when designing the current DDL.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Currently, I see 3 options for handling Kafka
> >>>>>>>>> offsets. I
> >>>>>>>>>>> will
> >>>>>>>>>>>>>>> give
> >>>>>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>>>>>> examples and look forward to feedback here:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> *Option 1* Runtime and semantic parms as part
> >>>>>> of the
> >>>>>>>>>>> query
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> `SELECT * FROM MyTable('offset'=123)`
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Pros:
> >>>>>>>>>>>>>>>>>>>> - Easy to add
> >>>>>>>>>>>>>>>>>>>> - Parameters are part of the main query
> >>>>>>>>>>>>>>>>>>>> - No complicated hinting syntax
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cons:
> >>>>>>>>>>>>>>>>>>>> - Not SQL compliant
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> *Option 2* Use metadata in query
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> `CREATE TABLE MyTable (id INT, offset AS
> >>>>>>>>>>>>>>>> SYSTEM_METADATA('offset'))`
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> `SELECT * FROM MyTable WHERE offset >
> >>>>> TIMESTAMP
> >>>>>>>>>>> '2012-12-12
> >>>>>>>>>>>>>>>>> 12:34:22'`
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Pros:
> >>>>>>>>>>>>>>>>>>>> - SQL compliant in the query
> >>>>>>>>>>>>>>>>>>>> - Access of metadata in the DDL which is
> >>>>>> required
> >>>>>>>>> anyway
> >>>>>>>>>>>>>>>>>>>> - Regular pushdown rules apply
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cons:
> >>>>>>>>>>>>>>>>>>>> - Users need to add an additional comlumn in
> >>>>>> the DDL
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> *Option 3*: Use hints for properties
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> `
> >>>>>>>>>>>>>>>>>>>> SELECT *
> >>>>>>>>>>>>>>>>>>>> FROM MyTable /*+ PROPERTIES('offset'=123) */
> >>>>>>>>>>>>>>>>>>>> `
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Pros:
> >>>>>>>>>>>>>>>>>>>> - Easy to add
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Cons:
> >>>>>>>>>>>>>>>>>>>> - Parameters are not part of the main query
> >>>>>>>>>>>>>>>>>>>> - Cryptic syntax for new users
> >>>>>>>>>>>>>>>>>>>> - Not standard compliant.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> If we go with this option, I would suggest to
> >>>>>> make it
> >>>>>>>>>>> available
> >>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>> separate map and don't mix it with statically
> >>>>>> defined
> >>>>>>>>>>>>>> properties.
> >>>>>>>>>>>>>>>>> Such
> >>>>>>>>>>>>>>>>>>>> that the factory can decide which properties
> >>>>>> have the
> >>>>>>>>>>> right to
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>> overwritten by the hints:
> >>>>>>>>>>>>>>>>>>>> TableSourceFactory.Context.getQueryHints():
> >>>>>>>>>>> ReadableConfig
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1] https://en.wikipedia.org/wiki/SQL/MED
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Currently I see 3 options as a
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On 11.03.20 07:21, Danny Chan wrote:
> >>>>>>>>>>>>>>>>>>>>> Thanks Bowen ~
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I agree we should somehow categorize our
> >>>>>> connector
> >>>>>>>>>>>>>> parameters.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> For type1, I’m already preparing a solution
> >>>>>> like
> >>>>>>>>> the
> >>>>>>>>>>>>>> Confluent
> >>>>>>>>>>>>>>>>> schema registry + Avro schema inference thing, so
> >>>>>> this may
> >>>>>>>>> not
> >>>>>>>>>>> be a
> >>>>>>>>>>>>>>>> problem
> >>>>>>>>>>>>>>>>> in the near future.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> For type3, I have some questions:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> "SELECT * FROM mykafka WHERE offset >
> >>>>> 12pm
> >>>>>>>>> yesterday”
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Where does the offset column come from, a
> >>>>>> virtual
> >>>>>>>>>>> column from
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> table schema, you said that
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> They change
> >>>>>>>>>>>>>>>>>>>>> almost every time a query starts and have
> >>>>>> nothing
> >>>>>>>>> to
> >>>>>>>>>>> do with
> >>>>>>>>>>>>>>>>> metadata, thus
> >>>>>>>>>>>>>>>>>>>>> should not be part of table definition/DDL
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> But why you can reference it in the query,
> >>>>>> I’m
> >>>>>>>>>>> confused for
> >>>>>>>>>>>>>>> that,
> >>>>>>>>>>>>>>>>> can you elaborate a little ?
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>>>> 在 2020年3月11日 +0800 PM12:52,Bowen Li <
> >>>>>>>>>>> bowenl...@gmail.com
> >>>>>>>>>>>>>>> ,写道:
> >>>>>>>>>>>>>>>>>>>>>> Thanks Danny for kicking off the effort
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> The root cause of too much manual work is
> >>>>>> Flink
> >>>>>>>>> DDL
> >>>>>>>>>>> has
> >>>>>>>>>>>>>>> mixed 3
> >>>>>>>>>>>>>>>>> types of
> >>>>>>>>>>>>>>>>>>>>>> params together and doesn't handle each
> >>>>> of
> >>>>>> them
> >>>>>>>>> very
> >>>>>>>>>>> well.
> >>>>>>>>>>>>>>>> Below
> >>>>>>>>>>>>>>>>> are how I
> >>>>>>>>>>>>>>>>>>>>>> categorize them and corresponding
> >>>>>> solutions in my
> >>>>>>>>>>> mind:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> - type 1: Metadata of external data, like
> >>>>>>>>> external
> >>>>>>>>>>>>>>>> endpoint/url,
> >>>>>>>>>>>>>>>>>>>>>> username/pwd, schemas, formats.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Such metadata are mostly already
> >>>>>> accessible in
> >>>>>>>>>>> external
> >>>>>>>>>>>>>>> system
> >>>>>>>>>>>>>>>>> as long as
> >>>>>>>>>>>>>>>>>>>>>> endpoints and credentials are provided.
> >>>>>> Flink can
> >>>>>>>>>>> get it
> >>>>>>>>>>>>>> thru
> >>>>>>>>>>>>>>>>> catalogs, but
> >>>>>>>>>>>>>>>>>>>>>> we haven't had many catalogs yet and thus
> >>>>>> Flink
> >>>>>>>>> just
> >>>>>>>>>>> hasn't
> >>>>>>>>>>>>>>>> been
> >>>>>>>>>>>>>>>>> able to
> >>>>>>>>>>>>>>>>>>>>>> leverage that. So the solution should be
> >>>>>> building
> >>>>>>>>>>> more
> >>>>>>>>>>>>>>>> catalogs.
> >>>>>>>>>>>>>>>>> Such
> >>>>>>>>>>>>>>>>>>>>>> params should be part of a Flink table
> >>>>>>>>>>> DDL/definition, and
> >>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>> overridable
> >>>>>>>>>>>>>>>>>>>>>> in any means.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> - type 2: Runtime params, like jdbc
> >>>>>> connector's
> >>>>>>>>>>> fetch size,
> >>>>>>>>>>>>>>>>> elasticsearch
> >>>>>>>>>>>>>>>>>>>>>> connector's bulk flush size.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Such params don't affect query results,
> >>>>> but
> >>>>>>>>> affect
> >>>>>>>>>>> how
> >>>>>>>>>>>>>>> results
> >>>>>>>>>>>>>>>>> are produced
> >>>>>>>>>>>>>>>>>>>>>> (eg. fast or slow, aka performance) -
> >>>>> they
> >>>>>> are
> >>>>>>>>>>> essentially
> >>>>>>>>>>>>>>>>> execution and
> >>>>>>>>>>>>>>>>>>>>>> implementation details. They change often
> >>>>>> in
> >>>>>>>>>>> exploration or
> >>>>>>>>>>>>>>>>> development
> >>>>>>>>>>>>>>>>>>>>>> stages, but not quite frequently in
> >>>>>> well-defined
> >>>>>>>>>>>>>> long-running
> >>>>>>>>>>>>>>>>> pipelines.
> >>>>>>>>>>>>>>>>>>>>>> They should always have default values
> >>>>> and
> >>>>>> can be
> >>>>>>>>>>> missing
> >>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> query. They
> >>>>>>>>>>>>>>>>>>>>>> can be part of a table DDL/definition,
> >>>>> but
> >>>>>> should
> >>>>>>>>>>> also be
> >>>>>>>>>>>>>>>>> replaceable in a
> >>>>>>>>>>>>>>>>>>>>>> query - *this is what table "hints" in
> >>>>>> FLIP-113
> >>>>>>>>>>> should
> >>>>>>>>>>>>>>> cover*.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> - type 3: Semantic params, like kafka
> >>>>>> connector's
> >>>>>>>>>>> start
> >>>>>>>>>>>>>>> offset.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Such params affect query results - the
> >>>>>> semantics.
> >>>>>>>>>>> They'd
> >>>>>>>>>>>>>>> better
> >>>>>>>>>>>>>>>>> be as
> >>>>>>>>>>>>>>>>>>>>>> filter conditions in WHERE clause that
> >>>>> can
> >>>>>> be
> >>>>>>>>> pushed
> >>>>>>>>>>> down.
> >>>>>>>>>>>>>>> They
> >>>>>>>>>>>>>>>>> change
> >>>>>>>>>>>>>>>>>>>>>> almost every time a query starts and have
> >>>>>>>>> nothing to
> >>>>>>>>>>> do
> >>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>> metadata, thus
> >>>>>>>>>>>>>>>>>>>>>> should not be part of table
> >>>>>> definition/DDL, nor
> >>>>>>>>> be
> >>>>>>>>>>>>>> persisted
> >>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>> catalogs.
> >>>>>>>>>>>>>>>>>>>>>> If they will, users should create views
> >>>>> to
> >>>>>> keep
> >>>>>>>>> such
> >>>>>>>>>>> params
> >>>>>>>>>>>>>>>>> around (note
> >>>>>>>>>>>>>>>>>>>>>> this is different from variable
> >>>>>> substitution).
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Take Flink-Kafka as an example. Once we
> >>>>>> get these
> >>>>>>>>>>> params
> >>>>>>>>>>>>>>> right,
> >>>>>>>>>>>>>>>>> here're the
> >>>>>>>>>>>>>>>>>>>>>> steps users need to do to develop and run
> >>>>>> a Flink
> >>>>>>>>>>> job:
> >>>>>>>>>>>>>>>>>>>>>> - configure a Flink
> >>>>>> ConfluentSchemaRegistry with
> >>>>>>>>> url,
> >>>>>>>>>>>>>>> username,
> >>>>>>>>>>>>>>>>> and password
> >>>>>>>>>>>>>>>>>>>>>> - run "SELECT * FROM mykafka WHERE offset
> >>>>>>> 12pm
> >>>>>>>>>>> yesterday"
> >>>>>>>>>>>>>>>>> (simplified
> >>>>>>>>>>>>>>>>>>>>>> timestamp) in SQL CLI, Flink
> >>>>> automatically
> >>>>>>>>> retrieves
> >>>>>>>>>>> all
> >>>>>>>>>>>>>>>>> metadata of
> >>>>>>>>>>>>>>>>>>>>>> schema, file format, etc and start the
> >>>>> job
> >>>>>>>>>>>>>>>>>>>>>> - users want to make the job read Kafka
> >>>>>> topic
> >>>>>>>>>>> faster, so it
> >>>>>>>>>>>>>>>> goes
> >>>>>>>>>>>>>>>>> as "SELECT
> >>>>>>>>>>>>>>>>>>>>>> * FROM mykafka /* faster_read_key=value*/
> >>>>>> WHERE
> >>>>>>>>>>> offset >
> >>>>>>>>>>>>>> 12pm
> >>>>>>>>>>>>>>>>> yesterday"
> >>>>>>>>>>>>>>>>>>>>>> - done and satisfied, users submit it to
> >>>>>>>>> production
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Regarding "CREATE TABLE t LIKE with
> >>>>> (k1=v1,
> >>>>>>>>> k2=v2),
> >>>>>>>>>>> I think
> >>>>>>>>>>>>>>>> it's
> >>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>> nice-to-have feature, but not a
> >>>>>> strategically
> >>>>>>>>>>> critical,
> >>>>>>>>>>>>>>>>> long-term solution,
> >>>>>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>>> 1) It may seem promising at the current
> >>>>>> stage to
> >>>>>>>>>>> solve the
> >>>>>>>>>>>>>>>>>>>>>> too-much-manual-work problem, but that's
> >>>>>> only
> >>>>>>>>>>> because Flink
> >>>>>>>>>>>>>>>>> hasn't
> >>>>>>>>>>>>>>>>>>>>>> leveraged catalogs well and handled the 3
> >>>>>> types
> >>>>>>>>> of
> >>>>>>>>>>> params
> >>>>>>>>>>>>>>> above
> >>>>>>>>>>>>>>>>> properly.
> >>>>>>>>>>>>>>>>>>>>>> Once we get the params types right, the
> >>>>>> LIKE
> >>>>>>>>> syntax
> >>>>>>>>>>> won't
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>> important, and will be just an easier way
> >>>>>> to
> >>>>>>>>> create
> >>>>>>>>>>> tables
> >>>>>>>>>>>>>>>>> without retyping
> >>>>>>>>>>>>>>>>>>>>>> long fields like username and pwd.
> >>>>>>>>>>>>>>>>>>>>>> 2) Note that only some rare type of
> >>>>>> catalog can
> >>>>>>>>>>> store k-v
> >>>>>>>>>>>>>>>>> property pair, so
> >>>>>>>>>>>>>>>>>>>>>> table created this way often cannot be
> >>>>>>>>> persisted. In
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> foreseeable
> >>>>>>>>>>>>>>>>>>>>>> future, such catalog will only be
> >>>>>> HiveCatalog,
> >>>>>>>>> and
> >>>>>>>>>>> not
> >>>>>>>>>>>>>>> everyone
> >>>>>>>>>>>>>>>>> has a Hive
> >>>>>>>>>>>>>>>>>>>>>> metastore. To be honest, without
> >>>>>> persistence,
> >>>>>>>>>>> recreating
> >>>>>>>>>>>>>>> tables
> >>>>>>>>>>>>>>>>> every time
> >>>>>>>>>>>>>>>>>>>>>> this way is still a lot of keyboard
> >>>>> typing.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>> Bowen
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Tue, Mar 10, 2020 at 8:07 PM Kurt
> >>>>> Young
> >>>>>> <
> >>>>>>>>>>>>>> ykt...@gmail.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> If a specific connector want to have
> >>>>> such
> >>>>>>>>>>> parameter and
> >>>>>>>>>>>>>>> read
> >>>>>>>>>>>>>>>>> if out of
> >>>>>>>>>>>>>>>>>>>>>>> configuration, then that's fine.
> >>>>>>>>>>>>>>>>>>>>>>> If we are talking about a configuration
> >>>>>> for all
> >>>>>>>>>>> kinds of
> >>>>>>>>>>>>>>>>> sources, I would
> >>>>>>>>>>>>>>>>>>>>>>> be super careful about that.
> >>>>>>>>>>>>>>>>>>>>>>> It's true it can solve maybe 80% cases,
> >>>>>> but it
> >>>>>>>>>>> will also
> >>>>>>>>>>>>>>> make
> >>>>>>>>>>>>>>>>> the left 20%
> >>>>>>>>>>>>>>>>>>>>>>> feels weird.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Wed, Mar 11, 2020 at 11:00 AM Jark
> >>>>> Wu
> >>>>>> <
> >>>>>>>>>>>>>> imj...@gmail.com
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi Kurt,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> #3 Regarding to global offset:
> >>>>>>>>>>>>>>>>>>>>>>>> I'm not saying to use the global
> >>>>>>>>> configuration to
> >>>>>>>>>>>>>>> override
> >>>>>>>>>>>>>>>>> connector
> >>>>>>>>>>>>>>>>>>>>>>>> properties by the planner.
> >>>>>>>>>>>>>>>>>>>>>>>> But the connector should take this
> >>>>>>>>> configuration
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> translate into their
> >>>>>>>>>>>>>>>>>>>>>>>> client API.
> >>>>>>>>>>>>>>>>>>>>>>>> AFAIK, almost all the message queues
> >>>>>> support
> >>>>>>>>>>> eariliest
> >>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> latest and a
> >>>>>>>>>>>>>>>>>>>>>>>> timestamp value as start point.
> >>>>>>>>>>>>>>>>>>>>>>>> So we can support 3 options for this
> >>>>>>>>>>> configuration:
> >>>>>>>>>>>>>>>>> "eariliest", "latest"
> >>>>>>>>>>>>>>>>>>>>>>>> and a timestamp string value.
> >>>>>>>>>>>>>>>>>>>>>>>> Of course, this can't solve 100%
> >>>>>> cases, but I
> >>>>>>>>>>> guess can
> >>>>>>>>>>>>>>>>> sovle 80% or 90%
> >>>>>>>>>>>>>>>>>>>>>>>> cases.
> >>>>>>>>>>>>>>>>>>>>>>>> And the remaining cases can be
> >>>>>> resolved by
> >>>>>>>>> LIKE
> >>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>> I guess is
> >>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>> very common cases.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, 11 Mar 2020 at 10:33, Kurt
> >>>>>> Young <
> >>>>>>>>>>>>>>> ykt...@gmail.com
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Good to have such lovely
> >>>>>> discussions. I
> >>>>>>>>> also
> >>>>>>>>>>> want to
> >>>>>>>>>>>>>>>> share
> >>>>>>>>>>>>>>>>> some of my
> >>>>>>>>>>>>>>>>>>>>>>>>> opinions.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> #1 Regarding to error handling: I
> >>>>>> also
> >>>>>>>>> think
> >>>>>>>>>>> ignore
> >>>>>>>>>>>>>>>>> invalid hints would
> >>>>>>>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>>>>> dangerous, maybe
> >>>>>>>>>>>>>>>>>>>>>>>>> the simplest solution is just throw
> >>>>>> an
> >>>>>>>>>>> exception.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> #2 Regarding to property
> >>>>>> replacement: I
> >>>>>>>>> don't
> >>>>>>>>>>> think
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>>>>>>>> constraint
> >>>>>>>>>>>>>>>>>>>>>>>>> ourself to
> >>>>>>>>>>>>>>>>>>>>>>>>> the meaning of the word "hint", and
> >>>>>>>>> forbidden
> >>>>>>>>>>> it
> >>>>>>>>>>>>>>>> modifying
> >>>>>>>>>>>>>>>>> any
> >>>>>>>>>>>>>>>>>>>>>>> properties
> >>>>>>>>>>>>>>>>>>>>>>>>> which can effect
> >>>>>>>>>>>>>>>>>>>>>>>>> query results. IMO `PROPERTIES` is
> >>>>>> one of
> >>>>>>>>> the
> >>>>>>>>>>> table
> >>>>>>>>>>>>>>>> hints,
> >>>>>>>>>>>>>>>>> and a
> >>>>>>>>>>>>>>>>>>>>>>> powerful
> >>>>>>>>>>>>>>>>>>>>>>>>> one. It can
> >>>>>>>>>>>>>>>>>>>>>>>>> modify properties located in DDL's
> >>>>>> WITH
> >>>>>>>>> block.
> >>>>>>>>>>> But I
> >>>>>>>>>>>>>>> also
> >>>>>>>>>>>>>>>>> see the harm
> >>>>>>>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>>>> if we make it
> >>>>>>>>>>>>>>>>>>>>>>>>> too flexible like change the kafka
> >>>>>> topic
> >>>>>>>>> name
> >>>>>>>>>>> with a
> >>>>>>>>>>>>>>>> hint.
> >>>>>>>>>>>>>>>>> Such use
> >>>>>>>>>>>>>>>>>>>>>>> case
> >>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>> not common and
> >>>>>>>>>>>>>>>>>>>>>>>>> sounds very dangerous to me. I
> >>>>> would
> >>>>>>>>> propose
> >>>>>>>>>>> we have
> >>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>> map
> >>>>>>>>>>>>>>>>> of hintable
> >>>>>>>>>>>>>>>>>>>>>>>>> properties for each
> >>>>>>>>>>>>>>>>>>>>>>>>> connector, and should validate all
> >>>>>> passed
> >>>>>>>>> in
> >>>>>>>>>>>>>> properties
> >>>>>>>>>>>>>>>>> are actually
> >>>>>>>>>>>>>>>>>>>>>>>>> hintable. And combining with
> >>>>>>>>>>>>>>>>>>>>>>>>> #1 error handling, we can throw an
> >>>>>>>>> exception
> >>>>>>>>>>> once
> >>>>>>>>>>>>>>>> received
> >>>>>>>>>>>>>>>>> invalid
> >>>>>>>>>>>>>>>>>>>>>>>>> property.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> #3 Regarding to global offset: I'm
> >>>>>> not sure
> >>>>>>>>>>> it's
> >>>>>>>>>>>>>>>> feasible.
> >>>>>>>>>>>>>>>>> Different
> >>>>>>>>>>>>>>>>>>>>>>>>> connectors will have totally
> >>>>>>>>>>>>>>>>>>>>>>>>> different properties to represent
> >>>>>> offset,
> >>>>>>>>> some
> >>>>>>>>>>> might
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> timestamps,
> >>>>>>>>>>>>>>>>>>>>>>> some
> >>>>>>>>>>>>>>>>>>>>>>>>> might be string literals
> >>>>>>>>>>>>>>>>>>>>>>>>> like "earliest", and others might
> >>>>> be
> >>>>>> just
> >>>>>>>>>>> integers.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>> Kurt
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Mar 10, 2020 at 11:46 PM
> >>>>>> Jark Wu <
> >>>>>>>>>>>>>>>> imj...@gmail.com>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I want to jump in the discussion
> >>>>>> about
> >>>>>>>>> the
> >>>>>>>>>>> "dynamic
> >>>>>>>>>>>>>>>>> start offset"
> >>>>>>>>>>>>>>>>>>>>>>>>> problem.
> >>>>>>>>>>>>>>>>>>>>>>>>>> First of all, I share the same
> >>>>>> concern
> >>>>>>>>> with
> >>>>>>>>>>> Timo
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> Fabian, that the
> >>>>>>>>>>>>>>>>>>>>>>>>>> "start offset" affects the query
> >>>>>>>>> semantics,
> >>>>>>>>>>> i.e.
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> query result.
> >>>>>>>>>>>>>>>>>>>>>>>>>> But "hints" is just used for
> >>>>>> optimization
> >>>>>>>>>>> which
> >>>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>> affect the
> >>>>>>>>>>>>>>>>>>>>>>>> result?
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I think the "dynamic start
> >>>>> offset"
> >>>>>> is an
> >>>>>>>>> very
> >>>>>>>>>>>>>>> important
> >>>>>>>>>>>>>>>>> usability
> >>>>>>>>>>>>>>>>>>>>>>>> problem
> >>>>>>>>>>>>>>>>>>>>>>>>>> which will be faced by many
> >>>>>> streaming
> >>>>>>>>>>> platforms.
> >>>>>>>>>>>>>>>>>>>>>>>>>> I also agree "CREATE TEMPORARY
> >>>>>> TABLE Temp
> >>>>>>>>>>> (LIKE t)
> >>>>>>>>>>>>>>> WITH
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>> ('connector.startup-timestamp-millis' =
> >>>>>>>>>>>>>>>>> '1578538374471')" is verbose,
> >>>>>>>>>>>>>>>>>>>>>>>>> what
> >>>>>>>>>>>>>>>>>>>>>>>>>> if we have 10 tables to join?
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> However, what I want to propose
> >>>>>> (should
> >>>>>>>>> be
> >>>>>>>>>>> another
> >>>>>>>>>>>>>>>>> thread) is a
> >>>>>>>>>>>>>>>>>>>>>>> global
> >>>>>>>>>>>>>>>>>>>>>>>>>> configuration to reset start
> >>>>>> offsets of
> >>>>>>>>> all
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> source
> >>>>>>>>>>>>>>>>> connectors
> >>>>>>>>>>>>>>>>>>>>>>>>>> in the query session, e.g.
> >>>>>>>>>>>>>>>> "table.sources.start-offset".
> >>>>>>>>>>>>>>>>> This is
> >>>>>>>>>>>>>>>>>>>>>>>> possible
> >>>>>>>>>>>>>>>>>>>>>>>>>> now because
> >>>>>> `TableSourceFactory.Context`
> >>>>>>>>> has
> >>>>>>>>>>>>>>>>> `getConfiguration`
> >>>>>>>>>>>>>>>>>>>>>>>>>> method to get the session
> >>>>>> configuration,
> >>>>>>>>> and
> >>>>>>>>>>> use it
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> create an
> >>>>>>>>>>>>>>>>>>>>>>>> adapted
> >>>>>>>>>>>>>>>>>>>>>>>>>> TableSource.
> >>>>>>>>>>>>>>>>>>>>>>>>>> Then we can also expose to SQL
> >>>>> CLI
> >>>>>> via
> >>>>>>>>> SET
> >>>>>>>>>>> command,
> >>>>>>>>>>>>>>>> e.g.
> >>>>>>>>>>>>>>>>> `SET
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>> 'table.sources.start-offset'='earliest';`,
> >>>>>>>>>>> which is
> >>>>>>>>>>>>>>>>> pretty simple and
> >>>>>>>>>>>>>>>>>>>>>>>>>> straightforward.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> This is very similar to KSQL's
> >>>>> `SET
> >>>>>>>>>>>>>>>>> 'auto.offset.reset'='earliest'`
> >>>>>>>>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>>>>>>> is very helpful IMO.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 10 Mar 2020 at 22:29,
> >>>>> Timo
> >>>>>>>>> Walther <
> >>>>>>>>>>>>>>>>> twal...@apache.org>
> >>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> compared to the hints, FLIP-110
> >>>>>> is
> >>>>>>>>> fully
> >>>>>>>>>>>>>> compliant
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> the SQL
> >>>>>>>>>>>>>>>>>>>>>>>> standard.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I don't think that `CREATE
> >>>>>> TEMPORARY
> >>>>>>>>> TABLE
> >>>>>>>>>>> Temp
> >>>>>>>>>>>>>>> (LIKE
> >>>>>>>>>>>>>>>>> t) WITH
> >>>>>>>>>>>>>>>>>>>>>>> (k=v)`
> >>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> too verbose or awkward for the
> >>>>>> power of
> >>>>>>>>>>> basically
> >>>>>>>>>>>>>>>>> changing the
> >>>>>>>>>>>>>>>>>>>>>>> entire
> >>>>>>>>>>>>>>>>>>>>>>>>>>> connector. Usually, this
> >>>>>> statement
> >>>>>>>>> would
> >>>>>>>>>>> just
> >>>>>>>>>>>>>>> precede
> >>>>>>>>>>>>>>>>> the query in
> >>>>>>>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> multiline file. So it can be
> >>>>>> change
> >>>>>>>>>>> "in-place"
> >>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>> the hints you
> >>>>>>>>>>>>>>>>>>>>>>>>>> proposed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Many companies have a
> >>>>>> well-defined set
> >>>>>>>>> of
> >>>>>>>>>>> tables
> >>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>> should be
> >>>>>>>>>>>>>>>>>>>>>>> used.
> >>>>>>>>>>>>>>>>>>>>>>>>> It
> >>>>>>>>>>>>>>>>>>>>>>>>>>> would be dangerous if users can
> >>>>>> change
> >>>>>>>>> the
> >>>>>>>>>>> path
> >>>>>>>>>>>>>> or
> >>>>>>>>>>>>>>>>> topic in a hint.
> >>>>>>>>>>>>>>>>>>>>>>>> The
> >>>>>>>>>>>>>>>>>>>>>>>>>>> catalog/catalog manager should
> >>>>>> be the
> >>>>>>>>>>> entity that
> >>>>>>>>>>>>>>>>> controls which
> >>>>>>>>>>>>>>>>>>>>>>>> tables
> >>>>>>>>>>>>>>>>>>>>>>>>>>> exist and how they can be
> >>>>>> accessed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> what’s the problem there if
> >>>>> we
> >>>>>> user
> >>>>>>>>> the
> >>>>>>>>>>> table
> >>>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>> to support
> >>>>>>>>>>>>>>>>>>>>>>>>> “start
> >>>>>>>>>>>>>>>>>>>>>>>>>>> offset”?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> IMHO it violates the meaning of
> >>>>>> a hint.
> >>>>>>>>>>> According
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>> dictionary,
> >>>>>>>>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> hint is "a statement that
> >>>>>> expresses
> >>>>>>>>>>> indirectly
> >>>>>>>>>>>>>> what
> >>>>>>>>>>>>>>>>> one prefers not
> >>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>> say explicitly". But offsets
> >>>>> are
> >>>>>> a
> >>>>>>>>>>> property that
> >>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>>> very explicit.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> If we go with the hint
> >>>>> approach,
> >>>>>> it
> >>>>>>>>> should
> >>>>>>>>>>> be
> >>>>>>>>>>>>>>>>> expressible in the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> TableSourceFactory which
> >>>>>> properties are
> >>>>>>>>>>> supported
> >>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>> hinting. Or
> >>>>>>>>>>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>>>>>>>>>>>>>> plan to offer those hints in a
> >>>>>> separate
> >>>>>>>>>>>>>> Map<String,
> >>>>>>>>>>>>>>>>> String> that
> >>>>>>>>>>>>>>>>>>>>>>>> cannot
> >>>>>>>>>>>>>>>>>>>>>>>>>>> overwrite existing properties?
> >>>>> I
> >>>>>> think
> >>>>>>>>>>> this would
> >>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>> different
> >>>>>>>>>>>>>>>>>>>>>>>>> story...
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> On 10.03.20 10:34, Danny Chan
> >>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Timo ~
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Personally I would say that
> >>>>>> offset >
> >>>>>>>>> 0
> >>>>>>>>>>> and
> >>>>>>>>>>>>>> start
> >>>>>>>>>>>>>>>>> offset = 10 does
> >>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>>> have the same semantic, so from
> >>>>>> the SQL
> >>>>>>>>>>> aspect,
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>> implement
> >>>>>>>>>>>>>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> “starting offset” hint for
> >>>>> query
> >>>>>> with
> >>>>>>>>> such
> >>>>>>>>>>> a
> >>>>>>>>>>>>>>> syntax.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> And the CREATE TABLE LIKE
> >>>>>> syntax is a
> >>>>>>>>>>> DDL which
> >>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>> just verbose
> >>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>>>> defining such dynamic
> >>>>> parameters
> >>>>>> even
> >>>>>>>>> if
> >>>>>>>>>>> it could
> >>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>> that, shall we
> >>>>>>>>>>>>>>>>>>>>>>>>> force
> >>>>>>>>>>>>>>>>>>>>>>>>>>> users to define a temporal
> >>>>> table
> >>>>>> for
> >>>>>>>>> each
> >>>>>>>>>>> query
> >>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>> dynamic
> >>>>>>>>>>>>>>>>>>>>>>> params,
> >>>>>>>>>>>>>>>>>>>>>>>> I
> >>>>>>>>>>>>>>>>>>>>>>>>>>> would say it’s an awkward
> >>>>>> solution.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> "Hints should give "hints"
> >>>>> but
> >>>>>> not
> >>>>>>>>>>> affect the
> >>>>>>>>>>>>>>>> actual
> >>>>>>>>>>>>>>>>> produced
> >>>>>>>>>>>>>>>>>>>>>>>>> result.”
> >>>>>>>>>>>>>>>>>>>>>>>>>>> You mentioned that multiple
> >>>>>> times and
> >>>>>>>>>>> could we
> >>>>>>>>>>>>>>> give a
> >>>>>>>>>>>>>>>>> reason,
> >>>>>>>>>>>>>>>>>>>>>>> what’s
> >>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> problem there if we user the
> >>>>>> table
> >>>>>>>>> hints to
> >>>>>>>>>>>>>> support
> >>>>>>>>>>>>>>>>> “start offset”
> >>>>>>>>>>>>>>>>>>>>>>> ?
> >>>>>>>>>>>>>>>>>>>>>>>>> From
> >>>>>>>>>>>>>>>>>>>>>>>>>>> my side I saw some benefits for
> >>>>>> that:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> • It’s very convent to set up
> >>>>>> these
> >>>>>>>>>>> parameters,
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> syntax is
> >>>>>>>>>>>>>>>>>>>>>>> very
> >>>>>>>>>>>>>>>>>>>>>>>>> much
> >>>>>>>>>>>>>>>>>>>>>>>>>>> like the DDL definition
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> • It’s scope is very clear,
> >>>>>> right on
> >>>>>>>>> the
> >>>>>>>>>>> table
> >>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>> attathed
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> • It does not affect the
> >>>>> table
> >>>>>>>>> schema,
> >>>>>>>>>>> which
> >>>>>>>>>>>>>>> means
> >>>>>>>>>>>>>>>>> in order to
> >>>>>>>>>>>>>>>>>>>>>>>>> specify
> >>>>>>>>>>>>>>>>>>>>>>>>>>> the offset, there is no need to
> >>>>>> define
> >>>>>>>>> an
> >>>>>>>>>>> offset
> >>>>>>>>>>>>>>>>> column which is
> >>>>>>>>>>>>>>>>>>>>>>>> weird
> >>>>>>>>>>>>>>>>>>>>>>>>>>> actually, offset should never
> >>>>> be
> >>>>>> a
> >>>>>>>>> column,
> >>>>>>>>>>> it’s
> >>>>>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>> like a
> >>>>>>>>>>>>>>>>>>>>>>> metadata
> >>>>>>>>>>>>>>>>>>>>>>>>> or a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> start option.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> So in total, FLIP-110 uses
> >>>>> the
> >>>>>> offset
> >>>>>>>>>>> more
> >>>>>>>>>>>>>> like a
> >>>>>>>>>>>>>>>>> Hive partition
> >>>>>>>>>>>>>>>>>>>>>>>>> prune,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> we can do that if we have an
> >>>>>> offset
> >>>>>>>>>>> column, but
> >>>>>>>>>>>>>>> most
> >>>>>>>>>>>>>>>>> of the case we
> >>>>>>>>>>>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>>> define that, so there is
> >>>>>> actually no
> >>>>>>>>>>> conflict or
> >>>>>>>>>>>>>>>>> overlap.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月10日 +0800
> >>>>> PM4:28,Timo
> >>>>>>>>> Walther <
> >>>>>>>>>>>>>>>>> twal...@apache.org>,写道:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> shouldn't FLIP-110[1] solve
> >>>>>> most
> >>>>>>>>> of the
> >>>>>>>>>>>>>>> problems
> >>>>>>>>>>>>>>>>> we have around
> >>>>>>>>>>>>>>>>>>>>>>>>>> defining
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> table properties more
> >>>>>> dynamically
> >>>>>>>>>>> without
> >>>>>>>>>>>>>>> manual
> >>>>>>>>>>>>>>>>> schema work?
> >>>>>>>>>>>>>>>>>>>>>>> Also
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> offset definition is easier
> >>>>>> with
> >>>>>>>>> such a
> >>>>>>>>>>>>>> syntax.
> >>>>>>>>>>>>>>>>> They must not be
> >>>>>>>>>>>>>>>>>>>>>>>>>> defined
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> in catalog but could be
> >>>>>> temporary
> >>>>>>>>>>> tables that
> >>>>>>>>>>>>>>>>> extend from the
> >>>>>>>>>>>>>>>>>>>>>>>>> original
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> table.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> In general, we should aim
> >>>>> to
> >>>>>> keep
> >>>>>>>>> the
> >>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>>> concise and don't
> >>>>>>>>>>>>>>>>>>>>>>>>> provide
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> too many ways of doing the
> >>>>>> same
> >>>>>>>>> thing.
> >>>>>>>>>>> Hints
> >>>>>>>>>>>>>>>>> should give "hints"
> >>>>>>>>>>>>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> affect the actual produced
> >>>>>> result.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Some connector properties
> >>>>>> might
> >>>>>>>>> also
> >>>>>>>>>>> change
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> plan or schema
> >>>>>>>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> future. E.g. they might
> >>>>> also
> >>>>>> define
> >>>>>>>>>>> whether a
> >>>>>>>>>>>>>>>>> table source
> >>>>>>>>>>>>>>>>>>>>>>>> supports
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> certain push-downs (e.g.
> >>>>>> predicate
> >>>>>>>>>>>>>> push-down).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Dawid is currently working
> >>>>> a
> >>>>>> draft
> >>>>>>>>>>> that might
> >>>>>>>>>>>>>>>>> makes it possible
> >>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> expose a Kafka offset via
> >>>>> the
> >>>>>>>>> schema
> >>>>>>>>>>> such
> >>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>> `SELECT * FROM
> >>>>>>>>>>>>>>>>>>>>>>>> Topic
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> WHERE offset > 10` would
> >>>>>> become
> >>>>>>>>>>> possible and
> >>>>>>>>>>>>>>>> could
> >>>>>>>>>>>>>>>>> be pushed
> >>>>>>>>>>>>>>>>>>>>>>> down.
> >>>>>>>>>>>>>>>>>>>>>>>>> But
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> this is of course, not
> >>>>>> planned
> >>>>>>>>>>> initially.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 10.03.20 08:34, Danny
> >>>>> Chan
> >>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Wenlong ~
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For PROPERTIES Hint Error
> >>>>>>>>> handling
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Actually we have no way
> >>>>> to
> >>>>>>>>> figure out
> >>>>>>>>>>>>>>> whether a
> >>>>>>>>>>>>>>>>> error prone
> >>>>>>>>>>>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>>>>>>>>>>> is a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> PROPERTIES hint, for example,
> >>>>> if
> >>>>>> use
> >>>>>>>>>>> writes a
> >>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>>>>>>>> ‘PROPERTIAS’,
> >>>>>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>>>>>>>>>>>> not know if this hint is a
> >>>>>> PROPERTIES
> >>>>>>>>>>> hint, what
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>> know is that
> >>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>>>>>>>>>>>>> name was not registered in our
> >>>>>> Flink.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If the user writes the
> >>>>>> hint name
> >>>>>>>>>>> correctly
> >>>>>>>>>>>>>>>> (i.e.
> >>>>>>>>>>>>>>>>> PROPERTIES),
> >>>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>>> did
> >>>>>>>>>>>>>>>>>>>>>>>>>>> can enforce the validation of
> >>>>>> the hint
> >>>>>>>>>>> options
> >>>>>>>>>>>>>>> though
> >>>>>>>>>>>>>>>>> the pluggable
> >>>>>>>>>>>>>>>>>>>>>>>>>>> HintOptionChecker.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For PROPERTIES Hint
> >>>>> Option
> >>>>>> Format
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For a key value style
> >>>>> hint
> >>>>>>>>> option,
> >>>>>>>>>>> the key
> >>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>> be either a
> >>>>>>>>>>>>>>>>>>>>>>> simple
> >>>>>>>>>>>>>>>>>>>>>>>>>>> identifier or a string literal,
> >>>>>> which
> >>>>>>>>>>> means that
> >>>>>>>>>>>>>>> it’s
> >>>>>>>>>>>>>>>>> compatible
> >>>>>>>>>>>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>>>>>>> our
> >>>>>>>>>>>>>>>>>>>>>>>>>>> DDL syntax. We support simple
> >>>>>>>>> identifier
> >>>>>>>>>>> because
> >>>>>>>>>>>>>>> many
> >>>>>>>>>>>>>>>>> other hints
> >>>>>>>>>>>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>>>> have the component complex keys
> >>>>>> like
> >>>>>>>>> the
> >>>>>>>>>>> table
> >>>>>>>>>>>>>>>>> properties, and we
> >>>>>>>>>>>>>>>>>>>>>>>> want
> >>>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>> unify the parse block.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月10日 +0800
> >>>>>>>>>>> PM3:19,wenlong.lwl <
> >>>>>>>>>>>>>>>>> wenlong88....@gmail.com
> >>>>>>>>>>>>>>>>>>>>>>>>>> ,写道:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, thanks for
> >>>>> the
> >>>>>>>>> proposal.
> >>>>>>>>>>> +1 for
> >>>>>>>>>>>>>>>>> adding table hints,
> >>>>>>>>>>>>>>>>>>>>>>> it
> >>>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> really
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a necessary feature for
> >>>>>> flink
> >>>>>>>>> sql
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>> integrate
> >>>>>>>>>>>>>>>>> with a catalog.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For error handling, I
> >>>>>> think it
> >>>>>>>>>>> would be
> >>>>>>>>>>>>>>> more
> >>>>>>>>>>>>>>>>> natural to throw
> >>>>>>>>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> exception when error
> >>>>>> table hint
> >>>>>>>>>>> provided,
> >>>>>>>>>>>>>>>>> because the
> >>>>>>>>>>>>>>>>>>>>>>> properties
> >>>>>>>>>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will be merged and used
> >>>>>> to find
> >>>>>>>>>>> the table
> >>>>>>>>>>>>>>>>> factory which would
> >>>>>>>>>>>>>>>>>>>>>>>>> cause
> >>>>>>>>>>>>>>>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> exception when error
> >>>>>> properties
> >>>>>>>>>>> provided,
> >>>>>>>>>>>>>>>>> right? On the other
> >>>>>>>>>>>>>>>>>>>>>>>>> hand,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> unlike
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other hints which just
> >>>>>> affect
> >>>>>>>>> the
> >>>>>>>>>>> way to
> >>>>>>>>>>>>>>>>> execute the query,
> >>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> property
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table hint actually
> >>>>>> affects the
> >>>>>>>>>>> result of
> >>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> query, we should
> >>>>>>>>>>>>>>>>>>>>>>>>> never
> >>>>>>>>>>>>>>>>>>>>>>>>>>> ignore
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the given property
> >>>>> hints.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the format of
> >>>>>> property
> >>>>>>>>> hints,
> >>>>>>>>>>>>>>> currently,
> >>>>>>>>>>>>>>>>> in sql client, we
> >>>>>>>>>>>>>>>>>>>>>>>>>> accept
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> properties in format of
> >>>>>> string
> >>>>>>>>>>> only in
> >>>>>>>>>>>>>> DDL:
> >>>>>>>>>>>>>>>>>>>>>>>>>> 'connector.type'='kafka',
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> think the format of
> >>>>>> properties
> >>>>>>>>> in
> >>>>>>>>>>> hint
> >>>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>> be the same as
> >>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> format we
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> defined in ddl. What do
> >>>>>> you
> >>>>>>>>> think?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Bests,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Wenlong Lyu
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 10 Mar 2020 at
> >>>>>> 14:22,
> >>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Weike: About the
> >>>>>> Error
> >>>>>>>>> Handing
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To be consistent with
> >>>>>> other
> >>>>>>>>> SQL
> >>>>>>>>>>>>>> vendors,
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> default is to
> >>>>>>>>>>>>>>>>>>>>>>> log
> >>>>>>>>>>>>>>>>>>>>>>>>>>> warnings
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and if there is any
> >>>>>> error
> >>>>>>>>>>> (invalid hint
> >>>>>>>>>>>>>>>> name
> >>>>>>>>>>>>>>>>> or options), the
> >>>>>>>>>>>>>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>>>>>>>>>>>>> is just
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ignored. I have
> >>>>> already
> >>>>>>>>>>> addressed in
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> wiki.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Timo: About the
> >>>>>> PROPERTIES
> >>>>>>>>>>> Table
> >>>>>>>>>>>>>> Hint
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> • The properties
> >>>>> hints
> >>>>>> is
> >>>>>>>>> also
> >>>>>>>>>>>>>> optional,
> >>>>>>>>>>>>>>>>> user can pass in an
> >>>>>>>>>>>>>>>>>>>>>>>>> option
> >>>>>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> override the table
> >>>>>> properties
> >>>>>>>>>>> but this
> >>>>>>>>>>>>>>> does
> >>>>>>>>>>>>>>>>> not mean it is
> >>>>>>>>>>>>>>>>>>>>>>>>>> required.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> • They should not
> >>>>>> include
> >>>>>>>>>>> semantics:
> >>>>>>>>>>>>>> does
> >>>>>>>>>>>>>>>>> the properties
> >>>>>>>>>>>>>>>>>>>>>>> belong
> >>>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> semantic ? I don't
> >>>>>> think so,
> >>>>>>>>> the
> >>>>>>>>>>> plan
> >>>>>>>>>>>>>>> does
> >>>>>>>>>>>>>>>>> not change right ?
> >>>>>>>>>>>>>>>>>>>>>>>> The
> >>>>>>>>>>>>>>>>>>>>>>>>>>> result
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> set may be affected,
> >>>>>> but
> >>>>>>>>> there
> >>>>>>>>>>> are
> >>>>>>>>>>>>>>> already
> >>>>>>>>>>>>>>>>> some hints do so,
> >>>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>>>> example,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> MS-SQL MAXRECURSION
> >>>>> and
> >>>>>>>>> SNAPSHOT
> >>>>>>>>>>> hint
> >>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> • `SELECT * FROM
> >>>>> t(k=v,
> >>>>>>>>> k=v)`:
> >>>>>>>>>>> this
> >>>>>>>>>>>>>>> grammar
> >>>>>>>>>>>>>>>>> breaks the SQL
> >>>>>>>>>>>>>>>>>>>>>>>>> standard
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> compared to the hints
> >>>>>>>>> way(which
> >>>>>>>>>>> is
> >>>>>>>>>>>>>>> included
> >>>>>>>>>>>>>>>>> in comments)
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> • I actually didn't
> >>>>>> found any
> >>>>>>>>>>> vendors
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> support such
> >>>>>>>>>>>>>>>>>>>>>>> grammar,
> >>>>>>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>>>>> there
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is no way to override
> >>>>>> table
> >>>>>>>>> level
> >>>>>>>>>>>>>>>> properties
> >>>>>>>>>>>>>>>>> dynamically. For
> >>>>>>>>>>>>>>>>>>>>>>>>>> normal
> >>>>>>>>>>>>>>>>>>>>>>>>>>> RDBMS,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think there are no
> >>>>>> requests
> >>>>>>>>>>> for such
> >>>>>>>>>>>>>>>>> dynamic parameters
> >>>>>>>>>>>>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>>>>>>>> all the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table have the same
> >>>>>> storage
> >>>>>>>>> and
> >>>>>>>>>>>>>>> computation
> >>>>>>>>>>>>>>>>> and they are
> >>>>>>>>>>>>>>>>>>>>>>> almost
> >>>>>>>>>>>>>>>>>>>>>>>>> all
> >>>>>>>>>>>>>>>>>>>>>>>>>>> batch
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tables.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> • While Flink as a
> >>>>>>>>> computation
> >>>>>>>>>>> engine
> >>>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>> many connectors,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> especially for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> some message queue
> >>>>> like
> >>>>>>>>> Kafka,
> >>>>>>>>>>> we would
> >>>>>>>>>>>>>>>> have
> >>>>>>>>>>>>>>>>> a start_offset
> >>>>>>>>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different each time
> >>>>> we
> >>>>>> start
> >>>>>>>>> the
> >>>>>>>>>>> query,
> >>>>>>>>>>>>>>>> such
> >>>>>>>>>>>>>>>>> parameters can
> >>>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> persisted to catalog,
> >>>>>> because
> >>>>>>>>>>> it’s not
> >>>>>>>>>>>>>>>>> static, this is
> >>>>>>>>>>>>>>>>>>>>>>> actually
> >>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> background we propose
> >>>>>> the
> >>>>>>>>> table
> >>>>>>>>>>> hints
> >>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> indicate such
> >>>>>>>>>>>>>>>>>>>>>>>> properties
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dynamically.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> To Jark and Jinsong:
> >>>>> I
> >>>>>> have
> >>>>>>>>>>> removed the
> >>>>>>>>>>>>>>>>> query hints part and
> >>>>>>>>>>>>>>>>>>>>>>>>> change
> >>>>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> title.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月9日 +0800
> >>>>>> PM5:46,Timo
> >>>>>>>>>>> Walther <
> >>>>>>>>>>>>>>>>> twal...@apache.org
> >>>>>>>>>>>>>>>>>>>>>>>> ,写道:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> thanks for the
> >>>>>> proposal. I
> >>>>>>>>>>> agree with
> >>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>> and Jingsong.
> >>>>>>>>>>>>>>>>>>>>>>>> Planner
> >>>>>>>>>>>>>>>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and table hints are
> >>>>>>>>> orthogonal
> >>>>>>>>>>> topics
> >>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>> should be
> >>>>>>>>>>>>>>>>>>>>>>> discussed
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> separately.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I share Jingsong's
> >>>>>> opinion
> >>>>>>>>>>> that we
> >>>>>>>>>>>>>>> should
> >>>>>>>>>>>>>>>>> not use planner
> >>>>>>>>>>>>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> passing connector
> >>>>>>>>> properties.
> >>>>>>>>>>> Planner
> >>>>>>>>>>>>>>>>> hints should be
> >>>>>>>>>>>>>>>>>>>>>>> optional
> >>>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>>>> any
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> time. They should
> >>>>> not
> >>>>>>>>> include
> >>>>>>>>>>>>>> semantics
> >>>>>>>>>>>>>>>>> but only affect
> >>>>>>>>>>>>>>>>>>>>>>>>> execution
> >>>>>>>>>>>>>>>>>>>>>>>>>>> time.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Connector
> >>>>> properties
> >>>>>> are an
> >>>>>>>>>>> important
> >>>>>>>>>>>>>>>> part
> >>>>>>>>>>>>>>>>> of the query
> >>>>>>>>>>>>>>>>>>>>>>>> itself.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Have you thought
> >>>>>> about
> >>>>>>>>> options
> >>>>>>>>>>> such
> >>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>> `SELECT * FROM t(k=v,
> >>>>>>>>>>>>>>>>>>>>>>>>>> k=v)`?
> >>>>>>>>>>>>>>>>>>>>>>>>>>> How
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> are other vendors
> >>>>>> deal with
> >>>>>>>>>>> this
> >>>>>>>>>>>>>>> problem?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Timo
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On 09.03.20 10:37,
> >>>>>>>>> Jingsong Li
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny, +1 for
> >>>>>> table
> >>>>>>>>> hints,
> >>>>>>>>>>>>>> thanks
> >>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>> driving.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I took a look to
> >>>>>> FLIP,
> >>>>>>>>> most
> >>>>>>>>>>> of
> >>>>>>>>>>>>>>> content
> >>>>>>>>>>>>>>>>> are talking about
> >>>>>>>>>>>>>>>>>>>>>>>> query
> >>>>>>>>>>>>>>>>>>>>>>>>>>> hints.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hard to
> >>>>> discussion
> >>>>>> and
> >>>>>>>>>>> voting. So
> >>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>> split it as Jark
> >>>>>>>>>>>>>>>>>>>>>>>> said.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Another thing is
> >>>>>>>>>>> configuration that
> >>>>>>>>>>>>>>>>> suitable to config with
> >>>>>>>>>>>>>>>>>>>>>>>>> table
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "connector.path"
> >>>>>> and
> >>>>>>>>>>>>>>> "connector.topic",
> >>>>>>>>>>>>>>>>> Are they really
> >>>>>>>>>>>>>>>>>>>>>>>>> suitable
> >>>>>>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints? Looks
> >>>>> weird
> >>>>>> to me.
> >>>>>>>>>>> Because I
> >>>>>>>>>>>>>>>>> think these properties
> >>>>>>>>>>>>>>>>>>>>>>>> are
> >>>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> core of
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jingsong Lee
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar 9,
> >>>>>> 2020 at
> >>>>>>>>> 5:30
> >>>>>>>>>>> PM Jark
> >>>>>>>>>>>>>>> Wu
> >>>>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>>>> imj...@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Danny
> >>>>> for
> >>>>>>>>> starting
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> discussion.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> +1 for this
> >>>>>> feature.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we just
> >>>>> focus
> >>>>>> on the
> >>>>>>>>>>> table
> >>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>> not the query hints in
> >>>>>>>>>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> release,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> could you split
> >>>>>> the
> >>>>>>>>> FLIP
> >>>>>>>>>>> into two
> >>>>>>>>>>>>>>>>> FLIPs?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because it's
> >>>>>> hard to
> >>>>>>>>> vote
> >>>>>>>>>>> on
> >>>>>>>>>>>>>>> partial
> >>>>>>>>>>>>>>>>> part of a FLIP. You
> >>>>>>>>>>>>>>>>>>>>>>> can
> >>>>>>>>>>>>>>>>>>>>>>>>>> keep
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the table
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints proposal
> >>>>> in
> >>>>>>>>> FLIP-113
> >>>>>>>>>>> and
> >>>>>>>>>>>>>> move
> >>>>>>>>>>>>>>>>> query hints into
> >>>>>>>>>>>>>>>>>>>>>>> another
> >>>>>>>>>>>>>>>>>>>>>>>>>> FLIP.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So that we can
> >>>>>> focuse
> >>>>>>>>> on
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> table
> >>>>>>>>>>>>>>>>> hints in the FLIP.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, 9 Mar
> >>>>>> 2020 at
> >>>>>>>>>>> 17:14,
> >>>>>>>>>>>>>> DONG,
> >>>>>>>>>>>>>>>>> Weike <
> >>>>>>>>>>>>>>>>>>>>>>>>>> kyled...@connect.hku.hk
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Danny,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is a
> >>>>> nice
> >>>>>>>>> feature,
> >>>>>>>>>>> +1.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> One thing I
> >>>>> am
> >>>>>>>>>>> interested in
> >>>>>>>>>>>>>> but
> >>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>> mentioned in the
> >>>>>>>>>>>>>>>>>>>>>>>>> proposal
> >>>>>>>>>>>>>>>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> handling, as
> >>>>>> it is
> >>>>>>>>> quite
> >>>>>>>>>>> common
> >>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>> users to write
> >>>>>>>>>>>>>>>>>>>>>>>>>> inappropriate
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints in
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SQL code, if
> >>>>>> illegal
> >>>>>>>>> or
> >>>>>>>>>>> "bad"
> >>>>>>>>>>>>>>> hints
> >>>>>>>>>>>>>>>>> are given, would the
> >>>>>>>>>>>>>>>>>>>>>>>>> system
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> simply
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ignore them
> >>>>> or
> >>>>>> throw
> >>>>>>>>>>>>>> exceptions?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks : )
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Weike
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Mon, Mar
> >>>>> 9,
> >>>>>> 2020
> >>>>>>>>> at
> >>>>>>>>>>> 5:02 PM
> >>>>>>>>>>>>>>>> Danny
> >>>>>>>>>>>>>>>>> Chan <
> >>>>>>>>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Note:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> we only
> >>>>> plan
> >>>>>> to
> >>>>>>>>>>> support table
> >>>>>>>>>>>>>>>>> hints in Flink release
> >>>>>>>>>>>>>>>>>>>>>>> 1.11,
> >>>>>>>>>>>>>>>>>>>>>>>>> so
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> please
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> focus
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> mainly on
> >>>>>> the table
> >>>>>>>>>>> hints
> >>>>>>>>>>>>>> part
> >>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>> just ignore the
> >>>>>>>>>>>>>>>>>>>>>>> planner
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> hints, sorry
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that
> >>>>> mistake
> >>>>>> ~
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny Chan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 在 2020年3月9日
> >>>>>> +0800
> >>>>>>>>>>>>>> PM4:36,Danny
> >>>>>>>>>>>>>>>>> Chan <
> >>>>>>>>>>>>>>>>>>>>>>> yuzhao....@gmail.com
> >>>>>>>>>>>>>>>>>>>>>>>>>>> ,写道:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>> fellows ~
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would
> >>>>>> like to
> >>>>>>>>>>> propose the
> >>>>>>>>>>>>>>>>> supports for SQL hints for
> >>>>>>>>>>>>>>>>>>>>>>>> our
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Flink SQL.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We would
> >>>>>> support
> >>>>>>>>>>> hints
> >>>>>>>>>>>>>> syntax
> >>>>>>>>>>>>>>>> as
> >>>>>>>>>>>>>>>>> following:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> select
> >>>>> /*+
> >>>>>>>>>>> NO_HASH_JOIN,
> >>>>>>>>>>>>>>>>> RESOURCE(mem='128mb',
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parallelism='24') */
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> from
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> emp /*+
> >>>>>>>>> INDEX(idx1,
> >>>>>>>>>>> idx2)
> >>>>>>>>>>>>>> */
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> join
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> dept /*+
> >>>>>>>>>>>>>> PROPERTIES(k1='v1',
> >>>>>>>>>>>>>>>>> k2='v2') */
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> emp.deptno
> >>>>>> =
> >>>>>>>>>>> dept.deptno
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Basically
> >>>>>> we
> >>>>>>>>> would
> >>>>>>>>>>> support
> >>>>>>>>>>>>>>> both
> >>>>>>>>>>>>>>>>> query hints(after the
> >>>>>>>>>>>>>>>>>>>>>>>>> SELECT
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> keyword)
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and table
> >>>>>>>>> hints(after
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>> referenced table name), for
> >>>>>>>>>>>>>>>>>>>>>>>> 1.11,
> >>>>>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> plan to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> only
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> support
> >>>>>> table hints
> >>>>>>>>>>> with a
> >>>>>>>>>>>>>> hint
> >>>>>>>>>>>>>>>>> probably named
> >>>>>>>>>>>>>>>>>>>>>>> PROPERTIES:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> table_name
> >>>>>> /*+
> >>>>>>>>>>>>>>>>> PROPERTIES(k1='v1', k2='v2') *+/
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I am
> >>>>>> looking
> >>>>>>>>> forward
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> your
> >>>>>>>>>>>>>>>>> comments.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> You can
> >>>>>> access
> >>>>>>>>> the
> >>>>>>>>>>> FLIP
> >>>>>>>>>>>>>> here:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+SQL+and+Planner+Hints
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Danny
> >>>>> Chan
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
>
>

Reply via email to