@Danny sounds good.

Maybe it is worth listing all the classes of problems that you want to
address and then look at each class and see if hints are a good default
solution or a good optional way of simplifying things?
The discussion has grown a lot and it is starting to be hard to distinguish
the parts where everyone agrees from the parts were there are concerns.

On Thu, Mar 12, 2020 at 2:31 PM Danny Chan <danny0...@apache.org> wrote:

> Thanks Stephan ~
>
> We can remove the support for properties that may change the semantics of
> query if you think that is a trouble.
>
> How about we support the /*+ properties() */ hint only for those optimize
> parameters, such as the fetch size of source or something like that, does
> that make sense?
>
> Stephan Ewen <se...@apache.org>于2020年3月12日 周四下午7:45写道:
>
> > I think Bowen has actually put it very well.
> >
> > (1) Hints that change semantics looks like trouble waiting to happen. For
> > example Kafka offset handling should be in filters. The Kafka source
> should
> > support predicate pushdown.
> >
> > (2) Hints should not be a workaround for current shortcomings. A lot of
> the
> > suggested above sounds exactly like that. Working around catalog/DDL
> > shortcomings, missing exposure of metadata (offsets), missing predicate
> > pushdown in Kafka. Abusing a feature like hints now as a quick fix for
> > these issues, rather than fixing the root causes, will much likely bite
> us
> > back badly in the future.
> >
> > Best,
> > Stephan
> >
> >
> > On Thu, Mar 12, 2020 at 10:43 AM Kurt Young <ykt...@gmail.com> wrote:
> >
> > > It seems this FLIP's name is somewhat misleading. From my
> understanding,
> > > this FLIP is trying to
> > > address the dynamic parameter issue, and table hints is the way we wan
> to
> > > choose. I think we should
> > > be focus on "what's the right way to solve dynamic property" instead of
> > > discussing "whether table
> > > hints can affect query semantics".
> > >
> > > For now, there are two proposed ways to achieve dynamic property:
> > > 1. FLIP-110: create temporary table xx like xx with (xxx)
> > > 2. use custom "from t with (xxx)" syntax
> > > 3. "Borrow" the table hints to have a special PROPERTIES hint.
> > >
> > > The first one didn't break anything, but the only problem i see is a
> > little
> > > more verbose than the table hint
> > > approach. I can imagine when someone using SQL CLI to have a sql
> > > experience, it's quite often that
> > > he will modify the table property, some use cases i can think of:
> > > 1. the source contains some corrupted data, i want to turn on the
> > > "ignore-error" flag for certain formats.
> > > 2. I have a kafka table and want to see some sample data from the
> > > beginning, so i change the offset
> > > to "earliest", and then I want to observe the latest data which keeps
> > > coming in. I would write another query
> > > to select from the latest table.
> > > 3. I want to my jdbc sink flush data more eagerly then i can observe
> the
> > > data from database side.
> > >
> > > Most of such use cases are quite ad-hoc. If every time I want to have a
> > > different experience, i need to create
> > > a temporary table and then also modify my query, it doesn't feel
> smooth.
> > > Embed such dynamic property into
> > > query would have better user experience.
> > >
> > > Both 2 & 3 can make this happen. The cons of #2 is breaking SQL
> > compliant,
> > > and for #3, it only breaks some
> > > unwritten rules, but we can have an explanation on that. And I really
> > doubt
> > > whether user would complain about
> > > this when they actually have flexible and good experience using this.
> > >
> > > My tendency would be #3 > #1 > #2, what do you think?
> > >
> > > Best,
> > > Kurt
> > >
> > >
> > > On Thu, Mar 12, 2020 at 1:11 PM Danny Chan <yuzhao....@gmail.com>
> wrote:
> > >
> > > > Thanks Aljoscha ~
> > > >
> > > > I agree for most of the query hints, they are optional as an
> optimizer
> > > > instruction, especially for the traditional RDBMS.
> > > >
> > > > But, just like BenChao said, Flink as a computation engine has many
> > > > different kind of data sources, thus, dynamic parameters like
> > > start_offest
> > > > can only bind to each table scope, we can not set a session config
> like
> > > > KSQL because they are all about Kafka:
> > > > > SET ‘auto.offset.reset’=‘earliest’;
> > > >
> > > > Thus the most flexible way to set up these dynamic params is to bind
> to
> > > > the table scope in the query when we want to override something, so
> we
> > > have
> > > > these solutions above (with pros and cons from my side):
> > > >
> > > > • 1. Select * from t(offset=123) (from Timo)
> > > >
> > > >            Pros:
> > > >              - Easy to add
> > > >              - Parameters are part of the main query
> > > >            Cons:
> > > >              - Not SQL compliant
> > > >
> > > >
> > > > • 2. Select * from t /*+ PROPERTIES(offset=123) */ (from me)
> > > >
> > > >            Pros:
> > > >            - Easy to add
> > > >            - SQL compliant because it is nested in the comments
> > > >
> > > >            Cons:
> > > >            - Parameters are not part of the main query
> > > >            - Cryptic syntax for new users
> > > >
> > > > The biggest problem for hints way may be the “if hints must be
> > optional”,
> > > > actually we have though about 1 for a while but aborted because it
> > breaks
> > > > the SQL standard too much. And we replace it with 2, because the
> hints
> > > > syntax do not break SQL standard(nested in comments).
> > > >
> > > > What if we have the special /*+ PROPERTIES */ hint that allows
> override
> > > > some properties of table dynamically, it does not break anything, at
> > > lease
> > > > for current Flink use cases.
> > > >
> > > > Planner hints are optional just because they are naturally enforcers
> of
> > > > the planner, most of them aim to instruct the optimizer, but, the
> table
> > > > hints is a little different, table hints can specify the table meta
> > like
> > > > index column, and it is very convenient to specify table properties.
> > > >
> > > > Or shall we not call  /*+ PROPERTIES(offset=123) */ table hint, we
> can
> > > > call it table dynamic parameters.
> > > >
> > > > Best,
> > > > Danny Chan
> > > > 在 2020年3月11日 +0800 PM9:20,Aljoscha Krettek <aljos...@apache.org>,写道:
> > > > > Hi,
> > > > >
> > > > > I don't understand this discussion. Hints, as I understand them,
> > should
> > > > > work like this:
> > > > >
> > > > > - hints are *optional* advice for the optimizer to try and help it
> to
> > > > > find a good execution strategy
> > > > > - hints should not change query semantics, i.e. they should not
> > change
> > > > > connector properties executing a query with taking into account the
> > > > > hints *must* produce the same result as executing the query without
> > > > > taking into account the hints
> > > > >
> > > > > From these simple requirements you can derive a solution that makes
> > > > > sense. I don't have a strong preference for the syntax but we
> should
> > > > > strive to be in line with prior work.
> > > > >
> > > > > Best,
> > > > > Aljoscha
> > > > >
> > > > > On 11.03.20 11:53, Danny Chan wrote:
> > > > > > Thanks Timo for summarize the 3 options ~
> > > > > >
> > > > > > I agree with Kurt that option2 is too complicated to use because:
> > > > > >
> > > > > > • As a Kafka topic consumer, the user must define both the
> virtual
> > > > column for start offset and he must apply a special filter predicate
> > > after
> > > > each query
> > > > > > • And for the internal implementation, the metadata column push
> > down
> > > > is another hard topic, each kind of message queue may have its offset
> > > > attribute, we need to consider the expression type for different
> kind;
> > > the
> > > > source also need to recognize the constant column as a config
> > > option(which
> > > > is weird because usually what we pushed down is a table column)
> > > > > >
> > > > > > For option 1 and option3, I think there is no difference, option1
> > is
> > > > also a hint syntax which is introduced in Sybase and referenced then
> > > > deprecated by MS-SQL in 199X years because of the ambitiousness.
> > > Personally
> > > > I prefer /*+ */ style table hint than WITH keyword for these reasons:
> > > > > >
> > > > > > • We do not break the standard SQL, the hints are nested in SQL
> > > > comments
> > > > > > • We do not need to introduce additional WITH keyword which may
> > > appear
> > > > in a query if we use that because a table can be referenced in all
> > kinds
> > > of
> > > > SQL contexts: INSERT/DELETE/FROM/JOIN …. That would make our sql
> query
> > > > break too much of the SQL from standard
> > > > > > • We would have uniform syntax for hints as query hint, one
> syntax
> > > > fits all and more easy to use
> > > > > >
> > > > > >
> > > > > > And here is the reason why we choose a uniform Oracle style query
> > > > hint syntax which is addressed by Julian Hyde when we design the
> syntax
> > > > from the Calcite community:
> > > > > >
> > > > > > I don’t much like the MSSQL-style syntax for table hints. It
> adds a
> > > > new use of the WITH keyword that is unrelated to the use of WITH for
> > > > common-table expressions.
> > > > > >
> > > > > > A historical note. Microsoft SQL Server inherited its hint syntax
> > > from
> > > > Sybase a very long time ago. (See “Transact SQL Programming”[1], page
> > > 632,
> > > > “Optimizer hints”. The book was written in 1999, and covers Microsoft
> > SQL
> > > > Server 6.5 / 7.0 and Sybase Adaptive Server 11.5, but the syntax very
> > > > likely predates Sybase 4.3, from which Microsoft SQL Server was
> forked
> > in
> > > > 1993.)
> > > > > >
> > > > > > Microsoft later added the WITH keyword to make it less ambiguous,
> > and
> > > > has now deprecated the syntax that does not use WITH.
> > > > > >
> > > > > > They are forced to keep the syntax for backwards compatibility
> but
> > > > that doesn’t mean that we should shoulder their burden.
> > > > > >
> > > > > > I think formatted comments are the right container for hints
> > because
> > > > it allows us to change the hint syntax without changing the SQL
> parser,
> > > and
> > > > makes clear that we are at liberty to ignore hints entirely.
> > > > > >
> > > > > > Julian
> > > > > >
> > > > > > [1] https://www.amazon.com/s?k=9781565924017 <
> > > > https://www.amazon.com/s?k=9781565924017>
> > > > > >
> > > > > > Best,
> > > > > > Danny Chan
> > > > > > 在 2020年3月11日 +0800 PM4:03,Timo Walther <twal...@apache.org>,写道:
> > > > > > > Hi Danny,
> > > > > > >
> > > > > > > it is true that our DDL is not standard compliant by using the
> > WITH
> > > > > > > clause. Nevertheless, we aim for not diverging too much and the
> > > LIKE
> > > > > > > clause is an example of that. It will solve things like
> > overwriting
> > > > > > > WATERMARKs, add additional/modifying properties and inherit
> > schema.
> > > > > > >
> > > > > > > Bowen is right that Flink's DDL is mixing 3 types definition
> > > > together.
> > > > > > > We are not the first ones that try to solve this. There is also
> > the
> > > > SQL
> > > > > > > MED standard [1] that tried to tackle this problem. I think it
> > was
> > > > not
> > > > > > > considered when designing the current DDL.
> > > > > > >
> > > > > > > Currently, I see 3 options for handling Kafka offsets. I will
> > give
> > > > some
> > > > > > > examples and look forward to feedback here:
> > > > > > >
> > > > > > > *Option 1* Runtime and semantic parms as part of the query
> > > > > > >
> > > > > > > `SELECT * FROM MyTable('offset'=123)`
> > > > > > >
> > > > > > > Pros:
> > > > > > > - Easy to add
> > > > > > > - Parameters are part of the main query
> > > > > > > - No complicated hinting syntax
> > > > > > >
> > > > > > > Cons:
> > > > > > > - Not SQL compliant
> > > > > > >
> > > > > > > *Option 2* Use metadata in query
> > > > > > >
> > > > > > > `CREATE TABLE MyTable (id INT, offset AS
> > > SYSTEM_METADATA('offset'))`
> > > > > > >
> > > > > > > `SELECT * FROM MyTable WHERE offset > TIMESTAMP '2012-12-12
> > > > 12:34:22'`
> > > > > > >
> > > > > > > Pros:
> > > > > > > - SQL compliant in the query
> > > > > > > - Access of metadata in the DDL which is required anyway
> > > > > > > - Regular pushdown rules apply
> > > > > > >
> > > > > > > Cons:
> > > > > > > - Users need to add an additional comlumn in the DDL
> > > > > > >
> > > > > > > *Option 3*: Use hints for properties
> > > > > > >
> > > > > > > `
> > > > > > > SELECT *
> > > > > > > FROM MyTable /*+ PROPERTIES('offset'=123) */
> > > > > > > `
> > > > > > >
> > > > > > > Pros:
> > > > > > > - Easy to add
> > > > > > >
> > > > > > > Cons:
> > > > > > > - Parameters are not part of the main query
> > > > > > > - Cryptic syntax for new users
> > > > > > > - Not standard compliant.
> > > > > > >
> > > > > > > If we go with this option, I would suggest to make it available
> > in
> > > a
> > > > > > > separate map and don't mix it with statically defined
> properties.
> > > > Such
> > > > > > > that the factory can decide which properties have the right to
> be
> > > > > > > overwritten by the hints:
> > > > > > > TableSourceFactory.Context.getQueryHints(): ReadableConfig
> > > > > > >
> > > > > > > Regards,
> > > > > > > Timo
> > > > > > >
> > > > > > > [1] https://en.wikipedia.org/wiki/SQL/MED
> > > > > > >
> > > > > > > Currently I see 3 options as a
> > > > > > >
> > > > > > >
> > > > > > > On 11.03.20 07:21, Danny Chan wrote:
> > > > > > > > Thanks Bowen ~
> > > > > > > >
> > > > > > > > I agree we should somehow categorize our connector
> parameters.
> > > > > > > >
> > > > > > > > For type1, I’m already preparing a solution like the
> Confluent
> > > > schema registry + Avro schema inference thing, so this may not be a
> > > problem
> > > > in the near future.
> > > > > > > >
> > > > > > > > For type3, I have some questions:
> > > > > > > >
> > > > > > > > > "SELECT * FROM mykafka WHERE offset > 12pm yesterday”
> > > > > > > >
> > > > > > > > Where does the offset column come from, a virtual column from
> > the
> > > > table schema, you said that
> > > > > > > >
> > > > > > > > > They change
> > > > > > > > almost every time a query starts and have nothing to do with
> > > > metadata, thus
> > > > > > > > should not be part of table definition/DDL
> > > > > > > >
> > > > > > > > But why you can reference it in the query, I’m confused for
> > that,
> > > > can you elaborate a little ?
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Danny Chan
> > > > > > > > 在 2020年3月11日 +0800 PM12:52,Bowen Li <bowenl...@gmail.com
> >,写道:
> > > > > > > > > Thanks Danny for kicking off the effort
> > > > > > > > >
> > > > > > > > > The root cause of too much manual work is Flink DDL has
> > mixed 3
> > > > types of
> > > > > > > > > params together and doesn't handle each of them very well.
> > > Below
> > > > are how I
> > > > > > > > > categorize them and corresponding solutions in my mind:
> > > > > > > > >
> > > > > > > > > - type 1: Metadata of external data, like external
> > > endpoint/url,
> > > > > > > > > username/pwd, schemas, formats.
> > > > > > > > >
> > > > > > > > > Such metadata are mostly already accessible in external
> > system
> > > > as long as
> > > > > > > > > endpoints and credentials are provided. Flink can get it
> thru
> > > > catalogs, but
> > > > > > > > > we haven't had many catalogs yet and thus Flink just hasn't
> > > been
> > > > able to
> > > > > > > > > leverage that. So the solution should be building more
> > > catalogs.
> > > > Such
> > > > > > > > > params should be part of a Flink table DDL/definition, and
> > not
> > > > overridable
> > > > > > > > > in any means.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > - type 2: Runtime params, like jdbc connector's fetch size,
> > > > elasticsearch
> > > > > > > > > connector's bulk flush size.
> > > > > > > > >
> > > > > > > > > Such params don't affect query results, but affect how
> > results
> > > > are produced
> > > > > > > > > (eg. fast or slow, aka performance) - they are essentially
> > > > execution and
> > > > > > > > > implementation details. They change often in exploration or
> > > > development
> > > > > > > > > stages, but not quite frequently in well-defined
> long-running
> > > > pipelines.
> > > > > > > > > They should always have default values and can be missing
> in
> > > > query. They
> > > > > > > > > can be part of a table DDL/definition, but should also be
> > > > replaceable in a
> > > > > > > > > query - *this is what table "hints" in FLIP-113 should
> > cover*.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > - type 3: Semantic params, like kafka connector's start
> > offset.
> > > > > > > > >
> > > > > > > > > Such params affect query results - the semantics. They'd
> > better
> > > > be as
> > > > > > > > > filter conditions in WHERE clause that can be pushed down.
> > They
> > > > change
> > > > > > > > > almost every time a query starts and have nothing to do
> with
> > > > metadata, thus
> > > > > > > > > should not be part of table definition/DDL, nor be
> persisted
> > in
> > > > catalogs.
> > > > > > > > > If they will, users should create views to keep such params
> > > > around (note
> > > > > > > > > this is different from variable substitution).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Take Flink-Kafka as an example. Once we get these params
> > right,
> > > > here're the
> > > > > > > > > steps users need to do to develop and run a Flink job:
> > > > > > > > > - configure a Flink ConfluentSchemaRegistry with url,
> > username,
> > > > and password
> > > > > > > > > - run "SELECT * FROM mykafka WHERE offset > 12pm yesterday"
> > > > (simplified
> > > > > > > > > timestamp) in SQL CLI, Flink automatically retrieves all
> > > > metadata of
> > > > > > > > > schema, file format, etc and start the job
> > > > > > > > > - users want to make the job read Kafka topic faster, so it
> > > goes
> > > > as "SELECT
> > > > > > > > > * FROM mykafka /* faster_read_key=value*/ WHERE offset >
> 12pm
> > > > yesterday"
> > > > > > > > > - done and satisfied, users submit it to production
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Regarding "CREATE TABLE t LIKE with (k1=v1, k2=v2), I think
> > > it's
> > > > a
> > > > > > > > > nice-to-have feature, but not a strategically critical,
> > > > long-term solution,
> > > > > > > > > because
> > > > > > > > > 1) It may seem promising at the current stage to solve the
> > > > > > > > > too-much-manual-work problem, but that's only because Flink
> > > > hasn't
> > > > > > > > > leveraged catalogs well and handled the 3 types of params
> > above
> > > > properly.
> > > > > > > > > Once we get the params types right, the LIKE syntax won't
> be
> > > that
> > > > > > > > > important, and will be just an easier way to create tables
> > > > without retyping
> > > > > > > > > long fields like username and pwd.
> > > > > > > > > 2) Note that only some rare type of catalog can store k-v
> > > > property pair, so
> > > > > > > > > table created this way often cannot be persisted. In the
> > > > foreseeable
> > > > > > > > > future, such catalog will only be HiveCatalog, and not
> > everyone
> > > > has a Hive
> > > > > > > > > metastore. To be honest, without persistence, recreating
> > tables
> > > > every time
> > > > > > > > > this way is still a lot of keyboard typing.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Bowen
> > > > > > > > >
> > > > > > > > > On Tue, Mar 10, 2020 at 8:07 PM Kurt Young <
> ykt...@gmail.com
> > >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > If a specific connector want to have such parameter and
> > read
> > > > if out of
> > > > > > > > > > configuration, then that's fine.
> > > > > > > > > > If we are talking about a configuration for all kinds of
> > > > sources, I would
> > > > > > > > > > be super careful about that.
> > > > > > > > > > It's true it can solve maybe 80% cases, but it will also
> > make
> > > > the left 20%
> > > > > > > > > > feels weird.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Kurt
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 11, 2020 at 11:00 AM Jark Wu <
> imj...@gmail.com
> > >
> > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Kurt,
> > > > > > > > > > >
> > > > > > > > > > > #3 Regarding to global offset:
> > > > > > > > > > > I'm not saying to use the global configuration to
> > override
> > > > connector
> > > > > > > > > > > properties by the planner.
> > > > > > > > > > > But the connector should take this configuration and
> > > > translate into their
> > > > > > > > > > > client API.
> > > > > > > > > > > AFAIK, almost all the message queues support eariliest
> > and
> > > > latest and a
> > > > > > > > > > > timestamp value as start point.
> > > > > > > > > > > So we can support 3 options for this configuration:
> > > > "eariliest", "latest"
> > > > > > > > > > > and a timestamp string value.
> > > > > > > > > > > Of course, this can't solve 100% cases, but I guess can
> > > > sovle 80% or 90%
> > > > > > > > > > > cases.
> > > > > > > > > > > And the remaining cases can be resolved by LIKE syntax
> > > which
> > > > I guess is
> > > > > > > > > > not
> > > > > > > > > > > very common cases.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Jark
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, 11 Mar 2020 at 10:33, Kurt Young <
> > ykt...@gmail.com
> > > >
> > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Good to have such lovely discussions. I also want to
> > > share
> > > > some of my
> > > > > > > > > > > > opinions.
> > > > > > > > > > > >
> > > > > > > > > > > > #1 Regarding to error handling: I also think ignore
> > > > invalid hints would
> > > > > > > > > > > be
> > > > > > > > > > > > dangerous, maybe
> > > > > > > > > > > > the simplest solution is just throw an exception.
> > > > > > > > > > > >
> > > > > > > > > > > > #2 Regarding to property replacement: I don't think
> we
> > > > should
> > > > > > > > > > constraint
> > > > > > > > > > > > ourself to
> > > > > > > > > > > > the meaning of the word "hint", and forbidden it
> > > modifying
> > > > any
> > > > > > > > > > properties
> > > > > > > > > > > > which can effect
> > > > > > > > > > > > query results. IMO `PROPERTIES` is one of the table
> > > hints,
> > > > and a
> > > > > > > > > > powerful
> > > > > > > > > > > > one. It can
> > > > > > > > > > > > modify properties located in DDL's WITH block. But I
> > also
> > > > see the harm
> > > > > > > > > > > that
> > > > > > > > > > > > if we make it
> > > > > > > > > > > > too flexible like change the kafka topic name with a
> > > hint.
> > > > Such use
> > > > > > > > > > case
> > > > > > > > > > > is
> > > > > > > > > > > > not common and
> > > > > > > > > > > > sounds very dangerous to me. I would propose we have
> a
> > > map
> > > > of hintable
> > > > > > > > > > > > properties for each
> > > > > > > > > > > > connector, and should validate all passed in
> properties
> > > > are actually
> > > > > > > > > > > > hintable. And combining with
> > > > > > > > > > > > #1 error handling, we can throw an exception once
> > > received
> > > > invalid
> > > > > > > > > > > > property.
> > > > > > > > > > > >
> > > > > > > > > > > > #3 Regarding to global offset: I'm not sure it's
> > > feasible.
> > > > Different
> > > > > > > > > > > > connectors will have totally
> > > > > > > > > > > > different properties to represent offset, some might
> be
> > > > timestamps,
> > > > > > > > > > some
> > > > > > > > > > > > might be string literals
> > > > > > > > > > > > like "earliest", and others might be just integers.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Kurt
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 10, 2020 at 11:46 PM Jark Wu <
> > > imj...@gmail.com>
> > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I want to jump in the discussion about the "dynamic
> > > > start offset"
> > > > > > > > > > > > problem.
> > > > > > > > > > > > > First of all, I share the same concern with Timo
> and
> > > > Fabian, that the
> > > > > > > > > > > > > "start offset" affects the query semantics, i.e.
> the
> > > > query result.
> > > > > > > > > > > > > But "hints" is just used for optimization which
> > should
> > > > affect the
> > > > > > > > > > > result?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think the "dynamic start offset" is an very
> > important
> > > > usability
> > > > > > > > > > > problem
> > > > > > > > > > > > > which will be faced by many streaming platforms.
> > > > > > > > > > > > > I also agree "CREATE TEMPORARY TABLE Temp (LIKE t)
> > WITH
> > > > > > > > > > > > > ('connector.startup-timestamp-millis' =
> > > > '1578538374471')" is verbose,
> > > > > > > > > > > > what
> > > > > > > > > > > > > if we have 10 tables to join?
> > > > > > > > > > > > >
> > > > > > > > > > > > > However, what I want to propose (should be another
> > > > thread) is a
> > > > > > > > > > global
> > > > > > > > > > > > > configuration to reset start offsets of all the
> > source
> > > > connectors
> > > > > > > > > > > > > in the query session, e.g.
> > > "table.sources.start-offset".
> > > > This is
> > > > > > > > > > > possible
> > > > > > > > > > > > > now because `TableSourceFactory.Context` has
> > > > `getConfiguration`
> > > > > > > > > > > > > method to get the session configuration, and use it
> > to
> > > > create an
> > > > > > > > > > > adapted
> > > > > > > > > > > > > TableSource.
> > > > > > > > > > > > > Then we can also expose to SQL CLI via SET command,
> > > e.g.
> > > > `SET
> > > > > > > > > > > > > 'table.sources.start-offset'='earliest';`, which is
> > > > pretty simple and
> > > > > > > > > > > > > straightforward.
> > > > > > > > > > > > >
> > > > > > > > > > > > > This is very similar to KSQL's `SET
> > > > 'auto.offset.reset'='earliest'`
> > > > > > > > > > > which
> > > > > > > > > > > > > is very helpful IMO.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Jark
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, 10 Mar 2020 at 22:29, Timo Walther <
> > > > twal...@apache.org>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > compared to the hints, FLIP-110 is fully
> compliant
> > to
> > > > the SQL
> > > > > > > > > > > standard.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I don't think that `CREATE TEMPORARY TABLE Temp
> > (LIKE
> > > > t) WITH
> > > > > > > > > > (k=v)`
> > > > > > > > > > > is
> > > > > > > > > > > > > > too verbose or awkward for the power of basically
> > > > changing the
> > > > > > > > > > entire
> > > > > > > > > > > > > > connector. Usually, this statement would just
> > precede
> > > > the query in
> > > > > > > > > > a
> > > > > > > > > > > > > > multiline file. So it can be change "in-place"
> like
> > > > the hints you
> > > > > > > > > > > > > proposed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Many companies have a well-defined set of tables
> > that
> > > > should be
> > > > > > > > > > used.
> > > > > > > > > > > > It
> > > > > > > > > > > > > > would be dangerous if users can change the path
> or
> > > > topic in a hint.
> > > > > > > > > > > The
> > > > > > > > > > > > > > catalog/catalog manager should be the entity that
> > > > controls which
> > > > > > > > > > > tables
> > > > > > > > > > > > > > exist and how they can be accessed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > what’s the problem there if we user the table
> > hints
> > > > to support
> > > > > > > > > > > > “start
> > > > > > > > > > > > > > offset”?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > IMHO it violates the meaning of a hint. According
> > to
> > > > the
> > > > > > > > > > dictionary,
> > > > > > > > > > > a
> > > > > > > > > > > > > > hint is "a statement that expresses indirectly
> what
> > > > one prefers not
> > > > > > > > > > > to
> > > > > > > > > > > > > > say explicitly". But offsets are a property that
> > are
> > > > very explicit.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we go with the hint approach, it should be
> > > > expressible in the
> > > > > > > > > > > > > > TableSourceFactory which properties are supported
> > for
> > > > hinting. Or
> > > > > > > > > > do
> > > > > > > > > > > > you
> > > > > > > > > > > > > > plan to offer those hints in a separate
> Map<String,
> > > > String> that
> > > > > > > > > > > cannot
> > > > > > > > > > > > > > overwrite existing properties? I think this would
> > be
> > > a
> > > > different
> > > > > > > > > > > > story...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > Timo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On 10.03.20 10:34, Danny Chan wrote:
> > > > > > > > > > > > > > > Thanks Timo ~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Personally I would say that offset > 0 and
> start
> > > > offset = 10 does
> > > > > > > > > > > not
> > > > > > > > > > > > > > have the same semantic, so from the SQL aspect,
> we
> > > can
> > > > not
> > > > > > > > > > implement
> > > > > > > > > > > a
> > > > > > > > > > > > > > “starting offset” hint for query with such a
> > syntax.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > And the CREATE TABLE LIKE syntax is a DDL which
> > is
> > > > just verbose
> > > > > > > > > > for
> > > > > > > > > > > > > > defining such dynamic parameters even if it could
> > do
> > > > that, shall we
> > > > > > > > > > > > force
> > > > > > > > > > > > > > users to define a temporal table for each query
> > with
> > > > dynamic
> > > > > > > > > > params,
> > > > > > > > > > > I
> > > > > > > > > > > > > > would say it’s an awkward solution.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > "Hints should give "hints" but not affect the
> > > actual
> > > > produced
> > > > > > > > > > > > result.”
> > > > > > > > > > > > > > You mentioned that multiple times and could we
> > give a
> > > > reason,
> > > > > > > > > > what’s
> > > > > > > > > > > > the
> > > > > > > > > > > > > > problem there if we user the table hints to
> support
> > > > “start offset”
> > > > > > > > > > ?
> > > > > > > > > > > > From
> > > > > > > > > > > > > > my side I saw some benefits for that:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > • It’s very convent to set up these parameters,
> > the
> > > > syntax is
> > > > > > > > > > very
> > > > > > > > > > > > much
> > > > > > > > > > > > > > like the DDL definition
> > > > > > > > > > > > > > > • It’s scope is very clear, right on the table
> it
> > > > attathed
> > > > > > > > > > > > > > > • It does not affect the table schema, which
> > means
> > > > in order to
> > > > > > > > > > > > specify
> > > > > > > > > > > > > > the offset, there is no need to define an offset
> > > > column which is
> > > > > > > > > > > weird
> > > > > > > > > > > > > > actually, offset should never be a column, it’s
> > more
> > > > like a
> > > > > > > > > > metadata
> > > > > > > > > > > > or a
> > > > > > > > > > > > > > start option.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So in total, FLIP-110 uses the offset more
> like a
> > > > Hive partition
> > > > > > > > > > > > prune,
> > > > > > > > > > > > > > we can do that if we have an offset column, but
> > most
> > > > of the case we
> > > > > > > > > > > do
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > define that, so there is actually no conflict or
> > > > overlap.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > 在 2020年3月10日 +0800 PM4:28,Timo Walther <
> > > > twal...@apache.org>,写道:
> > > > > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > shouldn't FLIP-110[1] solve most of the
> > problems
> > > > we have around
> > > > > > > > > > > > > defining
> > > > > > > > > > > > > > > > table properties more dynamically without
> > manual
> > > > schema work?
> > > > > > > > > > Also
> > > > > > > > > > > > > > > > offset definition is easier with such a
> syntax.
> > > > They must not be
> > > > > > > > > > > > > defined
> > > > > > > > > > > > > > > > in catalog but could be temporary tables that
> > > > extend from the
> > > > > > > > > > > > original
> > > > > > > > > > > > > > > > table.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > In general, we should aim to keep the syntax
> > > > concise and don't
> > > > > > > > > > > > provide
> > > > > > > > > > > > > > > > too many ways of doing the same thing. Hints
> > > > should give "hints"
> > > > > > > > > > > but
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > affect the actual produced result.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Some connector properties might also change
> the
> > > > plan or schema
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > future. E.g. they might also define whether a
> > > > table source
> > > > > > > > > > > supports
> > > > > > > > > > > > > > > > certain push-downs (e.g. predicate
> push-down).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Dawid is currently working a draft that might
> > > > makes it possible
> > > > > > > > > > to
> > > > > > > > > > > > > > > > expose a Kafka offset via the schema such
> that
> > > > `SELECT * FROM
> > > > > > > > > > > Topic
> > > > > > > > > > > > > > > > WHERE offset > 10` would become possible and
> > > could
> > > > be pushed
> > > > > > > > > > down.
> > > > > > > > > > > > But
> > > > > > > > > > > > > > > > this is of course, not planned initially.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > Timo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-110%3A+Support+LIKE+clause+in+CREATE+TABLE
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On 10.03.20 08:34, Danny Chan wrote:
> > > > > > > > > > > > > > > > > Thanks Wenlong ~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > For PROPERTIES Hint Error handling
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Actually we have no way to figure out
> > whether a
> > > > error prone
> > > > > > > > > > hint
> > > > > > > > > > > > is a
> > > > > > > > > > > > > > PROPERTIES hint, for example, if use writes a
> hint
> > > like
> > > > > > > > > > ‘PROPERTIAS’,
> > > > > > > > > > > > we
> > > > > > > > > > > > > do
> > > > > > > > > > > > > > not know if this hint is a PROPERTIES hint, what
> we
> > > > know is that
> > > > > > > > > > the
> > > > > > > > > > > > hint
> > > > > > > > > > > > > > name was not registered in our Flink.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If the user writes the hint name correctly
> > > (i.e.
> > > > PROPERTIES),
> > > > > > > > > > we
> > > > > > > > > > > > did
> > > > > > > > > > > > > > can enforce the validation of the hint options
> > though
> > > > the pluggable
> > > > > > > > > > > > > > HintOptionChecker.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > For PROPERTIES Hint Option Format
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > For a key value style hint option, the key
> > can
> > > > be either a
> > > > > > > > > > simple
> > > > > > > > > > > > > > identifier or a string literal, which means that
> > it’s
> > > > compatible
> > > > > > > > > > with
> > > > > > > > > > > > our
> > > > > > > > > > > > > > DDL syntax. We support simple identifier because
> > many
> > > > other hints
> > > > > > > > > > do
> > > > > > > > > > > > not
> > > > > > > > > > > > > > have the component complex keys like the table
> > > > properties, and we
> > > > > > > > > > > want
> > > > > > > > > > > > to
> > > > > > > > > > > > > > unify the parse block.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > > > 在 2020年3月10日 +0800 PM3:19,wenlong.lwl <
> > > > wenlong88....@gmail.com
> > > > > > > > > > > > > ,写道:
> > > > > > > > > > > > > > > > > > Hi Danny, thanks for the proposal. +1 for
> > > > adding table hints,
> > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > > really
> > > > > > > > > > > > > > > > > > a necessary feature for flink sql to
> > > integrate
> > > > with a catalog.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > For error handling, I think it would be
> > more
> > > > natural to throw
> > > > > > > > > > an
> > > > > > > > > > > > > > > > > > exception when error table hint provided,
> > > > because the
> > > > > > > > > > properties
> > > > > > > > > > > > in
> > > > > > > > > > > > > > hint
> > > > > > > > > > > > > > > > > > will be merged and used to find the table
> > > > factory which would
> > > > > > > > > > > > cause
> > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > exception when error properties provided,
> > > > right? On the other
> > > > > > > > > > > > hand,
> > > > > > > > > > > > > > unlike
> > > > > > > > > > > > > > > > > > other hints which just affect the way to
> > > > execute the query,
> > > > > > > > > > the
> > > > > > > > > > > > > > property
> > > > > > > > > > > > > > > > > > table hint actually affects the result of
> > the
> > > > query, we should
> > > > > > > > > > > > never
> > > > > > > > > > > > > > ignore
> > > > > > > > > > > > > > > > > > the given property hints.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > For the format of property hints,
> > currently,
> > > > in sql client, we
> > > > > > > > > > > > > accept
> > > > > > > > > > > > > > > > > > properties in format of string only in
> DDL:
> > > > > > > > > > > > > 'connector.type'='kafka',
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > think the format of properties in hint
> > should
> > > > be the same as
> > > > > > > > > > the
> > > > > > > > > > > > > > format we
> > > > > > > > > > > > > > > > > > defined in ddl. What do you think?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Bests,
> > > > > > > > > > > > > > > > > > Wenlong Lyu
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Tue, 10 Mar 2020 at 14:22, Danny Chan
> <
> > > > > > > > > > yuzhao....@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To Weike: About the Error Handing
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To be consistent with other SQL
> vendors,
> > > the
> > > > default is to
> > > > > > > > > > log
> > > > > > > > > > > > > > warnings
> > > > > > > > > > > > > > > > > > > and if there is any error (invalid hint
> > > name
> > > > or options), the
> > > > > > > > > > > > hint
> > > > > > > > > > > > > > is just
> > > > > > > > > > > > > > > > > > > ignored. I have already addressed in
> the
> > > > wiki.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To Timo: About the PROPERTIES Table
> Hint
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > • The properties hints is also
> optional,
> > > > user can pass in an
> > > > > > > > > > > > option
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > override the table properties but this
> > does
> > > > not mean it is
> > > > > > > > > > > > > required.
> > > > > > > > > > > > > > > > > > > • They should not include semantics:
> does
> > > > the properties
> > > > > > > > > > belong
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > semantic ? I don't think so, the plan
> > does
> > > > not change right ?
> > > > > > > > > > > The
> > > > > > > > > > > > > > result
> > > > > > > > > > > > > > > > > > > set may be affected, but there are
> > already
> > > > some hints do so,
> > > > > > > > > > > for
> > > > > > > > > > > > > > example,
> > > > > > > > > > > > > > > > > > > MS-SQL MAXRECURSION and SNAPSHOT hint
> [1]
> > > > > > > > > > > > > > > > > > > • `SELECT * FROM t(k=v, k=v)`: this
> > grammar
> > > > breaks the SQL
> > > > > > > > > > > > standard
> > > > > > > > > > > > > > > > > > > compared to the hints way(which is
> > included
> > > > in comments)
> > > > > > > > > > > > > > > > > > > • I actually didn't found any vendors
> to
> > > > support such
> > > > > > > > > > grammar,
> > > > > > > > > > > > and
> > > > > > > > > > > > > > there
> > > > > > > > > > > > > > > > > > > is no way to override table level
> > > properties
> > > > dynamically. For
> > > > > > > > > > > > > normal
> > > > > > > > > > > > > > RDBMS,
> > > > > > > > > > > > > > > > > > > I think there are no requests for such
> > > > dynamic parameters
> > > > > > > > > > > because
> > > > > > > > > > > > > > all the
> > > > > > > > > > > > > > > > > > > table have the same storage and
> > computation
> > > > and they are
> > > > > > > > > > almost
> > > > > > > > > > > > all
> > > > > > > > > > > > > > batch
> > > > > > > > > > > > > > > > > > > tables.
> > > > > > > > > > > > > > > > > > > • While Flink as a computation engine
> has
> > > > many connectors,
> > > > > > > > > > > > > > especially for
> > > > > > > > > > > > > > > > > > > some message queue like Kafka, we would
> > > have
> > > > a start_offset
> > > > > > > > > > > which
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > different each time we start the query,
> > > such
> > > > parameters can
> > > > > > > > > > not
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > persisted to catalog, because it’s not
> > > > static, this is
> > > > > > > > > > actually
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > background we propose the table hints
> to
> > > > indicate such
> > > > > > > > > > > properties
> > > > > > > > > > > > > > > > > > > dynamically.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > To Jark and Jinsong: I have removed the
> > > > query hints part and
> > > > > > > > > > > > change
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > title.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > >
> > >
> >
> https://docs.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-query?view=sql-server-ver15
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > > > > > 在 2020年3月9日 +0800 PM5:46,Timo Walther <
> > > > twal...@apache.org
> > > > > > > > > > > ,写道:
> > > > > > > > > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > thanks for the proposal. I agree with
> > > Jark
> > > > and Jingsong.
> > > > > > > > > > > Planner
> > > > > > > > > > > > > > hints
> > > > > > > > > > > > > > > > > > > > and table hints are orthogonal topics
> > > that
> > > > should be
> > > > > > > > > > discussed
> > > > > > > > > > > > > > > > > > > separately.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I share Jingsong's opinion that we
> > should
> > > > not use planner
> > > > > > > > > > > hints
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > passing connector properties. Planner
> > > > hints should be
> > > > > > > > > > optional
> > > > > > > > > > > > at
> > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > > time. They should not include
> semantics
> > > > but only affect
> > > > > > > > > > > > execution
> > > > > > > > > > > > > > time.
> > > > > > > > > > > > > > > > > > > > Connector properties are an important
> > > part
> > > > of the query
> > > > > > > > > > > itself.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Have you thought about options such
> as
> > > > `SELECT * FROM t(k=v,
> > > > > > > > > > > > > k=v)`?
> > > > > > > > > > > > > > How
> > > > > > > > > > > > > > > > > > > > are other vendors deal with this
> > problem?
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > > > Timo
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On 09.03.20 10:37, Jingsong Li wrote:
> > > > > > > > > > > > > > > > > > > > > Hi Danny, +1 for table hints,
> thanks
> > > for
> > > > driving.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > I took a look to FLIP, most of
> > content
> > > > are talking about
> > > > > > > > > > > query
> > > > > > > > > > > > > > hints.
> > > > > > > > > > > > > > > > > > > It is
> > > > > > > > > > > > > > > > > > > > > hard to discussion and voting. So
> +1
> > to
> > > > split it as Jark
> > > > > > > > > > > said.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Another thing is configuration that
> > > > suitable to config with
> > > > > > > > > > > > table
> > > > > > > > > > > > > > > > > > > hints:
> > > > > > > > > > > > > > > > > > > > > "connector.path" and
> > "connector.topic",
> > > > Are they really
> > > > > > > > > > > > suitable
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > table
> > > > > > > > > > > > > > > > > > > > > hints? Looks weird to me. Because I
> > > > think these properties
> > > > > > > > > > > are
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > core of
> > > > > > > > > > > > > > > > > > > > > table.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > Jingsong Lee
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Mon, Mar 9, 2020 at 5:30 PM Jark
> > Wu
> > > <
> > > > imj...@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks Danny for starting the
> > > > discussion.
> > > > > > > > > > > > > > > > > > > > > > +1 for this feature.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > If we just focus on the table
> hints
> > > > not the query hints in
> > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > release,
> > > > > > > > > > > > > > > > > > > > > > could you split the FLIP into two
> > > > FLIPs?
> > > > > > > > > > > > > > > > > > > > > > Because it's hard to vote on
> > partial
> > > > part of a FLIP. You
> > > > > > > > > > can
> > > > > > > > > > > > > keep
> > > > > > > > > > > > > > > > > > > the table
> > > > > > > > > > > > > > > > > > > > > > hints proposal in FLIP-113 and
> move
> > > > query hints into
> > > > > > > > > > another
> > > > > > > > > > > > > FLIP.
> > > > > > > > > > > > > > > > > > > > > > So that we can focuse on the
> table
> > > > hints in the FLIP.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > > > Jark
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > On Mon, 9 Mar 2020 at 17:14,
> DONG,
> > > > Weike <
> > > > > > > > > > > > > kyled...@connect.hku.hk
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Hi Danny,
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > This is a nice feature, +1.
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > One thing I am interested in
> but
> > > not
> > > > mentioned in the
> > > > > > > > > > > > proposal
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > > error
> > > > > > > > > > > > > > > > > > > > > > > handling, as it is quite common
> > for
> > > > users to write
> > > > > > > > > > > > > inappropriate
> > > > > > > > > > > > > > > > > > > hints in
> > > > > > > > > > > > > > > > > > > > > > > SQL code, if illegal or "bad"
> > hints
> > > > are given, would the
> > > > > > > > > > > > system
> > > > > > > > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > > > > > > > > ignore them or throw
> exceptions?
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Thanks : )
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > Weike
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > On Mon, Mar 9, 2020 at 5:02 PM
> > > Danny
> > > > Chan <
> > > > > > > > > > > > > yuzhao....@gmail.com>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Note:
> > > > > > > > > > > > > > > > > > > > > > > > we only plan to support table
> > > > hints in Flink release
> > > > > > > > > > 1.11,
> > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > > please
> > > > > > > > > > > > > > > > > > > > > > > focus
> > > > > > > > > > > > > > > > > > > > > > > > mainly on the table hints
> part
> > > and
> > > > just ignore the
> > > > > > > > > > planner
> > > > > > > > > > > > > > > > > > > hints, sorry
> > > > > > > > > > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > > > > > that mistake ~
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > > > > > > > > > > 在 2020年3月9日 +0800
> PM4:36,Danny
> > > > Chan <
> > > > > > > > > > yuzhao....@gmail.com
> > > > > > > > > > > > > > ,写道:
> > > > > > > > > > > > > > > > > > > > > > > > > Hi, fellows ~
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > I would like to propose the
> > > > supports for SQL hints for
> > > > > > > > > > > our
> > > > > > > > > > > > > > > > > > > Flink SQL.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > We would support hints
> syntax
> > > as
> > > > following:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > select /*+ NO_HASH_JOIN,
> > > > RESOURCE(mem='128mb',
> > > > > > > > > > > > > > > > > > > parallelism='24') */
> > > > > > > > > > > > > > > > > > > > > > > > > from
> > > > > > > > > > > > > > > > > > > > > > > > > emp /*+ INDEX(idx1, idx2)
> */
> > > > > > > > > > > > > > > > > > > > > > > > > join
> > > > > > > > > > > > > > > > > > > > > > > > > dept /*+
> PROPERTIES(k1='v1',
> > > > k2='v2') */
> > > > > > > > > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > > > > > emp.deptno = dept.deptno
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Basically we would support
> > both
> > > > query hints(after the
> > > > > > > > > > > > SELECT
> > > > > > > > > > > > > > > > > > > keyword)
> > > > > > > > > > > > > > > > > > > > > > > > and table hints(after the
> > > > referenced table name), for
> > > > > > > > > > > 1.11,
> > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > plan to
> > > > > > > > > > > > > > > > > > > > > > > only
> > > > > > > > > > > > > > > > > > > > > > > > support table hints with a
> hint
> > > > probably named
> > > > > > > > > > PROPERTIES:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > table_name /*+
> > > > PROPERTIES(k1='v1', k2='v2') *+/
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > I am looking forward to
> your
> > > > comments.
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > You can access the FLIP
> here:
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-113%3A+SQL+and+Planner+Hints
> > > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > > > > > > > Danny Chan
> > > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to