Hi everyone ~
Thanks for the feedback, i would start a new vote again if there are no new
input objections after 24 hours ~
Best,
Danny
Jark Wu 于2020年4月8日周三 下午7:19写道:
> `table.dynamic-table-options.enabled` and `TableConfigOptions` sounds good
> to me.
>
> Best,
> Jark
>
> On Wed, 8 Apr 2020 a
`table.dynamic-table-options.enabled` and `TableConfigOptions` sounds good
to me.
Best,
Jark
On Wed, 8 Apr 2020 at 18:59, Danny Chan wrote:
> `table.dynamic-table-options.enabled` seems fine to me, I would make a new
> `TableConfigOptions` class and put the config option there ~
>
> What do you
`table.dynamic-table-options.enabled` seems fine to me, I would make a new
`TableConfigOptions` class and put the config option there ~
What do you think about the new class to put ?
Best,
Danny Chan
在 2020年4月8日 +0800 PM5:33,dev@flink.apache.org,写道:
>
> `table.dynamic-table-options.enabled`
On 08.04.20 04:27, Jark Wu wrote:
I have a minor concern about the global configuration
`table.optimizer.dynamic-table-options.enabled`, does it belong to
optimizer?
From my point of view, it is just an API to set table options and uses
Calcite in the implementation.
I'm also thinking about wha
+1, LGTM
Best,
Kurt
On Wed, Apr 8, 2020 at 10:27 AM Jark Wu wrote:
> Thanks for the summary Danny. +1 to the new proposal.
>
> I have a minor concern about the global configuration
> `table.optimizer.dynamic-table-options.enabled`, does it belong to
> optimizer?
> From my point of view, it is
Thanks for the summary Danny. +1 to the new proposal.
I have a minor concern about the global configuration
`table.optimizer.dynamic-table-options.enabled`, does it belong to
optimizer?
>From my point of view, it is just an API to set table options and uses
Calcite in the implementation.
I'm also
Thanks for the update Danny. +1 from my side.
Regards,
Timo
On 07.04.20 13:25, Danny Chan wrote:
Hi, every ~
It seems that we all agree to drop the idea for white/black list for each
connector, and have a global config option to default disable this feature.
I have also discussed with Timo a
Hi, every ~
It seems that we all agree to drop the idea for white/black list for each
connector, and have a global config option to default disable this feature.
I have also discussed with Timo and Jark about the interface
TableSourceTable.Context.getExecutionOptions and finally we decide to
intr
I'm fine to disable this feature by default and avoid
whitelisting/blacklisting. This simplifies a lot of things.
Regarding to TableSourceFactory#Context#getExecutionOptions, do we really
need this interface?
Should the connector factory be aware of the properties is merged with
hints or not?
What
Sounds like a reasonable compromise, disabling this feature by default is a
way to protect
the vulnerability, and we can simplify the design quite a lot. We can
gather some users'
feedback to see whether further protections are necessary in the future.
Best,
Kurt
On Mon, Apr 6, 2020 at 11:49 PM
I agree with Aljoscha. The length of this thread shows that this is
highly controversal. I think nobody really likes this feature 100% but
we could not find a better solution. I would consider it as a
nice-to-have improvement during a notebook/debugging session.
I would accept avoiding whiteli
The reason I'm saying it should be disabled by default is that this uses
hint syntax, and hints should really not change query semantics.
I'm quite strongly against hints that change query semantics, but if we
disable this by default I would be (reluctantly) OK with the feature.
Companies that
Hi, everyone ~
@Aljoscha @Timo
> I think we're designing ourselves into ever more complicated corners
here
I kindly agree that, personally didn't see strong reasons why we should limit
on each connector properties:
• we can define any table options for CREATE TABLE, why we treat the dynamic
o
Hi everyone,
@Aljoscha: I disagree with your approach because a `Catalog` can return
a custom factory that is not using any properties. The hinting must be
transparent to a factory. We should NOT modify the metadata
`CatalogTable` at any point in time after the catalog.
@Danny, @Jingsong: Ho
I think we're designing ourselves into ever more complicated corners
here. Maybe we need to take a step back and reconsider. What would you
think about this (somewhat) simpler proposal:
- introduce a hint called CONNECTOR_OPTIONS(k=v,...). or
CONNECTOR_PROPERTIES, depending on what naming we w
Hi Dawid,
> When a factory is instantiated it has access to the CatalogTable,
therefore it has access to all the original properties. In turn it knows
the original format and can call FormatFactory#supportedHintOptions().
Factory can only get CatalogTable when creating source or sink,
right? IIUC
Hi, Dawid, thanks so much for the detail suggestions ~
1. Regarding the motivation:
I agree it's not a good suggested way based on the fact that we have better
solution, but i think we can support override that as long as it exists as one
of the the table options. I would remove if from the mot
Hi,
Few comments from my side:
1. Regarding the motivation:
I think the example with changing the update-mode is not a good one. In
the long term this should be done with EMIT CHANGELOG (discussed in
FLIP-105).
Nitpicking: I would mention that it is rather for debugging/ad-hoc
solution. I think
Sorry, i meant white-list ~
Danny Chan 于2020年3月27日周五 下午12:40写道:
> Thanks everyone for the feedback ~
>
> - For the global config option belongs to `ExecutionConfigOptions` or
> `OptimizerConfigOptions`, i have to strong objections, switch
> to `OptimizerConfigOptions` is okey to me and i have up
Thanks everyone for the feedback ~
- For the global config option belongs to `ExecutionConfigOptions` or
`OptimizerConfigOptions`, i have to strong objections, switch
to `OptimizerConfigOptions` is okey to me and i have updated the WIKI
- For use while-list or black-list, i have opinion with Timo,
Hi everyone,
it is not only about security concerns. Hint options should be
well-defined. We had a couple of people that were concerned about
changing the semantics with a concept that is called "hint". These
options are more like "debugging options" while someone is developing a
connector or
Thanks Danny for update.
+1 to "dynamic-table-options.enabled" should belong to
`*OptimizerConfigOptions*`.
Hint looks like optimizer in my opinion. Actually optimizer affect
execution, but it is optimizer, not directly related to execution.
+1 to `unsupportedHintOptions`, we already list all opt
Thanks Jark for the feedback ~
I actually have a discussion offline with Timo and we think the black-list
options has implicit rick with the growing new table options, a black-list
there means all the new introduced options are default to be configurable
dynamically, if the user forget to add it i
Thanks Kurt for the suggestion ~
In my opinion:
- There is no need for TableFormatFactory#supportedHintOptions because all
the format options can be configured dynamically, they have no security
issues
- Dynamic table options is not an optimization, it is more like an
execution behavior from my si
Hi Danny,
Regarding to `supportedHintOptions()` interface, I suggest to use the
inverted version, `unsupportedHintOptions()`.
Because I think the disallowed list is much smaller.
In addition, it's hard to list all the properties under
`connector.properties.*`.
But we know `connector.properties.boo
Hi Danny,
Thanks for the updates. I have 2 comments regarding to latest document:
1) I think we also need `*supportedHintOptions*` for `*TableFormatFactory*`
2) IMO "dynamic-table-options.enabled" should belong to `
*OptimizerConfigOptions*`
Best,
Kurt
On Thu, Mar 26, 2020 at 4:40 PM Timo Wal
Thanks for the update Danny. +1 for this proposal.
Regards,
Timo
On 26.03.20 04:51, Danny Chan wrote:
Thanks everyone who engaged in this discussion ~
Our goal is "Supports Dynamic Table Options for Flink SQL". After an
offline discussion with Kurt, Timo and Dawid, we have made the final
concl
Thanks everyone who engaged in this discussion ~
Our goal is "Supports Dynamic Table Options for Flink SQL". After an
offline discussion with Kurt, Timo and Dawid, we have made the final
conclusion, here is the summary:
- Use comment style syntax to specify the dynamic table options: "/*+
Hi everyone,
Sorry, but I'm not sure about the `supportedHintOptions`. I'm afraid it
doesn't solve the problems but increases some development and learning
burdens.
# increase development and learning burden
According to the discussion so far, we want to support overriding a subset
of options in
My POC is here for the hints options merge [1].
Personally, I have no strong objections for splitting hints with the
CatalogTable, the only cons is a more complex implementation but the concept is
more clear, and I have updated the WIKI.
I think it would be nice if we can support the format “ig
Sorry maybe I didn't make myself clear. I think some format property is
very suitable to
be hinted, like "ignore errors during parsing". Maybe we should have a
dedicated
Hintable interface, and have `supportedHintOptions` method inside. All
factories supports
hint could implement from it.
Best,
K
Hi everyone,
+1 to Kurt's suggestion. Let's just have it in source and sink factories
for now. We can still move this method up in the future. Currently, I
don't see a need for catalogs or formats. Because how would you target a
format in the query?
@Danny: Can you send a link to your PoC? I
Hi,
I am thinking we can provide hints to *table* related instances.
- TableFormatFactory: of cause we need hints support, there are many format
options in DDL too.
- catalog and module: I don't know, maybe in future we can provide some
hints for them.
Best,
Jingsong Lee
On Wed, Mar 18, 2020 at
Yes, I think we should move the `supportedHintOptions` from TableFactory to
TableSourceFactory, and we also need to add the interface to TableSinkFactory
though because sink target table may also have hints attached.
Best,
Danny Chan
在 2020年3月18日 +0800 AM11:08,Kurt Young ,写道:
> Have one question
Have one question for adding `supportedHintOptions` method to
`TableFactory`. It seems
`TableFactory` is a base factory interface for all *table module* related
instances, such as
catalog, module, format and so on. It's not created only for *table*. Is it
possible to move it
to `TableSourceFactory`
Thanks Timo ~
For the naming itself, I also think the PROPERTIES is not that concise, so +1
for OPTIONS (I had thought about that, but there are many codes in current
Flink called it properties, i.e. the DescriptorProperties,
#getSupportedProperties), let’s use OPTIONS if this is our new prefer
Hi Danny,
thanks for updating the FLIP. I think your current design is sufficient
to separate hints from result-related properties.
One remark to the naming itself: I would vote for calling the hints
around table scan `OPTIONS('k'='v')`. We used the term "properties" in
the past but since we
@Danny sounds good.
Maybe it is worth listing all the classes of problems that you want to
address and then look at each class and see if hints are a good default
solution or a good optional way of simplifying things?
The discussion has grown a lot and it is starting to be hard to distinguish
the
Thanks Stephan ~
We can remove the support for properties that may change the semantics of
query if you think that is a trouble.
How about we support the /*+ properties() */ hint only for those optimize
parameters, such as the fetch size of source or something like that, does
that make sense?
St
I think Bowen has actually put it very well.
(1) Hints that change semantics looks like trouble waiting to happen. For
example Kafka offset handling should be in filters. The Kafka source should
support predicate pushdown.
(2) Hints should not be a workaround for current shortcomings. A lot of th
It seems this FLIP's name is somewhat misleading. From my understanding,
this FLIP is trying to
address the dynamic parameter issue, and table hints is the way we wan to
choose. I think we should
be focus on "what's the right way to solve dynamic property" instead of
discussing "whether table
hints
Thanks Aljoscha ~
I agree for most of the query hints, they are optional as an optimizer
instruction, especially for the traditional RDBMS.
But, just like BenChao said, Flink as a computation engine has many different
kind of data sources, thus, dynamic parameters like start_offest can only bin
Hi all,
Thanks Danny for bring up this great discussion, generally hints is a great
feature for SQL.
And the discussions are very insightful, I'd like to share some ideas here.
About error handling,
+1 to throw exception by default. IMHO, hints are parts of the query, which
should be validated be
A quick summary that focus of the discussion now shifts to be whether
semantic params like kafka 'starting offset' should be table
hints/properties, and if so, in what form.
I strongly believe the action of setting offset should *not* be part of a
table, neither hints nor properties, for all the g
Hi,
I don't understand this discussion. Hints, as I understand them, should
work like this:
- hints are *optional* advice for the optimizer to try and help it to
find a good execution strategy
- hints should not change query semantics, i.e. they should not change
connector properties executi
Thanks Timo for summarize the 3 options ~
I agree with Kurt that option2 is too complicated to use because:
• As a Kafka topic consumer, the user must define both the virtual column for
start offset and he must apply a special filter predicate after each query
• And for the internal implementati
Hi Timo,
option 1 & option 3 are almost the same in my opinion. Even though we call
it table hints, the biggest
motivation for now is to modify table properties. I also see other vendor
using syntax like option 1 to
implement table hints, e.g. sql server [1]. It uses syntax like SELECT *
from T WI
Hi Danny,
it is true that our DDL is not standard compliant by using the WITH
clause. Nevertheless, we aim for not diverging too much and the LIKE
clause is an example of that. It will solve things like overwriting
WATERMARKs, add additional/modifying properties and inherit schema.
Bowen is
Thanks Bowen ~
I agree we should somehow categorize our connector parameters.
For type1, I’m already preparing a solution like the Confluent schema registry
+ Avro schema inference thing, so this may not be a problem in the near future.
For type3, I have some questions:
> "SELECT * FROM mykafk
Thanks Danny for kicking off the effort
The root cause of too much manual work is Flink DDL has mixed 3 types of
params together and doesn't handle each of them very well. Below are how I
categorize them and corresponding solutions in my mind:
- type 1: Metadata of external data, like external en
If a specific connector want to have such parameter and read if out of
configuration, then that's fine.
If we are talking about a configuration for all kinds of sources, I would
be super careful about that.
It's true it can solve maybe 80% cases, but it will also make the left 20%
feels weird.
Bes
Hi Kurt,
#3 Regarding to global offset:
I'm not saying to use the global configuration to override connector
properties by the planner.
But the connector should take this configuration and translate into their
client API.
AFAIK, almost all the message queues support eariliest and latest and a
time
Good to have such lovely discussions. I also want to share some of my
opinions.
#1 Regarding to error handling: I also think ignore invalid hints would be
dangerous, maybe
the simplest solution is just throw an exception.
#2 Regarding to property replacement: I don't think we should constraint
ou
Thanks Timo and Fabian ~
compared to the hints, FLIP-110 is fully compliant to the SQL standard.
I don't think so, here is the syntax of CREATE TABLE in SQL-2011 IWD
9075-2:201?(E) 11.3 :
::=
CREATE [ ] TABLE
[ WITH ]
[ ON COMMIT ROWS ]
::=
|
|
::=
[ { }.
Hi everyone,
I want to jump in the discussion about the "dynamic start offset" problem.
First of all, I share the same concern with Timo and Fabian, that the
"start offset" affects the query semantics, i.e. the query result.
But "hints" is just used for optimization which should affect the result?
Hi Danny,
compared to the hints, FLIP-110 is fully compliant to the SQL standard.
I don't think that `CREATE TEMPORARY TABLE Temp (LIKE t) WITH (k=v)` is
too verbose or awkward for the power of basically changing the entire
connector. Usually, this statement would just precede the query in a
Hi,
RE error handling:
I don't think it is a good idea to simply log invalid hints. This makes it
very hard for systems that integrate Flink to consume these errors.
I'm not saying we should throw an exception. We could also execute a query
with invalid hints but should have a mechanism to provide
Thanks Timo ~
Personally I would say that offset > 0 and start offset = 10 does not have the
same semantic, so from the SQL aspect, we can not implement a “starting offset”
hint for query with such a syntax.
And the CREATE TABLE LIKE syntax is a DDL which is just verbose for defining
such dyna
Hi Danny,
shouldn't FLIP-110[1] solve most of the problems we have around defining
table properties more dynamically without manual schema work? Also
offset definition is easier with such a syntax. They must not be defined
in catalog but could be temporary tables that extend from the original
Thanks Wenlong ~
For PROPERTIES Hint Error handling
Actually we have no way to figure out whether a error prone hint is a
PROPERTIES hint, for example, if use writes a hint like ‘PROPERTIAS’, we do not
know if this hint is a PROPERTIES hint, what we know is that the hint name was
not registere
Hi Danny, thanks for the proposal. +1 for adding table hints, it is really
a necessary feature for flink sql to integrate with a catalog.
For error handling, I think it would be more natural to throw an
exception when error table hint provided, because the properties in hint
will be merged and use
To Weike: About the Error Handing
To be consistent with other SQL vendors, the default is to log warnings and if
there is any error (invalid hint name or options), the hint is just ignored. I
have already addressed in the wiki.
To Timo: About the PROPERTIES Table Hint
• The properties hints is
Hi Danny,
thanks for the proposal. I agree with Jark and Jingsong. Planner hints
and table hints are orthogonal topics that should be discussed separately.
I share Jingsong's opinion that we should not use planner hints for
passing connector properties. Planner hints should be optional at any
Hi Danny, +1 for table hints, thanks for driving.
I took a look to FLIP, most of content are talking about query hints. It is
hard to discussion and voting. So +1 to split it as Jark said.
Another thing is configuration that suitable to config with table hints:
"connector.path" and "connector.top
Thanks Danny for starting the discussion.
+1 for this feature.
If we just focus on the table hints not the query hints in this release,
could you split the FLIP into two FLIPs?
Because it's hard to vote on partial part of a FLIP. You can keep the table
hints proposal in FLIP-113 and move query hin
Hi Danny,
This is a nice feature, +1.
One thing I am interested in but not mentioned in the proposal is the error
handling, as it is quite common for users to write inappropriate hints in
SQL code, if illegal or "bad" hints are given, would the system simply
ignore them or throw exceptions?
Than
Note:
we only plan to support table hints in Flink release 1.11, so please focus
mainly on the table hints part and just ignore the planner hints, sorry for
that mistake ~
Best,
Danny Chan
在 2020年3月9日 +0800 PM4:36,Danny Chan ,写道:
> Hi, fellows ~
>
> I would like to propose the supports for SQL h
Hi, fellows ~
I would like to propose the supports for SQL hints for our Flink SQL.
We would support hints syntax as following:
select /*+ NO_HASH_JOIN, RESOURCE(mem='128mb', parallelism='24') */
from
emp /*+ INDEX(idx1, idx2) */
join
dept /*+ PROPERTIES(k1='v1', k2='v2') */
on
emp.deptno = dept
68 matches
Mail list logo