Hi Lincoln,

thanks for proposing the FLIP. The general idea to expose the target columns in DynamicTableSink#Context sounds good to me.

In the FLIP I found the JavaDoc a bit confusing:

```
The column list will be empty for 'insert into target select ...'.
```

This could mean both optional empty or array empty. Maybe you can rephrase that a bit in the implementation.

Otherwise +1.

Timo


On 08.03.23 14:00, Lincoln Lee wrote:
Hi Jing,
Agree with you that using formal terms can be easier to users, I've updated
the FLIP[1], since this is only one of the application scenarios for
partial insert, our java doc for the corresponding interface will describe
the partial insert message itself from a generic point of view, WDTY?

@Jacky thanks for your feedback!
here are my thoughts for the two questions:
for this scenario, I don't think the planner should report an error. We
cannot assume that such usage will necessarily result in errors or that
users are unaware of potential risks (just like in a database, similar
operations are not prompted with errors). In the streaming scenario,
regarding the risks associated with the multi-insert operation with
overlapping fields, we may consider expanding the plan advice (FLIP-280 has
just added possibilities to support this) to prompt users instead of
rejecting the operation with an error.
1. if the two insert into with same columns, the result is not
nondeterminism. will it check in planner and throw exception

yes, not all connectors support partial insert. Therefore, the introduction
of this interface is only intended as additional information for the
connectors that need it. The new `targetColumns` only provide the column
list information corresponding to the statement according to the SQL
standard, and existing connectors do not need to make any passive changes
by default.
2. some sink connectors can not supports it like queue such as kafka
compacted topic. will also it check in planner  and throw exception

welcome your feedback!


[1]
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081


Best,
Lincoln Lee


Jacky Lau <liuyong...@gmail.com> 于2023年3月8日周三 20:11写道:

Thanks for bringing this up. this is a good feature. but i have two
questions:
1. if the two insert into with same columns, the result is
not  nondeterminism. will it check in planner and throw exception
2. some sink connectors can not supports it like queue such as kafka
compacted topic. will also it check in planner  and throw exception

Lincoln Lee <lincoln.8...@gmail.com> 于2023年3月7日周二 14:53写道:

Hi Aitozi,

Thanks for your feedback!  Yes, including HBase and JDBC connector, they
can be considered for support in the next step (JDBC as as a standard
protocol supported not only in traditional databases, but also in more
and
more new types of storage). Considering the ongoing externalizing of
connectors and the release cycles of the connectors are decoupled with
the
release cycle of Flink, we can initiate corresponding support issues for
specific connectors to follow up on support after finalizing the API
changes, WDYT?

Best,
Lincoln Lee


Hang Ruan <ruanhang1...@gmail.com> 于2023年3月7日周二 12:14写道:

Hi, Lincoln,

Thanks for bringing this up. It looks good to me. I also agree with
Jingsong's suggestion.

Best,
Hang

Jingsong Li <jingsongl...@gmail.com> 于2023年3月7日周二 11:15写道:

Wow, we have 300 FLIPs...

Thanks Lincoln,

Have you considered returning an Optional<int[][]>?

Empty array looks a little weird to me.

Best,
Jingsong

On Tue, Mar 7, 2023 at 10:32 AM Aitozi <gjying1...@gmail.com> wrote:

Hi Lincoln,
     Thank you for sharing this FLIP. Overall, it looks good to me.
I
have
one question: with the introduction of this interface,
will any existing Flink connectors need to be updated in order to
take
advantage of its capabilities? For example, HBase.

yuxia <luoyu...@alumni.sjtu.edu.cn> 于2023年3月7日周二 10:01写道:

Thanks. It makes sense to me.

Best regards,
Yuxia

----- 原始邮件 -----
发件人: "Lincoln Lee" <lincoln.8...@gmail.com>
收件人: "dev" <dev@flink.apache.org>
发送时间: 星期一, 2023年 3 月 06日 下午 10:26:26
主题: Re: [DISCUSS] FLIP-300: Add targetColumns to
DynamicTableSink#Context
to solve the null overwrite problem of partial-insert

hi yuxia,

Thanks for your feedback and tracking the issue of update
statement!
I've
updated the FLIP[1] and also the poc[2].
Since the bug and flip are orthogonal, we can focus on finalizing
the
api
changes first, and then work on the flip implementation and
bugfix
separately, WDYT?

[1]




https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
[2] https://github.com/apache/flink/pull/22041

Best,
Lincoln Lee


yuxia <luoyu...@alumni.sjtu.edu.cn> 于2023年3月6日周一 21:21写道:

Hi, Lincoln.
Thanks for bringing this up. +1 for this FLIP, it's helpful for
external
storage system to implement partial update.
The FLIP looks good to me. I only want to add one comment,
update
statement also doesn't support updating nested column, I have
created
FLINK-31344[1] to track it.
Maybe we also need to explain it in this FLIP.

[1] https://issues.apache.org/jira/browse/FLINK-31344

Best regards,
Yuxia

----- 原始邮件 -----
发件人: "Lincoln Lee" <lincoln.8...@gmail.com>
收件人: "dev" <dev@flink.apache.org>
发送时间: 星期五, 2023年 3 月 03日 下午 12:22:19
主题: [DISCUSS] FLIP-300: Add targetColumns to
DynamicTableSink#Context to
solve the null overwrite problem of partial-insert

Hi everyone,

This FLIP[1] aims to support connectors in avoiding overwriting
non-target
columns with null values when processing partial column
updates,
we
propose
adding information on the target column list to
DynamicTableSink#Context.

FLINK-18726[2] supports inserting statements with specified
column
list,
it
fills null values (or potentially declared default values in
the
future)
for columns not appearing in the column list of insert
statement
to
the
target table.
But this behavior does not satisfy some partial column update
requirements
of some storage systems which allow storing null values. The
problem
is
that connectors cannot distinguish whether the null value of a
column is
really from the user's data or whether it is a null value
populated
because
of partial insert behavior.

Looking forward to your comments or feedback.

[1]





https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=240885081
[2] https://issues.apache.org/jira/browse/FLINK-18726

Best,
Lincoln Lee








Reply via email to