Gustavo de Morais created FLINK-38444:
-----------------------------------------
Summary: Join calculates upsert keys with null keys for left join
Key: FLINK-38444
URL: https://issues.apache.org/jira/browse/FLINK-38444
Project: Flink
Issue Type: Bug
Components: Table SQL / Planner
Reporter: Gustavo de Morais
Fix For: 2.2.0
The planner currently considers a union of both the upsert keys from the left
and from the right to be a valid resulting upsert key. That's true for inner
joins but for left joins that leads to a resulting upsert key that contains
columns that can be null, which is not valid.
Example and simplified repro steps:
{code:java}
-- Table 'orders' with an incomplete primary key
CREATE TABLE `orders_with_composite_key` (
`order_id` BIGINT NOT NULL,
`user_id` INT NOT NULL,
`item_name` STRING,
CONSTRAINT `PRIMARY` PRIMARY KEY (`order_id`, `user_id`) NOT ENFORCED
);
-- Table 'users'
CREATE TABLE `users` (
`user_id` INT NOT NULL,
`other_data` STRING,
CONSTRAINT `PRIMARY` PRIMARY KEY (`user_id`) NOT ENFORCED
);
SELECT
o.user_id,
o.order_id,
u.user_id
FROM `users` AS u
LEFT JOIN `orders_with_composite_key` AS o
ON o.user_id = u.user_id
-- Bug: this incorrectly infers the following upsert join from the join
(user_id,order_id,user_id0)
{code}
This is not valid since the columns on the right can be null and can't be used
as part of an upsert key.
This is where we calculate the unique keys and where we need to consider the
nullability of the fields as we do for the other cases below
[https://github.com/apache/flink/blob/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/planner/plan/metadata/FlinkRelMdUniqueKeys.scala#L593]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)