swuferhong commented on code in PR #20513: URL: https://github.com/apache/flink/pull/20513#discussion_r948852444
########## docs/content/docs/dev/table/sql/queries/hints.md: ########## @@ -84,4 +84,205 @@ insert into kafka_table1 /*+ OPTIONS('sink.partitioner'='round-robin') */ select ``` +## Query Hints + +`Query Hints` specify that the indicated hints are used in the scope of a query. They affect all operators in the statement. +Now, Flink `Query Hints` support `Join Hints`. + +### Syntax +The `Query Hints` syntax in Flink: +```sql +# Query Hints: +SELECT /*+ hint [, hint ] */ ... + +hint: + hintName + | hintName '(' optionKey '=' optionVal [, optionKey '=' optionVal ]* ')' + | hintName '(' hintOption [, hintOption ]* ')' + +optionKey: + simpleIdentifier + | stringLiteral + +optionVal: + stringLiteral + +hintOption: + simpleIdentifier + | numericLiteral + | stringLiteral +``` + +### Join Hints + +Join hints allow users to suggest the join strategy to optimizer in order to get a more optimal execution plan. +Now Flink `Join Hints` support `BROADCAST`, `SHUFFLE_HASH`, `SHUFFLE_MERGE` and `NEST_LOOP` Join Hints. Also, for same cases, +`Join Hints` will conflict, we try to formulate the behavior of Join Hints for these cases. + +#### BROADCAST Join Hints Review Comment: > `BROADCAST` or `BROADCAST Join Hint` Done! ########## docs/content/docs/dev/table/sql/queries/hints.md: ########## @@ -84,4 +84,205 @@ insert into kafka_table1 /*+ OPTIONS('sink.partitioner'='round-robin') */ select ``` +## Query Hints + +`Query Hints` specify that the indicated hints are used in the scope of a query. They affect all operators in the statement. +Now, Flink `Query Hints` support `Join Hints`. + +### Syntax +The `Query Hints` syntax in Flink: +```sql +# Query Hints: +SELECT /*+ hint [, hint ] */ ... + +hint: + hintName + | hintName '(' optionKey '=' optionVal [, optionKey '=' optionVal ]* ')' + | hintName '(' hintOption [, hintOption ]* ')' + +optionKey: + simpleIdentifier + | stringLiteral + +optionVal: + stringLiteral + +hintOption: + simpleIdentifier + | numericLiteral + | stringLiteral +``` + +### Join Hints + +Join hints allow users to suggest the join strategy to optimizer in order to get a more optimal execution plan. +Now Flink `Join Hints` support `BROADCAST`, `SHUFFLE_HASH`, `SHUFFLE_MERGE` and `NEST_LOOP` Join Hints. Also, for same cases, +`Join Hints` will conflict, we try to formulate the behavior of Join Hints for these cases. + +#### BROADCAST Join Hints + +{{< label Batch >}} + +For this type of `Join Hints`, suggests that join use `BroadCast join` strategy. the join side with the hint will be broadcast to +each downstream operator, and join side without hint will be sent directly to the downstream operator with Forward. + +##### Examples + +*NOTE:* `BROADCAST Join Hints` only supports join with equivalence join condition. And it doesn't support `Full Outer Join`. +```sql +CREATE TABLE t1 (id BIGINT, name STRING, age INT) WITH (...); +CREATE TABLE t2 (id BIGINT, name STRING, age INT) WITH (...); +CREATE TABLE t3 (id BIGINT, name STRING, age INT) WITH (...); + +-- Select t1 as join side with this hint, t1 will be the broadcast table. +SELECT /*+ BROADCAST(t1) */ * FROM t1 JOIN t2 ON t1.id = t2.id; + +-- Select t1 and t3 as join side with this hint. +-- For this case, when join between t1 and t2, t1 will be the broadcast table. +-- And when join between the result after t1 joins t2 and t3, t3 will be the broadcast table. +SELECT /*+ BROADCAST(t1, t3) */ * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON t1.id = t3.id; + +-- BROADCAST Join Hints don't support non-equivalent join conditions. +-- For this case, default join strategy will be adopted instead of BroadCast join. +SELECT /*+ BROADCAST(t1) */ * FROM t1 join t2 ON t1.id > t2.id; + +-- BROADCAST Join Hints strategy don't support full outer join. +-- For this case, default join strategy will be adopted instead of BroadCast join. +SELECT /*+ BROADCAST(t1) */ * FROM t1 FULL OUTER JOIN t2 ON t1.id = t2.id; +``` + +#### SHUFFLE_HASH Join Hints Review Comment: > ditto Done! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@flink.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org