[ 
https://issues.apache.org/jira/browse/FLINK-27625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

lincoln lee updated FLINK-27625:
--------------------------------
    Description: 
Discussion thread for the hint name:
https://lists.apache.org/thread/jm9kg33wk9z2bvo2b0g5bp3n5kfj6qv8

FLINK-27623 adds a global parameter 'table.exec.async-lookup.output-mode' for
table users so that all three control parameters related to async I/O can be
configured at the same job level.
As planned in that issue, we'd like to go a step further and offer
finer-grained control over async join operations than the job-level config
allows, by introducing a new join hint: 'ASYNC_LOOKUP'.

To keep the hint intuitive and user-friendly, we want to support both simple
and kv forms for the hint options. All options except the table name are
optional (the job-level configuration is used for any option that is not set).

# 1. simple form: (ordered hint option list)
```
ASYNC_LOOKUP('tableName'[, 'output-mode', 'buffer-capacity', 'timeout'])
optional:
output-mode
buffer-capacity
timeout
```

Note: since Calcite currently does not support mixed-type hint options,
the table name here needs to be a string instead of an identifier. (For
`SqlHint`: the option formats cannot be mixed; they should be either all
simple identifiers, all literals, or all key-value pairs.) We can improve
this once Calcite supports it.
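
The constraint in the note can be illustrated with a small check. This is only a sketch of the rule as stated, written in Python for brevity, and not Calcite's actual validation code; the `Identifier` class and the `option_kind` and `is_homogeneous` helpers are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Hypothetical stand-in for an unquoted SQL identifier such as dim1."""
    name: str


def option_kind(option):
    """Classify one hint option as 'identifier', 'kv', or 'literal'."""
    if isinstance(option, Identifier):
        return "identifier"
    if isinstance(option, tuple) and len(option) == 2:
        return "kv"
    return "literal"


def is_homogeneous(options):
    """Return True when every option in the hint has the same kind."""
    return len({option_kind(o) for o in options}) <= 1


# ASYNC_LOOKUP(dim1, 'ordered', '100', '180s'): identifier mixed with literals
mixed = [Identifier("dim1"), "ordered", "100", "180s"]
# ASYNC_LOOKUP('dim1', 'ordered', '100', '180s'): all string literals
literals = ["dim1", "ordered", "100", "180s"]
# ASYNC_LOOKUP('table'='dim1', 'timeout'='300s'): all key-value pairs
kvs = [("table", "dim1"), ("timeout", "300s")]
```

Under this rule the first list is rejected, which is why the simple form quotes the table name as a string.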

# 2. kv form: (support unordered hint option list)
```
ASYNC_LOOKUP('table'='tableName'[, 'output-mode'='ordered|allow-unordered',
'capacity'='int', 'timeout'='duration'])

optional kvs:
'output-mode'='ordered|allow-unordered'
'capacity'='int'
'timeout'='duration'
```

For example, if the job-level configuration is:
```
table.exec.async-lookup.output-mode: ORDERED
table.exec.async-lookup.buffer-capacity: 100
table.exec.async-lookup.timeout: 180s
```

then the following hints:
```
1. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '300s')
2. ASYNC_LOOKUP('dim1', 'allow-unordered', '200')
3. ASYNC_LOOKUP('table'='dim1', 'output-mode'='allow-unordered')
4. ASYNC_LOOKUP('table'='dim1', 'timeout'='300s')
5. ASYNC_LOOKUP('table'='dim1', 'capacity'='300')
```

are equivalent to:
```
1. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '300s')
2. ASYNC_LOOKUP('dim1', 'allow-unordered', '200', '180s')
3. ASYNC_LOOKUP('table'='dim1', 'output-mode'='allow-unordered',
'capacity'='100', 'timeout'='180s')
4. ASYNC_LOOKUP('table'='dim1', 'output-mode'='ordered', 'capacity'='100',
'timeout'='300s')
5. ASYNC_LOOKUP('table'='dim1', 'output-mode'='ordered', 'capacity'='300',
'timeout'='180s')
```
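
The fallback behavior in these examples can be sketched as a merge of hint options over job-level defaults. This is a minimal illustration in Python, not Flink's planner code: the function names `resolve_simple` and `resolve_kv` are hypothetical, the defaults mirror the example configuration (with the output mode written in lowercase, as in the hints), and the sketch assumes that the 'buffer-capacity' position of the simple form maps to the 'capacity' key of the kv form.

```python
# Job-level defaults, mirroring the example configuration above.
JOB_DEFAULTS = {
    "output-mode": "ordered",
    "capacity": "100",
    "timeout": "180s",
}

# Positional order of the optional options in the simple form.
# Assumption: the 'buffer-capacity' position corresponds to the kv key 'capacity'.
SIMPLE_ORDER = ("output-mode", "capacity", "timeout")


def resolve_simple(*options, defaults=JOB_DEFAULTS):
    """Resolve ASYNC_LOOKUP('tableName'[, ...]) given as positional strings."""
    if not options:
        raise ValueError("table name is required")
    if len(options) > 1 + len(SIMPLE_ORDER):
        raise ValueError("too many hint options")
    table, rest = options[0], options[1:]
    resolved = dict(defaults)
    # zip stops at the shorter sequence, so trailing options keep their defaults.
    resolved.update(zip(SIMPLE_ORDER, rest))
    resolved["table"] = table
    return resolved


def resolve_kv(options, defaults=JOB_DEFAULTS):
    """Resolve ASYNC_LOOKUP('table'='tableName'[, ...]) given as a dict."""
    if "table" not in options:
        raise ValueError("'table' is required")
    unknown = set(options) - set(defaults) - {"table"}
    if unknown:
        raise ValueError(f"unknown hint options: {sorted(unknown)}")
    # Options present in the hint override the job-level defaults.
    return {**defaults, **options}
```

For instance, `resolve_simple('dim1', 'allow-unordered', '200')` keeps the job-level timeout, and `resolve_kv({'table': 'dim1', 'capacity': '300'})` keeps the job-level output mode and timeout, matching hints 2 and 5.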

In addition, if the lookup source implements both sync and async table
functions, the planner prefers the async function when the
'ASYNC_LOOKUP' hint is specified.




  was:
Add query hint for async lookup join for join level control:
e.g., 
{code}
// ordered mode
ASYNC_LOOKUP(dim1, 'ordered', '100', '180s')
// unordered mode
ASYNC_LOOKUP(dim1, 'allow-unordered', '100', '180s')
{code}

TODO: The hint name should be discussed in ML.


> Add query hint for async lookup join
> ------------------------------------
>
>                 Key: FLINK-27625
>                 URL: https://issues.apache.org/jira/browse/FLINK-27625
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: lincoln lee
>            Assignee: lincoln lee
>            Priority: Major
>             Fix For: 1.16.0



--
This message was sent by Atlassian Jira
(v8.20.7#820007)