Hi Rex,

Currently, the join operator may use 3 kinds of state structure depending
on the input key and join key information.

1) input doesn't have a unique key => MapState<row, count>,
where the map key is the input row and the map value is the number of equal
rows.

2) input has unique key, but the unique key is not a subset of join key =>
MapState<UK, row>
this is better than the above one, because it has a shorter map key and
is more efficient when retracting records.

3) input has a unique key, and the unique key is a subset of join key =>
ValueState<row>
this is the best performance, because it only performs a "get" operation
rather than "seek" on rocksdb
 for each record of the other input side.

Note: the join key is the key of the keyed states.

You can see the implementation differences
in 
org.apache.flink.table.runtime.operators.join.stream.state.JoinRecordStateViews.

Best,
Jark

On Wed, 18 Nov 2020 at 02:30, Rex Fenley <r...@remind101.com> wrote:

> Ok, what are the performance consequences then of having a join with
> NoUniqueKey if the left side's key actually is unique in practice?
>
> Thanks!
>
>
> On Tue, Nov 17, 2020 at 7:35 AM Jark Wu <imj...@gmail.com> wrote:
>
>> Hi Rex,
>>
>> Currently, the unique key is inferred by the optimizer. However, the
>> inference is not perfect.
>> There are known issues that the unique key is not derived correctly, e.g.
>> FLINK-20036 (is this opened by you?). If you think you have the same case,
>> please open an issue.
>>
>> Query hint is a nice way for this, but it is not supported yet.
>> We have an issue to track supporting query hint, see FLINK-17173.
>>
>> Beest,
>> Jark
>>
>>
>> On Tue, 17 Nov 2020 at 15:23, Rex Fenley <r...@remind101.com> wrote:
>>
>>> Hello,
>>>
>>> I have quite a few joins in my plan that have
>>>
>>> leftInputSpec=[NoUniqueKey]
>>>
>>> in Flink UI. I know this can't truly be the case that there is no unique
>>> key, at least for some of these joins that I've evaluated.
>>>
>>> Is there a way to hint to the join what the unique key is for a table?
>>>
>>> Thanks!
>>>
>>> --
>>>
>>> Rex Fenley  |  Software Engineer - Mobile and Backend
>>>
>>>
>>> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>
>>>  |  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
>>> <https://www.facebook.com/remindhq>
>>>
>>
>
> --
>
> Rex Fenley  |  Software Engineer - Mobile and Backend
>
>
> Remind.com <https://www.remind.com/> |  BLOG <http://blog.remind.com/>  |
>  FOLLOW US <https://twitter.com/remindhq>  |  LIKE US
> <https://www.facebook.com/remindhq>
>

Reply via email to