[ 
https://issues.apache.org/jira/browse/FLINK-17178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17087697#comment-17087697
 ] 

Lijie Wang commented on FLINK-17178:
------------------------------------


Regarding the "ALL" cache, I think there is a real need for it. In some cases
the dimension table changes slowly, so I think users are willing to tolerate
some lateness in exchange for better performance.

But I'm not sure which implementation is better: providing an "ALL" cache
strategy in LookupFunction, or "scan the data into state and then do the
join", as mentioned above.
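
To make the first option more concrete, here is a minimal sketch of what an
"ALL" cache could look like. It is only an illustration under assumptions:
AllCache, its loader argument and reloadIntervalMs are hypothetical names, not
existing Flink classes or options, and it simplifies a lookup to one value per
key. The whole table is loaded once in open(), lookups read an in-memory
snapshot without IO, and a background thread swaps in a fresh snapshot
periodically.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

// Hypothetical helper, not part of Flink: keeps a full snapshot of a small
// dimension table in memory and replaces it on a fixed schedule.
final class AllCache<K, V> implements AutoCloseable {
    // Assumed to scan the entire dimension table and return it as a map.
    private final Supplier<Map<K, V>> loader;
    private final long reloadIntervalMs;
    private final AtomicReference<Map<K, V>> snapshot =
            new AtomicReference<>(Collections.<K, V>emptyMap());
    private ScheduledExecutorService scheduler;

    AllCache(Supplier<Map<K, V>> loader, long reloadIntervalMs) {
        this.loader = loader;
        this.reloadIntervalMs = reloadIntervalMs;
    }

    // Blocking initial load, then periodic reloads on a daemon thread.
    void open() {
        reload();
        scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "all-cache-reload");
            t.setDaemon(true);
            return t;
        });
        scheduler.scheduleWithFixedDelay(
                this::reloadQuietly, reloadIntervalMs, reloadIntervalMs, TimeUnit.MILLISECONDS);
    }

    // Builds a new snapshot off to the side, then swaps it in atomically,
    // so concurrent lookups never see a half-loaded table.
    void reload() {
        Map<K, V> fresh = new HashMap<>(loader.get());
        snapshot.set(Collections.unmodifiableMap(fresh));
    }

    // A failed reload keeps the previous snapshot; swallowing the exception
    // also prevents the scheduler from cancelling later reloads.
    private void reloadQuietly() {
        try {
            reload();
        } catch (RuntimeException e) {
            // Keep serving the old snapshot.
        }
    }

    // Pure in-memory lookup; returns null when the key is absent.
    V lookup(K key) {
        return snapshot.get().get(key);
    }

    @Override
    public void close() {
        if (scheduler != null) {
            scheduler.shutdownNow();
        }
    }
}
```

With this shape, the lateness a user has to tolerate is bounded by roughly the
reload interval plus the time one full scan takes, and during a reload the old
and new snapshots are both held in memory, which is another reason it only
fits small dimension tables.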

>  Provide "ALL" cache strategy in LookupFunction
> -----------------------------------------------
>
>                 Key: FLINK-17178
>                 URL: https://issues.apache.org/jira/browse/FLINK-17178
>             Project: Flink
>          Issue Type: New Feature
>          Components: Connectors / Common
>            Reporter: Lijie Wang
>            Priority: Major
>
> We propose the "ALL" cache strategy mentioned in FLINK-13252. The motivation
> is as follows:
> Maintain the entire dimension table in memory to improve performance. There
> is no IO overhead when we look up the cached table. The cache needs to be
> reloaded periodically to pick up updates, and we can reload asynchronously
> with little IO delay.
> Limitations:
> 1.  It suits scenarios where users don't care much about lateness and
> periodic updates are enough for them.
> 2.  The "ALL" cache needs more memory, so it suits small dimension
> tables.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
