Re: [Flink Join memory issue]

admin Mon, 13 Jul 2020 19:55:35 -0700

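A regular join caches all the data from both streams, whereas an interval join only stores the data that falls within a bounded time range, so by comparison it naturally saves a great deal of state storage.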
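
A rough way to picture the difference is the toy model below. It is plain Java, not Flink's actual implementation: the class and method names are made up for illustration, it looks at only one side of the join, and it assumes rows arrive in timestamp order (in Flink, cleanup is driven by watermarks rather than by the next row).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch: compares how many rows each join style has to buffer.
public class JoinStateSketch {

    // Regular join: a matching row on the other side may arrive at any later
    // time, so every row seen so far has to stay in state.
    static int regularJoinBufferedRows(long[] timestamps) {
        return timestamps.length;
    }

    // Interval join: a row older than (current time - bound) can no longer
    // match anything, so it is dropped. Returns the peak buffer size.
    static int intervalJoinPeakBufferedRows(long[] timestamps, long boundMillis) {
        Deque<Long> buffer = new ArrayDeque<>();
        int peak = 0;
        for (long ts : timestamps) {
            // Evict rows that have fallen out of the time bound.
            while (!buffer.isEmpty() && buffer.peekFirst() < ts - boundMillis) {
                buffer.pollFirst();
            }
            buffer.addLast(ts);
            peak = Math.max(peak, buffer.size());
        }
        return peak;
    }

    public static void main(String[] args) {
        // One row per second for 10,000 seconds (made-up input).
        long[] timestamps = new long[10_000];
        for (int i = 0; i < timestamps.length; i++) {
            timestamps[i] = i * 1_000L;
        }
        System.out.println("regular join buffered rows:  "
                + regularJoinBufferedRows(timestamps));
        System.out.println("interval join peak buffered: "
                + intervalJoinPeakBufferedRows(timestamps, 60_000L));
    }
}
```

With a one-minute bound the buffer stays at a size set by the bound and the input rate, while the regular-join count keeps growing with the total input.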


> On Jul 13, 2020, at 10:30 PM, 忝忝向仧 <[email protected]> wrote:
> 
> Hi:
> 
> 
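> Can an interval join alleviate the problem of too many keys?
> Doesn't an interval join also just compute the join over some time range? Compared with a regular join, how does it avoid the problem of one stream having too many keys?
> Thanks.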
> 
> 
> 
> 
> ------------------ Original message ------------------
> From: "user-zh" <[email protected]>;
> Sent: Monday, July 6, 2020, 11:12 AM
> To: "user-zh" <[email protected]>;
> 
> Subject: Re: [Flink Join memory issue]
> 
> 
> 
> A regular join does indeed work that way, so for large volumes you can use an interval join or a temporal join.
> 
> > On Jul 5, 2020, at 3:50 PM, 忝忝向仧 <[email protected]> wrote:
> > 
> > Hi, all:
> > 
> > I saw this note in the JoinedStreams source code:
> > In other words, the join is always computed in memory, so if one stream has too many keys it will cause an OOM.
> > What precautions can be taken against that?
> > Should the side with many keys be scattered?
> > 
> > 
> > Right now, the join is being evaluated in memory so you need to ensure that the number
> > of elements per key does not get too high. Otherwise the JVM might crash.
