Hi,

Yes, using a TwoInputStreamOperator (or even better, a CoProcessFunction, 
because TwoInputStreamOperator is a low-level interface that might change in 
the future) is the recommended way for implementing a stream-stream join, 
currently.

As you already guessed, you need a policy for cleanup up the state that you 
hold. You can do this using the timer features of CoProcessFunction.

Also, if you keep your buffered elements using the Flink state interfaces you 
can switch the state backend to the RocksDB backend and if you have concerns 
about the state growing too big.

Best,
Aljoscha

> On 9. Oct 2017, at 23:11, Yan Zhou [FDS Science] ­ <yz...@coupang.com> wrote:
> 
> It seems like flink only supports DataStream joining within same time window. 
> Why is it restricted in this way? 
> 
> I think I can implement a TwoInputStreamOperator to join two DataStreams 
> without considering the window.  And inside the operator, create two state to 
> cache records of two streams and join the streams within methods 
> processElement1/processElement2. Should I go head with this approach? Is 
> there any performance consideration here? If the concern is that the cache 
> might take a lot of memory, we can introduce some cache policy and reduce the 
> size. Or can we use rocksDB state?
> 
> Please advise.
> 
> Best
> Yan
>  

Reply via email to