Re: Event time join

2018-03-13 Thread Fabian Hueske
Hi, A Flink application does not have a problem if it ingests two streams with very different throughput as long as they are somewhat synced on their event-time. This is typically the case when ingesting real-time data. In such scenarios, an application would not buffer more data than necessary.

Re: Event time join

2018-03-09 Thread Gytis Žilinskas
Thanks to both of you for the answers and discussion. The FLIP mentions that cases where one stream is much faster than the other will not be handled for now either, so I guess it would still not solve our problems. As for the join semantics itself, I think we achieve the same thing with CoP

Re: Event time join

2018-03-08 Thread Vishal Santoshi
Yep. I think this leads to a more general question that may not be pertinent to https://github.com/apache/flink/pull/5342: how do we throttle a source if the held-back data gets unreasonably large? I know that this is in itself a broader question, but delayed watermarks of a slow stream accentuates th

Re: Event time join

2018-03-08 Thread Fabian Hueske
The join would not cause backpressure but rather put all events that cannot be processed yet into state to process them later. So this works well if the data that is provided by the streams is roughly aligned by event time.
2018-03-08 9:04 GMT-08:00 Vishal Santoshi :
> Aah we have it here https:/
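The buffering behaviour Fabian describes can be illustrated with a toy simulation. This is only a sketch in plain Python, not Flink code: the `BufferingJoin` class, its method names, and the `run` helper are all hypothetical. It assumes an equi-join on key and timestamp, and that the operator's event-time clock is the minimum of its two input watermarks.

```python
class BufferingJoin:
    """Toy two-input event-time join (hypothetical, not a Flink API)."""

    def __init__(self):
        self.buffers = {"left": [], "right": []}
        self.watermarks = {"left": float("-inf"), "right": float("-inf")}

    def on_event(self, side, key, ts, value):
        # Nothing is rejected or blocked: the event simply goes into state.
        self.buffers[side].append((ts, key, value))

    def on_watermark(self, side, ts):
        self.watermarks[side] = max(self.watermarks[side], ts)
        # The operator can only advance as far as its slowest input.
        combined = min(self.watermarks.values())
        return self._emit_up_to(combined)

    def _emit_up_to(self, wm):
        ready_left = [e for e in self.buffers["left"] if e[0] <= wm]
        ready_right = [e for e in self.buffers["right"] if e[0] <= wm]
        # Everything newer than the combined watermark stays buffered.
        self.buffers["left"] = [e for e in self.buffers["left"] if e[0] > wm]
        self.buffers["right"] = [e for e in self.buffers["right"] if e[0] > wm]
        return [
            (lk, lts, lv, rv)
            for (lts, lk, lv) in ready_left
            for (rts, rk, rv) in ready_right
            if lk == rk and lts == rts
        ]

    def buffered(self):
        return sum(len(b) for b in self.buffers.values())


def run(left_upto, right_upto):
    """Feed timestamps 1..N per side and return (results, events still held)."""
    join = BufferingJoin()
    out = []
    for ts in range(1, left_upto + 1):
        join.on_event("left", "k", ts, f"L{ts}")
    out += join.on_watermark("left", left_upto)
    for ts in range(1, right_upto + 1):
        join.on_event("right", "k", ts, f"R{ts}")
    out += join.on_watermark("right", right_upto)
    return out, join.buffered()
```

Calling `run(100, 100)` models two streams that are aligned on event time and leaves nothing in state, while `run(100, 2)` models a left stream that is far ahead of the right one and leaves the unmatched left events buffered. In this model the state grows with the event-time skew between the inputs rather than with their throughput.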

Re: Event time join

2018-03-08 Thread Vishal Santoshi
Aah we have it here https://docs.google.com/document/d/16GMH5VM6JJiWj_N0W8y3PtQ1aoJFxsKoOTSYOfqlsRE/edit#heading=h.bgl260hr56g6
On Thu, Mar 8, 2018 at 11:45 AM, Vishal Santoshi wrote:
> This is very interesting. I would imagine that there will be high back pressure on the LEFT source effectiv

Re: Event time join

2018-03-08 Thread Vishal Santoshi
This is very interesting. I would imagine that there will be high back pressure on the LEFT source, effectively throttling it, but in the current state that is likely to affect other pipelines as the free o/p buffer on the source side and i/p buffers on the consumer side start blocking and get e

Re: Event time join

2018-03-08 Thread Fabian Hueske
Hi Gytis, Flink currently does not support holding back individual streams; for example, it is not possible to align streams on (offset) event-time. However, the Flink community is working on a windowed join for the DataStream API that only holds the relevant tail of the stream as state. If your
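As a rough illustration of what "only holds the relevant tail of the stream as state" could mean, here is a plain-Python sketch of a time-bounded join. It is not the implementation from the pull request discussed in this thread: the `WindowedJoin` class, the `bound` parameter, and the `simulate` helper are hypothetical names. It assumes that two events match when they share a key and their timestamps differ by at most `bound`.

```python
class WindowedJoin:
    """Toy time-bounded join that prunes old state (hypothetical, not a Flink API)."""

    def __init__(self, bound):
        self.bound = bound
        self.state = {"left": [], "right": []}
        self.watermarks = {"left": float("-inf"), "right": float("-inf")}

    def on_event(self, side, key, ts, value):
        other = "right" if side == "left" else "left"
        # Join the new event against the retained tail of the other side.
        matches = []
        for (ots, okey, ovalue) in self.state[other]:
            if okey == key and abs(ots - ts) <= self.bound:
                pair = (value, ovalue) if side == "left" else (ovalue, value)
                matches.append((key,) + pair)
        self.state[side].append((ts, key, value))
        return matches

    def on_watermark(self, side, ts):
        self.watermarks[side] = max(self.watermarks[side], ts)
        combined = min(self.watermarks.values())
        # Future events have timestamps above the combined watermark, so an
        # event at or below (combined - bound) can never match again.
        horizon = combined - self.bound
        for s in self.state:
            self.state[s] = [e for e in self.state[s] if e[0] > horizon]

    def state_size(self):
        return sum(len(s) for s in self.state.values())


def simulate(n, bound):
    """Feed two aligned streams with timestamps 1..n; return (match count, peak state)."""
    join = WindowedJoin(bound)
    total, peak = 0, 0
    for ts in range(1, n + 1):
        total += len(join.on_event("left", "k", ts, f"L{ts}"))
        total += len(join.on_event("right", "k", ts, f"R{ts}"))
        join.on_watermark("left", ts)
        join.on_watermark("right", ts)
        peak = max(peak, join.state_size())
    return total, peak
```

With aligned inputs, the retained state after each watermark is bounded by the window size instead of growing with the length of the stream. If one input's watermark lags far behind, `horizon` does not advance and state still accumulates, which is the case the thread notes is not handled.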