Re: Identify orphan records after joining two streams

2019-04-26 Thread Averell
Hi Dawid, I just tried to change from CoProcessFunction with onTimer() to ProcessWindowFunction with Trigger and TumblingWindow. So I can key my stream by (id) instead of (id, eventTime). With this, I can use /reinterpretAsKeyedStream/, and hope that it would give better performance. I can also us

Re: Identify orphan records after joining two streams

2019-04-25 Thread Dawid Wysakowicz
Hi Averell, I think your original solution is the right one, given your requirements. I don't think it is over complicated. As for the memory concerns, there is no bult-in mechanism for backpressure/alignment based on event time. The community did take that into consideration when discussing the

Re: Identify orphan records after joining two streams

2019-04-18 Thread Averell
Thank you Hecheng. I just tried to use Table API as your suggestion, and it almost worked (it worked with two issues here below): - I only get the output when my event-time watermark goes pass the end of the tumbling window. But, because I know that there are maximum 2 records per window (one

Re: Identify orphan records after joining two streams

2019-04-15 Thread Hequn Cheng
Hi Averell, > I feel that it's over-complicated I think a Table API or SQL[1] job can also achieve what you want. Probably more simple and takes up less code. The workflow looks like: 1. union all two source tables. You may need to unify the schema of the two tables as union all can only used to u

Identify orphan records after joining two streams

2019-04-15 Thread Averell
Hello, I have two data streams, and want to join them using a tumbling window. Each of the streams would have at most one record per window. There is also a requirement to log/save the records that don't have a companion from the other stream. What would be the best option for my case? Would that