Re: Understanding pipelined regions

2022-12-26 Thread Raihan Sunny
Thanks for the clarification Gen. On Fri, Dec 23, 2022, 14:20 Gen Luo wrote: > Hi Raihan, > If the job is a streaming job, all exchanges are pipeline exchanges. That > means the upstream and downstream tasks must be running at the same time > and exchange data realtimely. While blocking data exc

Re: Understanding pipelined regions

2022-12-23 Thread Gen Luo
Hi Raihan, If the job is a streaming job, all exchanges are pipeline exchanges. That means the upstream and downstream tasks must be running at the same time and exchange data realtimely. While blocking data exchange means the upstream task will store the result somewhere(on the local disk by defau

Re: Understanding pipelined regions

2022-12-21 Thread Raihan Sunny
Hello Gen, I do have a basic understanding of fault tolerance and how checkpointing is used to achieve it. What I'm confused about from reading the blog post [1] is what are considered blocking data exchanges. Is windowing in a streaming job considered a blocking data exchange? In a job with a DAG

Re: Understanding pipelined regions

2022-12-20 Thread Gen Luo
Hi Raihan, As the description of PipelinedRegion says, a pipelined region is a set of vertices connected via pipelined data exchanges. For example in a job with such a dag A->B, both of the tasks have two subtasks. If the edge between A and B is a forward edge, there are two pipelined regions: (A1

RE: Understanding pipelined regions

2022-12-20 Thread Schwalbe Matthias
Hi Sunny, Welcome to Flink 😊. The next thing for you to consider is to setup checkpointing [1] which allows a failing job to pick up from where it stopped. Sincere greetings from the supposed close-by Zurich 😊 Thias [1] https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastre