zhuzhurk commented on pull request #397: URL: https://github.com/apache/flink-web/pull/397#issuecomment-740559997
> It depends on how you read it. The graph is broken up at the blocking exchanges, so that the remaining regions are connected purely by pipelined exchanges.
>
> On Tue, 8 Dec 2020, 10:03 Zhu Zhu, <***@***.***> wrote:
>
> > ***@***.**** commented on this pull request.
> >
> > In _posts/2020-12-04-release-1.12.0.md <[#397 (comment)](https://github.com/apache/flink-web/pull/397#discussion_r538156425)>:
> >
> > > +
> > > +<hr>
> > > +
> > > +### Other Improvements
> > > +
> > > +**Migration of existing connectors to the new Data Source API**
> > > +
> > > +The previous release introduced a new Data Source API ([FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)), allowing to implement connectors that work both as bounded (batch) and unbounded (streaming) sources. In Flink 1.12, the community started porting existing source connectors to the new interfaces, starting with the FileSystem connector ([FLINK-19161](https://issues.apache.org/jira/browse/FLINK-19161)).
> > > +
> > > +<div class="alert alert-danger small" markdown="1">
> > > +<b>Attention:</b> The unified source implementations will be completely separate connectors that are not snapshot-compatible with their legacy counterparts.
> > > +</div>
> > > +
> > > +**Pipelined Region Scheduling ([FLIP-119](https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling#FLIP119PipelinedRegionScheduling-BulkSlotAllocation))**
> > > +
> > > +Flink’s scheduler has been largely designed to address batch and streaming workloads separately. This release introduces a **unified** scheduling strategy that identifies sets of tasks connected via blocking data exchanges to break down the execution graph into _pipelined regions_. This allows to schedule each region only when there’s data to perform work and only deploy it once all the required resources are available; as well as to restart failed regions independently. In particular for batch jobs, the new strategy leads to more efficient resource utilization and eliminates deadlocks.
> >
> > Data exchange can be either pipelined or blocking. Pipelined regions are actually pipelined connected regions. Also from the blogpost, there is "A *pipelined region* is a subset of *subtasks* in the *ExecutionGraph* connected by *pipelined* data exchanges."

If we read it like that, maybe "identifies blocking data exchanges to break down the execution graph" is better? "identifies sets of tasks ..." puts the focus on the task sets, which, I feel, represent the regions themselves.
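To make the two readings concrete, here is a minimal Java sketch of the idea, not Flink's actual scheduler code. It assumes tasks are numbered from 0 and that each exchange is flagged as blocking or pipelined; `PipelinedRegionSketch`, `Exchange`, and `regions` are hypothetical names made up for illustration. Blocking exchanges are treated as cut points, and tasks joined by pipelined exchanges end up in the same region.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PipelinedRegionSketch {

    /** Hypothetical edge type for illustration; not a Flink class. */
    static final class Exchange {
        final int producer;
        final int consumer;
        final boolean blocking;

        Exchange(int producer, int consumer, boolean blocking) {
            this.producer = producer;
            this.consumer = consumer;
            this.blocking = blocking;
        }
    }

    /** Union-find lookup with path halving. */
    private static int find(int[] parent, int x) {
        while (parent[x] != x) {
            parent[x] = parent[parent[x]];
            x = parent[x];
        }
        return x;
    }

    /** Groups tasks into regions: connected components over pipelined exchanges only. */
    static List<Set<Integer>> regions(int numTasks, List<Exchange> exchanges) {
        int[] parent = new int[numTasks];
        for (int i = 0; i < numTasks; i++) {
            parent[i] = i;
        }
        for (Exchange e : exchanges) {
            // A blocking exchange never merges its two sides: the graph is cut there.
            if (!e.blocking) {
                parent[find(parent, e.producer)] = find(parent, e.consumer);
            }
        }
        Map<Integer, Set<Integer>> byRoot = new LinkedHashMap<>();
        for (int t = 0; t < numTasks; t++) {
            byRoot.computeIfAbsent(find(parent, t), r -> new TreeSet<>()).add(t);
        }
        return new ArrayList<>(byRoot.values());
    }

    public static void main(String[] args) {
        // 0 -> 1 is pipelined, 1 -> 2 is blocking, 2 -> 3 is pipelined.
        List<Exchange> exchanges = new ArrayList<>();
        exchanges.add(new Exchange(0, 1, false));
        exchanges.add(new Exchange(1, 2, true));
        exchanges.add(new Exchange(2, 3, false));
        System.out.println(regions(4, exchanges)); // prints [[0, 1], [2, 3]]
    }
}
```

In this sketch the only thing the code identifies is the blocking exchanges; the task sets fall out as the result, which is why the "identifies blocking data exchanges" wording seems closer to me.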