zhuzhurk commented on pull request #397:
URL: https://github.com/apache/flink-web/pull/397#issuecomment-740559997


   > It depends on how you read it. The graph is broken up at the blocking 
exchanges, so that the remaining regions are connected purely by pipelined 
exchanges.
   > On Tue, 8 Dec 2020, 10:03 Zhu Zhu, ***@***.***> wrote:
   > ***@***.**** commented on this pull request.
   > In _posts/2020-12-04-release-1.12.0.md ([#397 (comment)](https://github.com/apache/flink-web/pull/397#discussion_r538156425)):
   > > +
   > > +<hr>
   > > +
   > > +### Other Improvements
   > > +
   > > +**Migration of existing connectors to the new Data Source API**
   > > +
   > > +The previous release introduced a new Data Source API ([FLIP-27](https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface)), allowing to implement connectors that work both as bounded (batch) and unbounded (streaming) sources. In Flink 1.12, the community started porting existing source connectors to the new interfaces, starting with the FileSystem connector ([FLINK-19161](https://issues.apache.org/jira/browse/FLINK-19161)).
   > > +
   > > +<div class="alert alert-danger small" markdown="1">
   > > +<b>Attention:</b> The unified source implementations will be completely separate connectors that are not snapshot-compatible with their legacy counterparts.
   > > +</div>
   > > +
   > > +**Pipelined Region Scheduling ([FLIP-119](https://cwiki.apache.org/confluence/display/FLINK/FLIP-119+Pipelined+Region+Scheduling#FLIP119PipelinedRegionScheduling-BulkSlotAllocation))**
   > > +
   > > +Flink’s scheduler has been largely designed to address batch and streaming workloads separately. This release introduces a **unified** scheduling strategy that identifies sets of tasks connected via blocking data exchanges to break down the execution graph into _pipelined regions_. This allows to schedule each region only when there’s data to perform work and only deploy it once all the required resources are available; as well as to restart failed regions independently. In particular for batch jobs, the new strategy leads to more efficient resource utilization and eliminates deadlocks.
   >
   > Data exchange can be either pipelined or blocking. Pipelined regions are actually pipelined connected regions. Also from the blogpost, there is "A *pipelined region* is a subset of *subtasks* in the *ExecutionGraph* connected by *pipelined* data exchanges."
   
   If we read it that way, maybe "identifies blocking data exchanges to break 
down the execution graph" would be better? The phrase "identifies sets of tasks ..." 
puts the focus on the task sets, which, I feel, are the regions themselves.
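
To make the two readings concrete, here is a minimal sketch of how I understand the region construction: merge tasks across pipelined exchanges and treat blocking exchanges as the cut points. This is not Flink's actual scheduler code; the function `pipelined_regions`, the task names, and the `(producer, consumer, kind)` tuple format are all assumptions made up for illustration.

```python
from collections import defaultdict


def pipelined_regions(tasks, exchanges):
    """Group tasks into regions connected only by pipelined exchanges.

    tasks:     iterable of hashable task names (hypothetical).
    exchanges: iterable of (producer, consumer, kind) tuples, where kind
               is "pipelined" or "blocking" (hypothetical encoding).
    """
    # Union-find: every task starts out in its own region.
    parent = {t: t for t in tasks}

    def find(t):
        # Walk up to the representative, halving the path as we go.
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for producer, consumer, kind in exchanges:
        # Only pipelined exchanges merge regions; blocking ones are the cuts.
        if kind == "pipelined":
            parent[find(producer)] = find(consumer)

    regions = defaultdict(set)
    for t in parent:
        regions[find(t)].add(t)
    # Sort for a deterministic result.
    return sorted(sorted(r) for r in regions.values())


example = pipelined_regions(
    ["source", "map", "join", "sink"],
    [
        ("source", "map", "pipelined"),
        ("map", "join", "blocking"),
        ("join", "sink", "pipelined"),
    ],
)
print(example)  # [['join', 'sink'], ['map', 'source']]
```

In this sketch the blocking exchange between `map` and `join` is what gets "identified", and the task sets fall out as a result, which is why the wording about blocking data exchanges reads more naturally to me.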


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

