[ https://issues.apache.org/jira/browse/FLINK-13247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-13247:
-----------------------------------
    Labels: auto-deprioritized-major auto-deprioritized-minor  (was: auto-deprioritized-major stale-minor)
    Priority: Not a Priority  (was: Minor)

This issue was labeled "stale-minor" 7 days ago and has not received any updates, so it is being deprioritized. If this ticket is actually Minor, please raise the priority and ask a committer to assign you the issue or revive the public discussion.

> Implement external shuffle service for YARN
> -------------------------------------------
>
>                 Key: FLINK-13247
>                 URL: https://issues.apache.org/jira/browse/FLINK-13247
>             Project: Flink
>          Issue Type: New Feature
>          Components: Runtime / Network
>            Reporter: MalcolmSanders
>            Priority: Not a Priority
>              Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Flink batch job users could achieve better cluster utilization and job
> throughput through an external shuffle service, because the producers of
> intermediate result partitions can be released once those partitions have
> been persisted on disk. In
> [FLINK-10653|https://issues.apache.org/jira/browse/FLINK-10653], [~zjwang]
> introduced a pluggable shuffle manager architecture, which abstracts the
> process of data transfer between stages from the Flink runtime as a shuffle
> service. I propose a YARN implementation of the Flink external shuffle
> service, since YARN is widely used in various companies.
> The basic idea is as follows:
> (1) Producers write intermediate result partitions to local disks assigned
> by the NodeManager;
> (2) YARN shuffle servers, deployed on each NodeManager as an auxiliary
> service, are notified of the intermediate result partition descriptions by
> the producers;
> (3) Consumers fetch intermediate result partitions from the YARN shuffle
> servers.
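
The three steps in the issue description can be sketched roughly as below. This is only an illustration of the proposed flow, not the design attached to the ticket: the class and method names (YarnShuffleSketch, ShuffleServer, registerPartition, fetchPartition, producePartition) are hypothetical and are not Flink or YARN APIs, and an in-memory map plus a temporary directory stand in for the NodeManager auxiliary service and its local disks.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of the three-step flow; none of these names are Flink or YARN APIs. */
public class YarnShuffleSketch {

    /** Stand-in for the shuffle server that would run inside each NodeManager. */
    static final class ShuffleServer {
        // Maps a partition id to the local file that holds its persisted data.
        private final Map<String, Path> partitions = new ConcurrentHashMap<>();

        /** Step 2: a producer announces a partition it has persisted. */
        void registerPartition(String partitionId, Path file) {
            partitions.put(partitionId, file);
        }

        /** Step 3: a consumer fetches a persisted partition by id. */
        byte[] fetchPartition(String partitionId) throws IOException {
            Path file = partitions.get(partitionId);
            if (file == null) {
                throw new IOException("Unknown partition: " + partitionId);
            }
            return Files.readAllBytes(file);
        }
    }

    /** Steps 1 and 2: write the partition to a local directory, then register it. */
    static void producePartition(ShuffleServer server, Path localDir,
                                 String partitionId, byte[] data) throws IOException {
        Path file = localDir.resolve(partitionId + ".data");
        Files.write(file, data);
        server.registerPartition(partitionId, file);
        // From here on the data is served by the shuffle server,
        // so the producer's resources could be released.
    }

    public static void main(String[] args) throws IOException {
        ShuffleServer server = new ShuffleServer();
        // Assumption: a temp directory plays the role of a NodeManager local dir.
        Path localDir = Files.createTempDirectory("nm-local-dir");
        producePartition(server, localDir, "partition-0",
                "hello".getBytes(StandardCharsets.UTF_8));
        byte[] fetched = server.fetchPartition("partition-0");
        System.out.println(new String(fetched, StandardCharsets.UTF_8)); // prints "hello"
    }
}
```

The point of the sketch is that the consumer only talks to the shuffle server, never to the producer, which is what would allow producers to be released as soon as their partitions are on disk.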