Zhu Zhu created FLINK-13056:
-------------------------------

             Summary: Optimize region failover performance on calculating vertices to restart
                 Key: FLINK-13056
                 URL: https://issues.apache.org/jira/browse/FLINK-13056
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.9.0
            Reporter: Zhu Zhu
            Assignee: Zhu Zhu

Currently, some region boundary structures are recalculated on every region failover. This calculation can be expensive, because its complexity grows with the number of execution edges. We tested a sample case with 8,000 vertices and 16,000,000 edges: calculating the vertices to restart took ~2.0s. (More details in https://docs.google.com/document/d/197Ou-01h2obvxq8viKqg4FnOnsykOEKxk3r5WrVBPuA/edit?usp=sharing)

We therefore propose caching the region boundary structures to improve region failover performance.
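
To illustrate the idea, here is a minimal sketch of caching a per-region boundary structure so that the edge scan runs at most once per region instead of on every failover. Everything in it is an assumption made for illustration: the class `RegionBoundaryCache`, its fields and methods, and the integer vertex and region ids are hypothetical and are not Flink's actual classes or API. It also follows only downstream (consumer) regions and assumes a static topology, so it is a simplification of what a real restart calculation needs.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a region boundary cache; not Flink's real implementation. */
class RegionBoundaryCache {
    // vertex id -> ids of the vertices that consume its output
    private final Map<Integer, List<Integer>> consumers;
    // vertex id -> id of the failover region that contains it (every vertex must be mapped)
    private final Map<Integer, Integer> regionOf;
    // region id -> ids of the vertices in that region
    private final Map<Integer, List<Integer>> verticesOf = new HashMap<>();
    // region id -> downstream regions, filled lazily on first use
    private final Map<Integer, Set<Integer>> downstreamCache = new HashMap<>();
    // number of times the expensive per-region edge scan has actually run
    int edgeScans = 0;

    RegionBoundaryCache(Map<Integer, List<Integer>> consumers, Map<Integer, Integer> regionOf) {
        this.consumers = consumers;
        this.regionOf = regionOf;
        for (Map.Entry<Integer, Integer> e : regionOf.entrySet()) {
            verticesOf.computeIfAbsent(e.getValue(), r -> new ArrayList<>()).add(e.getKey());
        }
    }

    /** Returns the regions that directly consume from the given region, scanning edges only once. */
    Set<Integer> downstreamRegions(int region) {
        Set<Integer> cached = downstreamCache.get(region);
        if (cached != null) {
            return cached; // cache hit: no edge traversal
        }
        edgeScans++;
        Set<Integer> result = new HashSet<>();
        List<Integer> vertices = verticesOf.getOrDefault(region, Collections.<Integer>emptyList());
        for (int vertex : vertices) {
            for (int consumer : consumers.getOrDefault(vertex, Collections.<Integer>emptyList())) {
                int consumerRegion = regionOf.get(consumer);
                if (consumerRegion != region) {
                    result.add(consumerRegion); // edge crosses the region boundary
                }
            }
        }
        downstreamCache.put(region, result);
        return result;
    }

    /** Breadth-first walk over the cached boundaries, starting from the failed region. */
    Set<Integer> regionsToRestart(int failedRegion) {
        Set<Integer> visited = new HashSet<>();
        Deque<Integer> queue = new ArrayDeque<>();
        visited.add(failedRegion);
        queue.add(failedRegion);
        while (!queue.isEmpty()) {
            int region = queue.poll();
            for (int downstream : downstreamRegions(region)) {
                if (visited.add(downstream)) {
                    queue.add(downstream);
                }
            }
        }
        return visited;
    }
}
```

With this layout, the first failover that touches a region pays the edge-proportional cost, and later failovers only walk the much smaller region-level sets. The cache would have to be invalidated if the topology could change, which this sketch does not handle.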