KevinyhZou created FLINK-25335:
----------------------------------

             Summary: Improvement of task deployment by enabling async source 
split enumeration
                 Key: FLINK-25335
                 URL: https://issues.apache.org/jira/browse/FLINK-25335
             Project: Flink
          Issue Type: Sub-task
          Components: Runtime / Coordination
    Affects Versions: 1.12.1
            Reporter: KevinyhZou


When an OLAP query is submitted through the Flink client to a Flink session 
cluster, the JobMaster starts scheduling and enumerates the Hive source splits 
with `HiveSourceFileEnumerator`, and only then deploys and executes the query 
tasks. If the source table has many partitions and the partition files are 
large, split enumeration takes a long time, which blocks task deployment and 
execution for a long time, and the dashboard does not appear during that time.

 

The JobMaster should enumerate the Hive splits asynchronously while it deploys 
and executes the query tasks. Once deployment is finished, the source operators 
fetch splits and read data while split enumeration continues.
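
The proposed behaviour can be sketched roughly as follows. This is a minimal 
illustration in plain Java, not Flink code: the class 
`AsyncSplitEnumerationSketch`, the method `enumerateSplits`, the 
`END_OF_SPLITS` sentinel and the split names are all hypothetical, and a 
`BlockingQueue` stands in for whatever mechanism would hand splits to the 
source operators.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;

public class AsyncSplitEnumerationSketch {
    // Hypothetical sentinel that tells the reader enumeration has finished.
    private static final String END_OF_SPLITS = "__END__";

    // Stand-in for a slow enumerator that lists files partition by partition.
    static void enumerateSplits(int numPartitions, BlockingQueue<String> queue) {
        try {
            for (int p = 0; p < numPartitions; p++) {
                Thread.sleep(10); // simulate slow file listing
                queue.put("partition-" + p + "/split-0");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            // The queue is unbounded, so offer() always succeeds.
            queue.offer(END_OF_SPLITS);
        }
    }

    // Starts enumeration in the background and consumes splits as they arrive.
    static List<String> run(int numPartitions) throws Exception {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        ExecutorService ioPool = Executors.newSingleThreadExecutor();
        try {
            // Enumeration runs off the calling thread, so "deployment" is not blocked.
            CompletableFuture<Void> enumeration =
                    CompletableFuture.runAsync(() -> enumerateSplits(numPartitions, queue), ioPool);

            // Stand-in for a deployed source operator: read splits while
            // enumeration is still in progress.
            List<String> consumed = new ArrayList<>();
            while (true) {
                String split = queue.take();
                if (END_OF_SPLITS.equals(split)) {
                    break;
                }
                consumed.add(split);
            }
            enumeration.get(); // surface any enumeration failure
            return consumed;
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(3));
    }
}
```

In this sketch the consumer starts reading as soon as the first split is 
available instead of waiting for the full list, which is the effect the 
proposal aims for; how this would map onto the JobMaster and the Hive source 
is left open here.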



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
