Jiayi Liao created FLINK-19011:
----------------------------------

             Summary: Parallelize the restoreOperation in OperatorStateBackend 
                 Key: FLINK-19011
                 URL: https://issues.apache.org/jira/browse/FLINK-19011
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.11.1
            Reporter: Jiayi Liao


To restore its state, a union state has to read the state handles produced by 
all operators. Currently, during the restore operation, Flink iterates over the 
state handles one by one, which can take tens of minutes when the number of 
state handles exceeds ten thousand. 

To accelerate the process, I propose parallelizing the random reads on HDFS 
and the deserialization. We can create a runnable for each state handle and let 
it return the metadata and the deserialized data, which can then be aggregated 
in the main thread.
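
A minimal sketch of the idea, under stated assumptions: StateHandleStub, 
RestoredChunk, readAndDeserialize and restoreInParallel are hypothetical names 
used for illustration only and are not Flink classes or methods, and the 
"name:v1,v2" payload is a toy format, not Flink's serialization format. Each 
handle is read and deserialized in a worker thread, and the results are merged 
in the calling thread in submission order, so the aggregation itself stays 
single-threaded.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelRestoreSketch {

    /** Hypothetical stand-in for an operator state handle; not a Flink class. */
    interface StateHandleStub {
        // Stands in for the remote read (for example, a random read on HDFS).
        byte[] readAll() throws Exception;
    }

    /** Hypothetical per-handle result: metadata (state name) plus deserialized values. */
    static final class RestoredChunk {
        final String stateName;
        final List<String> values;

        RestoredChunk(String stateName, List<String> values) {
            this.stateName = stateName;
            this.values = values;
        }
    }

    /** Reads one handle and deserializes it; assumes the toy format "name:v1,v2,v3". */
    static RestoredChunk readAndDeserialize(StateHandleStub handle) throws Exception {
        String text = new String(handle.readAll(), StandardCharsets.UTF_8);
        String[] parts = text.split(":", 2);
        List<String> values = new ArrayList<>(Arrays.asList(parts[1].split(",")));
        return new RestoredChunk(parts[0], values);
    }

    /** Submits one task per handle, then aggregates the results in the calling thread. */
    static Map<String, List<String>> restoreInParallel(
            List<StateHandleStub> handles, int threads) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<RestoredChunk>> futures = new ArrayList<>();
            for (StateHandleStub handle : handles) {
                // The lambda returns a value, so it is submitted as a Callable.
                futures.add(pool.submit(() -> readAndDeserialize(handle)));
            }
            // Merge in submission order so the result does not depend on
            // which worker happens to finish first.
            Map<String, List<String>> merged = new LinkedHashMap<>();
            for (Future<RestoredChunk> future : futures) {
                RestoredChunk chunk = future.get(); // rethrows task failures
                merged.computeIfAbsent(chunk.stateName, k -> new ArrayList<>())
                      .addAll(chunk.values);
            }
            return merged;
        } finally {
            pool.shutdown();
        }
    }
}
```

Because Future.get() wraps a task failure in an ExecutionException and 
rethrows it, a failed read on any single handle still fails the whole restore 
in this sketch. The thread count would probably need to be configurable so 
that a restore does not overload the file system.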





--
This message was sent by Atlassian Jira
(v8.3.4#803005)
