Hi Lu,

If you want to assign *InputSplit*s to tasks according to the task index and file name patterns, implementing your own *InputFormat*, together with the *InputSplitAssigner* it creates (whose interface is "InputSplit getNextInputSplit(String host, int taskId)"), should work. To hand out 2 *InputSplit*s in one request, you can implement a new *InputSplit* that wraps multiple *FileInputSplit*s. You will probably also need to define in your *InputFormat* how to process the new *InputSplit*.
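To make the idea more concrete, here is a rough, self-contained sketch in plain Java. It does not use the Flink classes directly: `FileGroupSplit` and `TaskIndexSplitAssigner` are hypothetical stand-ins that in a real job would implement Flink's *InputSplit* and *InputSplitAssigner* (and would wrap *FileInputSplit*s rather than file names). The sketch assumes task indexes start at 0, that the file list is already sorted by name, and that returning null means "no more splits for this task".

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical split that bundles several files so a task receives them in one request. */
final class FileGroupSplit {
    private final int splitNumber;
    private final List<String> files;

    FileGroupSplit(int splitNumber, List<String> files) {
        this.splitNumber = splitNumber;
        // Defensive copy so the split does not depend on the caller's list.
        this.files = Collections.unmodifiableList(new ArrayList<>(files));
    }

    int getSplitNumber() {
        return splitNumber;
    }

    List<String> getFiles() {
        return files;
    }
}

/** Hypothetical assigner that maps contiguous groups of files to task indexes. */
final class TaskIndexSplitAssigner {
    private final Map<Integer, FileGroupSplit> pending = new HashMap<>();

    TaskIndexSplitAssigner(List<String> sortedFiles, int parallelism) {
        // Files per task, rounded up so that every file is assigned.
        int perTask = (sortedFiles.size() + parallelism - 1) / parallelism;
        for (int task = 0; task < parallelism; task++) {
            int from = Math.min(task * perTask, sortedFiles.size());
            int to = Math.min(from + perTask, sortedFiles.size());
            if (from < to) {
                pending.put(task, new FileGroupSplit(task, sortedFiles.subList(from, to)));
            }
        }
    }

    /** Same shape as the method quoted above: one group per task, then null. */
    synchronized FileGroupSplit getNextInputSplit(String host, int taskId) {
        return pending.remove(taskId);
    }
}
```

With file01 to file04 and parallelism 2, task 0 would get file01 and file02 in a single split and task 1 would get file03 and file04; a second request from the same task returns null. The host argument is ignored here because the assignment is driven only by the task index.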
Thanks,
Zhu Zhu

Lu Niu <qqib...@gmail.com> wrote on Thu, Aug 15, 2019 at 12:26 AM:

> Hi,
>
> I have a data set backed by a directory of files in which file names are
> meaningful.
>
> folder1
> +-----file01
> +-----file02
> +-----file03
> +-----file04
>
> I want to control the file assignments in my flink application. For
> example, when parallelism is 2, worker 1 get file01 and file02 to read and
> worker2 get 3 and 4. Also each worker get 2 files all at once because
> reading requires jumping back and forth between those two files.
>
> What's the best way to do this? It seems like FileInputFormat is not
> extensible in this case.
>
> Best
> Lu