Re: Customize file assignments logic in flink application

2019-08-26 Thread Lu Niu
Yes. you are right. SplittableIterator will cause each worker list all the files. thanks! best Lu On Fri, Aug 16, 2019 at 12:33 AM Zhu Zhu wrote: > Hi Lu, > > I think it's OK to choose any way as long as it works. > Though I've no idea how you would extend SplittableIterator in your case. > The

Re: Customize file assignments logic in flink application

2019-08-16 Thread Zhu Zhu
Hi Lu, I think it's OK to choose any way as long as it works. Though I've no idea how you would extend SplittableIterator in your case. The underlying is ParallelIteratorInputFormat and its processing is not matched to a certain subtask index. Thanks, Zhu Zhu Lu Niu 于2019年8月16日周五 上午12:48写道: >

Re: Customize file assignments logic in flink application

2019-08-15 Thread Lu Niu
Hi, Zhu Thanks for reply! I found using SplittableIterator is also doable to some extent. How to choose between these two? Best Lu On Wed, Aug 14, 2019 at 8:02 PM Zhu Zhu wrote: > Hi Lu, > > Implementing your own *InputFormat* and *InputSplitAssigner*(which has > the interface "InputSplit getN

Re: Customize file assignments logic in flink application

2019-08-14 Thread Zhu Zhu
Hi Lu, Implementing your own *InputFormat* and *InputSplitAssigner*(which has the interface "InputSplit getNextInputSplit(String host, int taskId)") created by it should work if you want to assign InputSplit to tasks according to the task index and file name patterns. To assign 2 *InputSplit*s in

Customize file assignments logic in flink application

2019-08-14 Thread Lu Niu
Hi, I have a data set backed by a directory of files in which file names are meaningful. folder1 +-file01 +-file02 +-file03 +-file04 I want to control the file assignments in my flink application. For example, when parallelism is 2, worker 1 get file01 and file02 to r