Yes, you are right. SplittableIterator will cause each worker to list all the
files. Thanks!
Best,
Lu
On Fri, Aug 16, 2019 at 12:33 AM Zhu Zhu wrote:
Hi Lu,
I think it's OK to choose any way as long as it works.
Though I've no idea how you would extend SplittableIterator in your case.
The underlying input format is ParallelIteratorInputFormat, and its
processing is not tied to a particular subtask index.
Thanks,
Zhu Zhu
On Fri, Aug 16, 2019 at 12:48 AM Lu Niu wrote:
Hi, Zhu
Thanks for the reply! I found that using SplittableIterator is also doable to
some extent. How should I choose between the two?
Best
Lu
On Wed, Aug 14, 2019 at 8:02 PM Zhu Zhu wrote:
Hi Lu,
Implementing your own *InputFormat*, together with the *InputSplitAssigner*
it creates (which has the interface "InputSplit getNextInputSplit(String
host, int taskId)"), should work if you want to assign InputSplits to tasks
according to the task index and file name patterns.
To assign 2 *InputSplit*s in
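
The assignment idea above can be sketched in plain Java, without any Flink dependency, as follows. This is only an illustration under assumptions: the class name TaskIndexFileAssigner and the method getNextFile are hypothetical, file names stand in for InputSplits, files are sorted by name and cut into contiguous blocks, and taskId is taken to be 0-based. A real implementation would implement Flink's InputSplitAssigner and return InputSplit objects instead of strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

/**
 * Hypothetical stand-in for a custom InputSplitAssigner: it hands out file
 * names per task index. It does NOT implement the Flink interface; it only
 * mirrors the shape of getNextInputSplit(String host, int taskId).
 */
final class TaskIndexFileAssigner {
    // One queue of pending file names per task index.
    private final List<Deque<String>> queues = new ArrayList<>();

    TaskIndexFileAssigner(List<String> fileNames, int parallelism) {
        if (parallelism <= 0) {
            throw new IllegalArgumentException("parallelism must be positive");
        }
        // Sort a copy so the assignment depends only on the file names.
        List<String> sorted = new ArrayList<>(fileNames);
        Collections.sort(sorted);
        for (int i = 0; i < parallelism; i++) {
            queues.add(new ArrayDeque<>());
        }
        // Assumption: contiguous blocks of ceil(n / parallelism) files per task.
        int perTask = (sorted.size() + parallelism - 1) / parallelism;
        for (int i = 0; i < sorted.size(); i++) {
            queues.get(i / perTask).add(sorted.get(i));
        }
    }

    /** Returns the next file for the given task, or null if it has none left. */
    synchronized String getNextFile(String host, int taskId) {
        // The host is ignored: assignment here is driven by the task index only.
        return queues.get(taskId).poll();
    }
}
```

With four files and parallelism 2, task 0 would receive file01 and file02 and task 1 would receive file03 and file04. The listing of the directory happens once, where the assigner is constructed, rather than in every worker.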
Hi,
I have a data set backed by a directory of files in which file names are
meaningful.
folder1
+-file01
+-file02
+-file03
+-file04
I want to control the file assignment in my Flink application. For
example, when parallelism is 2, worker 1 gets file01 and file02 to r