Hello,

I am running a batch Flink job to read an Iceberg table. I want to
understand a few things.

1. How does the FlinkSplitPlanner decide which FileScanTasks (I think one
task corresponds to one data file) are combined into a single split, and
when does it create a new split?

2. When the number of task slots is limited, what is the sequence in which
the splits are assigned to the task slots?
For example, if there are 4 task slots available but the number of splits
(source parallelism) to be read is 8, which 4 splits will be sent to the
task slots first? Where in the codebase does this logic exist?

I would appreciate any docs or pointers to the codebase that could help me
understand the above.

Thanks
Chetas
