Hi,

CollectionInputFormat currently enforces a parallelism of 1 by implementing
NonParallelInput and serializing the entire Collection. If my understanding
is correct, this serialized InputFormat is often the reason a new job
exceeds the Akka message size limit.

As an alternative, the Collection elements could be serialized into multiple
InputSplits. Has this idea been considered and rejected?
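
To make the idea concrete, here is a rough sketch in plain Java. It is not
Flink code: CollectionSplitSketch, partition and serializeChunk are
hypothetical names, and plain Java serialization only stands in for whatever
serializer the real InputFormat would use. It just shows the elements being
divided into chunks that are serialized independently, so that no single
payload has to carry the whole Collection.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: divides a collection into
// contiguous chunks and serializes each chunk on its own.
final class CollectionSplitSketch {

    // Splits `elements` into at most `numSplits` contiguous chunks whose
    // sizes differ by at most one. An empty input yields one empty chunk.
    static <T> List<List<T>> partition(List<T> elements, int numSplits) {
        if (numSplits < 1) {
            throw new IllegalArgumentException("numSplits must be >= 1");
        }
        int n = elements.size();
        int chunks = Math.max(1, Math.min(numSplits, n));
        int base = n / chunks;       // minimum chunk size
        int remainder = n % chunks;  // the first `remainder` chunks get one extra element
        List<List<T>> result = new ArrayList<>(chunks);
        int start = 0;
        for (int i = 0; i < chunks; i++) {
            int size = base + (i < remainder ? 1 : 0);
            result.add(new ArrayList<>(elements.subList(start, start + size)));
            start += size;
        }
        return result;
    }

    // Serializes one chunk as an element count followed by the elements.
    // Assumption: Java serialization is used here purely for illustration.
    static <T extends Serializable> byte[] serializeChunk(List<T> chunk) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeInt(chunk.size());
            for (T element : chunk) {
                out.writeObject(element);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Each byte array produced by serializeChunk would correspond to the payload of
one split, so the size of any one message is bounded by the chunk size rather
than by the size of the whole Collection.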

Thanks,
Greg
