Yup, that was the problem.
Changing "mongo.input.split_size" from its default of 8MB to 100MB did the
trick.
Config reference:
https://github.com/mongodb/mongo-hadoop/wiki/Configuration-Reference
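
For anyone who finds this thread later, here is a rough sketch of why the larger split size helps. It assumes the number of input splits (and so the number of RDD partitions the driver has to compute) is roughly the collection size divided by the split size. The 500 GB collection size and the estimatedSplits helper are made up for illustration; they are not part of mongo-hadoop.

```scala
// Rough estimate only: assumes splits ~= ceil(collectionSize / splitSize).
// The collection size below is hypothetical, not taken from the real job.
object SplitEstimate {
  // Number of splits needed to cover `collectionMb` of data at `splitSizeMb` per split.
  def estimatedSplits(collectionMb: Long, splitSizeMb: Long): Long = {
    require(splitSizeMb > 0, "split size must be positive")
    (collectionMb + splitSizeMb - 1) / splitSizeMb // integer ceiling division
  }

  def main(args: Array[String]): Unit = {
    val collectionMb = 500L * 1024 // hypothetical 500 GB collection
    println(s"8 MB splits:   ${estimatedSplits(collectionMb, 8)}")
    println(s"100 MB splits: ${estimatedSplits(collectionMb, 100)}")
  }
}
```

Under that assumption, the driver would track about 12x fewer partitions at 100MB than at 8MB. As far as I understand from the reference above, the value is given in MB and set on the Hadoop Configuration that is passed to the input format.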
Thanks!
On Sat, Sep 12, 2015 at 3:15 PM, Richard Eggert
wrote:
Hmm... The count() method invokes this:
    def runJob[T, U: ClassTag](rdd: RDD[T], func: Iterator[T] => U): Array[U] = {
      runJob(rdd, func, 0 until rdd.partitions.length)
    }
It appears that you're running out of memory while trying to compute
(within the driver) the number of partitions that will b