We made some changes to the code (it now generates 1000 * 1000 elements) and capped the memory limits at 100M:
def generate = {
  for {
    j <- 1 to 10
    i <- 1 to 1000
  } yield (j, i)
}
~/soft/spark-1.1.0-bin-hadoop2.3/bin/spark-submit --master local \
  --executor-memory 100M --driver-memory 100M --class Spill -
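
For context, below is a minimal self-contained sketch of what such a Spill application might look like. The SparkContext setup and the groupByKey shuffle are assumptions on my side; the original only shows the generator.

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._  // pair-RDD implicits needed on Spark 1.1

object Spill {
  // Same generator as above: yields (j, i) key/value pairs.
  def generate = {
    for {
      j <- 1 to 10
      i <- 1 to 1000
    } yield (j, i)
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Spill")
    val sc = new SparkContext(conf)
    // Parallelize the generated pairs and force a shuffle.
    // groupByKey is an assumed stand-in for whatever shuffle the real job performs.
    val grouped = sc.parallelize(generate).groupByKey().mapValues(_.size)
    grouped.collect().foreach(println)
    sc.stop()
  }
}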
10M is tiny compared to all of the overhead of running a complex Scala-based
app in a JVM. I think you may be bumping up against practical minimum sizes,
and that the limiting factor may not really be the data size. I don't think
Spark really scales down this far.
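
For what it's worth, a less constrained local run might look something like the line below. The 512M figure is purely illustrative, not a tuned recommendation, and <app-jar> stands in for the application jar path. With --master local everything runs in the driver JVM, so --driver-memory is the setting that actually governs the heap:

~/soft/spark-1.1.0-bin-hadoop2.3/bin/spark-submit --master local \
  --driver-memory 512M --class Spill <app-jar>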
On Nov 22, 2014 2:24 PM, "rzykov" wrote:
Dear all,
Unfortunately I haven't received any response in the users forum, which is why
I decided to post this question here.
We have encountered failed jobs when working with large amounts of data. For
example, an application works perfectly with relatively small data, but when
the data grows by a factor of two, this appl