For uniform partitioning, you can try a custom Partitioner.
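
To make that suggestion concrete, here is a minimal sketch of one way such a partitioner could look. It is an illustration under assumptions, not something taken from the thread. It assumes you can estimate a per-key weight (for example record counts from a sample), and the names `build_balanced_assignment`, `make_partition_func` and the sample weights are hypothetical. The logic is plain Python so it can be tried without a cluster.

```python
import heapq
import zlib
from collections import Counter


def build_balanced_assignment(key_weights, num_partitions):
    """Greedily map each key to the partition with the smallest load so far."""
    # Min-heap of (current_load, partition_index).
    heap = [(0, p) for p in range(num_partitions)]
    heapq.heapify(heap)
    assignment = {}
    # Place the heaviest keys first so they end up on different partitions.
    ordered = sorted(key_weights.items(), key=lambda kv: (-kv[1], str(kv[0])))
    for key, weight in ordered:
        load, partition = heapq.heappop(heap)
        assignment[key] = partition
        heapq.heappush(heap, (load + weight, partition))
    return assignment


def make_partition_func(assignment, num_partitions):
    """Return a function mapping a key to a partition index."""
    def partition_func(key):
        if key in assignment:
            return assignment[key]
        # Keys that were not in the estimate fall back to a stable hash.
        return zlib.crc32(repr(key).encode("utf-8")) % num_partitions
    return partition_func


# Hypothetical, skewed key weights (e.g. estimated record counts per key).
key_weights = {"a": 900, "b": 500, "c": 400, "d": 50, "e": 50}
num_partitions = 2

assignment = build_balanced_assignment(key_weights, num_partitions)
partition_func = make_partition_func(assignment, num_partitions)

# Check how evenly the estimated weight is spread across partitions.
loads = Counter()
for key, weight in key_weights.items():
    loads[partition_func(key)] += weight
print(dict(loads))
```

In PySpark, a function like this could be passed to a pair RDD as `rdd.partitionBy(num_partitions, partition_func)`. In Scala or Java the same logic would go into a subclass of `org.apache.spark.Partitioner`. Note that this only balances whole keys: if a single key is much larger than the rest, no assignment of keys to partitions will make the partitions uniform.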
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-OutOfMemoryError-Requested-array-size-exceeds-VM-limit-tp16809p26477.html
Sent from the Apache Spark User List mailing list archive at Nabble
"only option is to split your problem further by increasing parallelism" My
understanding is that this means increasing the number of partitions, is that right?
That didn't seem to help, because it seems the partitions are not uniformly
sized. My observation is that when I increase the number of partitions, it
c
63 spilling in-memory
> map of 3925 MB to disk (1 time so far)
> 14/10/11 13:05:17 INFO ExternalAppendOnlyMap: Thread 63 spilling in-memory
> map of 3925 MB to disk (2 times so far)
> 14/10/11 13:09:15 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID
> 1566)
> java.lang
ROR Executor: Exception in task 0.0 in stage 0.0
(TID 1566)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:2271)
at
java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
at
java.io.ByteArrayO
erIterator$BasicBlockFetcherIterator:
>>> Getting 1566 non-empty blocks out of 1566 blocks
>>> 14/10/11 13:00:16 INFO BlockFetcherIterator$BasicBlockFetcherIterator:
>>> Started 0 remote fetches in 4 ms
>>> 14/10/11 13:02:06 INFO ExternalAppendOnlyMap:
ndOnlyMap: Thread 63 spilling
>> in-memory map of 3925 MB to disk (1 time so far)
>> 14/10/11 13:05:17 INFO ExternalAppendOnlyMap: Thread 63 spilling
>> in-memory map of 3925 MB to disk (2 times so far)
>> 14/10/11 13:09:15 ERROR Executor: Exception in task 0.0 in stage 0.0 (TI
run(Executor.scala:187)
> at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> at java.lang.Thread.run(Thread.java:745)
> 14/10/11 13:09:15 ER
10/11 13:09:15 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 1566)
java.lang.OutOfMemoryError: Requested array size exceeds VM limit
at java.util.Arrays.copyOf(Arrays.java:2271)
at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.
kSetManager: Lost task 0.0 in stage 3.0
(TID 2028, idp11.foo.bar): java.lang.OutOfMemoryError: Requested array size
exceeds VM limit
java.util.Arrays.copyOf(Arrays.java:3230)
java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113)
java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayO