executors' memory in SparkSQL, on which we would do some calculation using
UDFs in pyspark. An excerpt of the SQL being run:

ARD_ACCOUNT_CITY_SRC, STANDARD_ACCOUNT_CITY_SRC)
/
CASE WHEN LENGTH(STANDARD_ACCOUNT_CITY_SRC) > LENGTH(STANDARD_ACCOUNT_CITY_SRC)
     THEN LENGTH(STANDARD_ACCOUNT_CITY_SRC)

If I run my SQL on only a portion of the data (filtering by one of the
attributes), let's say 800 million records, then all works well. But when I
run the same SQL on all the data, I receive "java.lang.OutOfMemoryError: GC
overhead limit exceeded" from basically all of the executors.
It seems to me
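A minimal Scala sketch of the pattern being described (the thread itself uses
pyspark; the table name, the second column, and the UDF are made up, and the
length ratio is only a stand-in for whatever the real UDF computes):

// sqlContext is the SQLContext provided by spark-shell / pyspark.
sqlContext.udf.register("lenRatio", (a: String, b: String) =>
  if (a == null || b == null || math.max(a.length, b.length) == 0) 0.0
  else math.min(a.length, b.length).toDouble / math.max(a.length, b.length))

// Works on a filtered subset (~800 million records):
sqlContext.sql(
  "SELECT lenRatio(STANDARD_ACCOUNT_CITY_SRC, ACCOUNT_CITY_SRC) FROM accounts WHERE region = 'X'"
).count()

// The same query over the full table is what triggers
// "GC overhead limit exceeded" on the executors.
sqlContext.sql(
  "SELECT lenRatio(STANDARD_ACCOUNT_CITY_SRC, ACCOUNT_CITY_SRC) FROM accounts"
).count()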
dataset successfully. I can see the output in HDFS once all Spark tasks are
done.
After the Spark tasks are done, the job appears to be running for over an
hour, until I get the following (full stack trace below):

java.lang.OutOfMemoryError: GC overhead limit exceeded
        at org.apache.parquet.format.converter.ParquetMetadataConverter.toParquetStatistics(ParquetMetadataConverter.java:238)

I had set the driver memory to be 20GB.
I attempted to
'An error occurred while calling {0}{1}{2}.\n'.
--> 300 format(target_id, '.', name), value)
301 else:
302 raise Py4JError(
Py4JJavaError: An error occurred while calling o65.partitions.
: java.lang.OutOfMemoryError: GC o
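The stack trace above points at driver-side handling of Parquet footer
statistics, which grows with the number of part files in the dataset. As a
hedged illustration only (the thread does not confirm this as the fix, and
the paths and partition count are placeholders), keeping the number of output
files small keeps that per-file footer metadata small as well:

// Illustrative sketch: fewer output files means fewer Parquet footers for the
// driver to read and convert via ParquetMetadataConverter.
val result = sqlContext.read.parquet("hdfs:///input/table")   // placeholder input
result.coalesce(400)                                          // placeholder file count
      .write
      .parquet("hdfs:///output/table")                        // placeholder output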
From: Dhaval Patel [mailto:dhaval1...@gmail.com]
Sent: Saturday, November 7, 2015 12:26 AM
To: Spark User Group
Subject: [sparkR] Any insight on java.lang.OutOfMemoryError: GC overhead limit
exceeded

I have been struggling with this error for the past 3 days and have tried all
possible ways/suggestion
broadcast_2_piece0 on localhost:39562 in memory (size: 2.4 KB, free: 530.0 MB)
15/11/06 10:45:20 INFO ContextCleaner: Cleaned accumulator 2
15/11/06 10:45:53 WARN ServletHandler: Error for /static/timeline-view.css
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.util.zip.Zip
1.2.0 is quite old.
You may want to try 1.5.1 which was released in the past week.
Cheers
On Oct 4, 2015, at 4:26 AM, t_ras wrote:
I get java.lang.OutOfMemoryError: GC overhead limit exceeded when trying a
count action on a file.
The file is a CSV file, 217GB in size.
I'm using 10 r3.8xlarge (Ubuntu) machines, CDH 5.3.6 and Spark 1.2.0.
Configuration:
spark.app.id:local-1443956477103
spark.app.name:Spark shell
spark.cores.max
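A minimal sketch of the failing action as described (the path is a
placeholder):

// Counting the lines of the ~217 GB CSV; the count() action is where the
// "GC overhead limit exceeded" shows up on this setup.
val lines = sc.textFile("hdfs:///data/big.csv")   // placeholder path
println(lines.count())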
Sent: Saturday, 11 July 2015, 03:58
To: Ted Yu; Robin East; user
Subject: Re: Spark GraphX memory requirements + java.lang.OutOfMemoryError: GC
overhead limit exceeded

Hello again.
So I could compute triangle numbers when running the code from the Spark shell
without workers (with --driver-memory 15g o
-1 signature from the html and in the reduce phase we keep the html that has
the shortest URL. However, after running for 2-3 hours the application
crashes due to a memory issue. Here is the exception:

15/07/15 18:24:05 WARN scheduler.TaskSetManager: Lost task 267.0 in
stage 0.0 (TID 267, psh-11.nse.ir): java.lang.OutOfMemoryError: GC
overhead limit exceeded
Hello again.
So I could compute triangle numbers when running the code from the Spark shell
without workers (with the --driver-memory 15g option), but with workers I get
errors. So I run the Spark shell:
./bin/spark-shell --master spark://192.168.0.31:7077 --executor-memory
6900m --driver-memory 15g
and workers (
Yep, I already found it. So I added one line:
val graph = GraphLoader.edgeListFile(sc, "", ...)
val newgraph = graph.convertToCanonicalEdges()
and could successfully count triangles on "newgraph". Next I will test it on
bigger (several GB) networks.
I am using Spark 1.3 and 1.4 but haven't seen
See SPARK-4917 which went into Spark 1.3.0
On Fri, Jun 26, 2015 at 2:27 AM, Robin East wrote:
You’ll get this issue if you just take the first 2000 lines of that file. The
problem is triangleCount() expects srcId < dstId, which is not the case in the
file (e.g. vertex 28). You can get round this by calling
graph.convertToCanonicalEdges(), which removes bi-directional edges and ensures
srcId < dstId.
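Putting the pieces of this thread together, a sketch of the sequence being
discussed (the edge-list path is a placeholder):

import org.apache.spark.graphx.GraphLoader

val graph = GraphLoader.edgeListFile(sc, "hdfs:///data/edges.txt")  // placeholder path
// convertToCanonicalEdges() (Spark 1.3+, SPARK-4917) rewrites every edge so
// that srcId < dstId and merges any resulting duplicates, which is the form
// triangleCount() expects.
val newgraph = graph.convertToCanonicalEdges()
val triangles = newgraph.triangleCount().vertices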
OK, but what does it mean? I did not change the core files of Spark, so is
it a bug there?
PS: on small datasets (<500 MB) I have no problem.
On 25.06.2015 at 18:02, "Ted Yu" wrote:
The assertion failure from TriangleCount.scala corresponds with the
following lines:

g.outerJoinVertices(counters) {
  (vid, _, optCounter: Option[Int]) =>
    val dblCount = optCounter.getOrElse(0)
    // double count should be even (divisible by two)
    assert((dblCount & 1) == 0)
    dblCount / 2
}
Hello!
I am trying to compute the number of triangles with GraphX, but I get a memory
or heap size error even though the dataset is very small (1GB). I run the code
in spark-shell on a 16GB RAM machine (I also tried with 2 workers on separate
machines with 8GB RAM each). So I have 15x more memory than the dat
Hello All,
I have a Spark job that throws "java.lang.OutOfMemoryError: GC overhead
limit exceeded".
The job is trying to process a file of size 4.5G.
I've tried the following Spark configuration:
--num-executors 6 --executor-memory 6G --executor-cores 6 --driver-memory 3G
I tried
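For reference, the same executor settings expressed as Spark properties (a
sketch; on the command line they are normally passed exactly as the flags
above, and --driver-memory in particular has to be given at submit time,
before the driver JVM starts):

import org.apache.spark.SparkConf

// Equivalent of --num-executors 6 --executor-memory 6G --executor-cores 6 (on YARN).
val conf = new SparkConf()
  .set("spark.executor.instances", "6")
  .set("spark.executor.memory", "6g")
  .set("spark.executor.cores", "6")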
Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded

I have yarn configured with yarn.nodemanager.vmem-check-enabled=false and
yarn.nodemanager.pmem-check-enabled=false to avoid yarn killing the
containers.
The stack trace is below.
thanks,
Antony.

15/01/27 17:02:53 ERROR executor.Executor: Exception in task 21.0 in
stage 12.0 (TID 1312)
java.lang.OutOfMemoryError: GC overhead limit exceeded
        at java.lang.Integer.valueOf(Integer.java:642)
        at scala.runtime.BoxesRunTime.boxToInteger(BoxesRunTime.java:70)
        at
Can you attach the logs where this is failing?
From: Sven Krasser
Date: Tuesday, January 27, 2015 at 4:50 PM
To: Guru Medasani
Cc: Sandy Ryza, Antony Mayi, "user@spark.apache.org"
Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Since it's an executor
Date: Tuesday, January 27, 2015 at 3:33 PM
To: Antony Mayi
Cc: "user@spark.apache.org"
Subject: Re: java.lang.OutOfMemoryError: GC overhead limit exceeded
Hi Antony,
If you look in the YARN NodeManager logs, do you see that it's killing the
executors? Or are they crashing for a different reason?
-Sandy
On Tue, Jan 27, 2015 at 12:43 PM, Antony Mayi
wrote:
Hi,
I am using spark.yarn.executor.memoryOverhead=8192 yet getting executors
crashed with this error.
Does that mean I genuinely do not have enough RAM, or is this a matter of
config tuning?
Other config options used:
spark.storage.memoryFraction=0.3
SPARK_EXECUTOR_MEMORY=14G
running spark 1.2.0 as yarn
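For context: on YARN the container requested for each executor is roughly
spark.executor.memory plus spark.yarn.executor.memoryOverhead, so with the
settings above each executor asks YARN for about 14 GB + 8 GB ≈ 22 GB, while
the heap actually available to tasks is still the 14 GB (of which
spark.storage.memoryFraction=0.3 reserves roughly 30% for cached blocks under
Spark 1.2's memory manager).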
Hi guys,
My Spark Streaming application hits this "java.lang.OutOfMemoryError: GC
overhead limit exceeded" error in the Spark Streaming driver program. I have
done the following to debug it:
1. Increased the driver memory from 1GB to 2GB; the error then came after 22
hrs. When the memory w
I have a 40-node CDH 5.1 cluster and am attempting to run a simple Spark app
that processes about 10-15GB of raw data, but I keep running into this error:
java.lang.OutOfMemoryError: GC overhead limit exceeded
Each node has 8 cores and 2GB memory. I notice the heap size on the
executors is set to 512MB
Thanks, Abel.
Best,
Yifan LI
On Jul 21, 2014, at 4:16 PM, Abel Coronado Iruegas
wrote:
Hi Yifan
This works for me:
export SPARK_JAVA_OPTS="-Xms10g -Xmx40g -XX:MaxPermSize=10g"
export ADD_JARS=/home/abel/spark/MLI/target/MLI-assembly-1.0.jar
export SPARK_MEM=40g
./spark-shell
Regards
On Mon, Jul 21, 2014 at 7:48 AM, Yifan LI wrote:
Hi,
I am trying to load the GraphX example dataset (LiveJournal, 1.08GB) through
the Scala shell on my standalone multicore machine (8 CPUs, 16GB mem), but an
OutOfMemory error was returned when the code below was running:
val graph = GraphLoader.edgeListFile(sc, path, minEdgePartitions =
16).partitionB
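A hedged reconstruction of the truncated call above (the path and the
partition strategy are assumptions; minEdgePartitions is the parameter name
from the Spark 1.0-era API quoted here, renamed numEdgePartitions in later
releases):

import org.apache.spark.graphx.{GraphLoader, PartitionStrategy}

val path = "hdfs:///data/soc-LiveJournal1.txt"      // placeholder for the LiveJournal edge list
val graph = GraphLoader
  .edgeListFile(sc, path, minEdgePartitions = 16)
  .partitionBy(PartitionStrategy.RandomVertexCut)   // assumed; the original strategy is cut off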
e objects similar to MapReduce (HadoopRDD does this by actually using
Hadoop's Writables, for instance), but the general Spark APIs don't support
this because mutable objects are not friendly to caching or serializing.
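A small illustration of the reuse point above (the path is a placeholder):
HadoopRDD hands back the same Writable instances for every record, so the
values have to be copied out before they are cached or collected.

import org.apache.hadoop.io.Text

// sequenceFile returns an RDD over *reused* Text objects...
val raw = sc.sequenceFile("hdfs:///input/seq", classOf[Text], classOf[Text])
// ...so materialize immutable copies before caching:
val safe = raw.map { case (k, v) => (k.toString, v.toString) }
safe.cache()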
On Tue, Jul 8, 2014 at 9:27 AM, Konstantin Kudryavtsev
<kudryavtsev.konstan...@gmail.com> wrote:
Hi all,
I ran into the following exception during the map step:
java.lang.OutOfMemoryError (java.lang.OutOfMemoryError: GC overhead limit
exceeded)
        java.lang.reflect.Array.newInstance(Array.java:70)
        com.esotericsoftware.kryo.serializers.DefaultArraySerializers$ObjectArraySerializer.read