Hi experts,
I'm reporting a problem with Spark GraphX. I submit Spark jobs through Zeppelin;
note that the Scala environment shares the same SparkContext and SQLContext
instances. I call the connected components algorithm as part of our business
logic, and I found that every time a job finishes, some graph storage remains
cached and is never released.
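For context, this is roughly how I observe the leftovers (a minimal, self-contained sketch; the local master and the toy graph are stand-ins for our real Zeppelin setup and data):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object LeftoverStorage {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("leftover-storage"))

    // Toy graph standing in for our real data.
    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(4L, 5L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Run connected components; this is where the leak shows up for us.
    graph.connectedComponents().vertices.count()

    // List everything still cached in the shared SparkContext.
    sc.getPersistentRDDs.foreach { case (id, rdd) =>
      println(s"RDD $id still cached at ${rdd.getStorageLevel}")
    }
    sc.stop()
  }
}
```

With a long-lived SparkContext shared across notebooks, these entries accumulate run after run.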
I see the following code toward the end of the method:
// Unpersist the RDDs hidden by newly-materialized RDDs
oldMessages.unpersist(blocking = false)
prevG.unpersistVertices(blocking = false)
prevG.edges.unpersist(blocking = false)
Wouldn't the above achieve the same effect?
Hi All,
I am trying Spark SQL on a dataset of ~16 TB with a large number of files (~50K).
Each file is roughly 400-500 MB.
I am issuing a fairly simple Hive query on the dataset with just filters
(no groupBys or joins), and the job is very slow: it runs for 7-8 hours
and processes only about 80-100 GB.
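One quick thing worth checking is whether the filter actually reaches the physical plan. A minimal sketch (the table name "events" and its columns are made up; substitute your Hive table):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object PlanCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("plan-check"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Tiny stand-in for the real table; "events" and its columns are invented.
    val df = sc.parallelize(Seq((1, "a"), (2, "b"), (3, "c"))).toDF("id", "value")
    df.registerTempTable("events")

    // A filter-only query like the one described above.
    val filtered = sqlContext.sql("SELECT * FROM events WHERE id < 3")

    // Inspect the physical plan: for a columnar source such as Parquet you
    // would look for pushed-down filters / partition pruning here.
    filtered.explain(true)
    println(filtered.count())
    sc.stop()
  }
}
```

If the plan shows a full scan with no pushdown, a columnar format with partitioning on the filter columns usually helps far more than tuning executors.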
We are running unit tests on our own Jenkins server, but we have run into some
problems with it. Could someone give me a detailed list of the configuration
used for the Spark Jenkins server? Thanks.
Hi Yash,
How about checking the executor (YARN container) log? Most of the time it shows
more details. We are using CDH; the log is at:
[yucai@sr483 container_1457699919227_0094_01_14]$ pwd
/mnt/DP_disk1/yucai/yarn/logs/application_1457699919227_0094/container_1457699919227_0094_01_14
Hi Yucai,
Thanks for the info. I have explored the container logs but did not get a lot
of information from them.
I have seen the following errors for a few containers, but I am not sure of
their cause:
1. java.lang.NullPointerException (DiskBlockManager.scala:167)
2. java.lang.ClassCastException: RegisterExec
It is possibly not the first failure. Could you increase the setting below and rerun?
spark.yarn.executor.memoryOverhead 4096
In my experience, Netty sometimes uses a lot of off-heap memory, which can push
the process over the container memory limit and get it killed by YARN's NodeManager.
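For reference, this is how we usually set it at submit time (a sketch; the class and jar names are placeholders, and 4096 MB is illustrative, tune it to your containers):

```shell
# Raise the off-heap headroom YARN reserves per executor so Netty's
# direct buffers fit under the container limit.
spark-submit \
  --master yarn \
  --conf spark.yarn.executor.memoryOverhead=4096 \
  --class com.example.YourJob \
  your-job.jar
```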
Thanks,
Yucai
Thanks Ted for replying.
Those three lines can't release the cache of the graph parameter; they only release g
( graph.mapVertices((vid, vdata) => vprog(vid, vdata, initialMsg)).cache() ).
In ConnectedComponents.scala, the graph parameter is cached into ccGraph, which is
never released inside Pregel:
def run[VD: ClassTag, ED: ClassTag](graph: Graph[VD, ED]): Graph[VD, ED] = {
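As a workaround until ccGraph is unpersisted upstream, I snapshot the persisted-RDD map before the call and unpersist whatever appears afterwards. A sketch (only getPersistentRDDs and unpersist are real API here; the toy graph is invented):

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

object CCWithCleanup {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(
      new SparkConf().setMaster("local[2]").setAppName("cc-with-cleanup"))

    val edges = sc.parallelize(Seq(Edge(1L, 2L, 1), Edge(3L, 4L, 1)))
    val graph = Graph.fromEdges(edges, defaultValue = 0)

    // Snapshot what is cached before the algorithm runs.
    val before = sc.getPersistentRDDs.keySet

    // Materialize the result we actually want to keep.
    val components = graph.connectedComponents().vertices.collect()

    // Unpersist every RDD the algorithm cached since the snapshot
    // (the result is already collected, so nothing we need is dropped).
    sc.getPersistentRDDs
      .filter { case (id, _) => !before.contains(id) }
      .values
      .foreach(_.unpersist(blocking = false))

    println(s"components: ${components.length}, " +
      s"cached after cleanup: ${sc.getPersistentRDDs.size}")
    sc.stop()
  }
}
```

This is blunt (it drops everything cached during the call), so it only works when the result is materialized before the cleanup, as above.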