spark graphx storage RDD memory leak

2016-04-10 Thread zhang juntao
Hi experts, I'm reporting a problem with Spark GraphX. I use Zeppelin to submit Spark jobs; note that the Scala environment shares the same SparkContext and SQLContext instances. I call the connected components algorithm for some business logic, and found that every time a job finishes, some graph storage RDDs are left cached and never released.
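
A minimal Scala sketch of the scenario (the edge-list path and graph here are illustrative, not from the original report); after the algorithm finishes, the leftover storage entries can be seen via sc.getPersistentRDDs:

    import org.apache.spark.graphx.GraphLoader

    // Hypothetical input; any cached graph reproduces the behavior.
    val graph = GraphLoader.edgeListFile(sc, "hdfs:///tmp/edges.txt").cache()
    val cc = graph.connectedComponents()
    cc.vertices.count() // force the job to run

    // Intermediate graph RDDs from Pregel remain in storage after the job:
    sc.getPersistentRDDs.foreach { case (id, rdd) => println(s"$id -> $rdd") }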

Re: spark graphx storage RDD memory leak

2016-04-10 Thread Ted Yu
I see the following code toward the end of the method:

    // Unpersist the RDDs hidden by newly-materialized RDDs
    oldMessages.unpersist(blocking = false)
    prevG.unpersistVertices(blocking = false)
    prevG.edges.unpersist(blocking = false)

Wouldn't the above achieve the same effect?
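
For context, a condensed sketch of the loop in Pregel.apply where those lines run (paraphrased from Spark 1.x Pregel.scala; abridged, not verbatim):

    var g = graph.mapVertices((vid, vdata) => vprog(vid, vdata, initialMsg)).cache()
    var messages = g.mapReduceTriplets(sendMsg, mergeMsg)
    var activeMessages = messages.count()
    var prevG: Graph[VD, ED] = null
    var i = 0
    while (activeMessages > 0 && i < maxIterations) {
      prevG = g
      g = g.joinVertices(messages)(vprog).cache()
      val oldMessages = messages
      messages = g.mapReduceTriplets(
        sendMsg, mergeMsg, Some((oldMessages, activeDirection))).cache()
      activeMessages = messages.count()
      // Unpersist the RDDs hidden by newly-materialized RDDs
      oldMessages.unpersist(blocking = false)
      prevG.unpersistVertices(blocking = false)
      prevG.edges.unpersist(blocking = false)
      i += 1
    }
    g

Note that only prevG, the previous iteration's graph, is ever unpersisted; the graph originally passed into Pregel is not, which is the point zhang juntao makes in the forwarded reply below.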

Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yash Sharma
Hi all, I am trying Spark SQL on a dataset of ~16 TB spread over a large number of files (~50K), each roughly 400-500 MB. I am issuing a fairly simple Hive query on the dataset with just filters (no groupBys or joins), and the job is very, very slow. It runs for 7-8 hours and processes only about 80-100 GB
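
A minimal sketch of the kind of job described (table and column names are hypothetical); note that with ~50K input files the scan stage runs at least one task per file split, so per-task overhead adds up quickly:

    // Filter-only Hive query via the shared SQLContext (names illustrative).
    val result = sqlContext.sql(
      """SELECT * FROM events
        |WHERE event_date >= '2016-04-01' AND status = 'OK'""".stripMargin)
    result.count()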

Spark Jenkins test configurations

2016-04-10 Thread cherry_zhang
We are running unit tests on our own Jenkins server but have run into some problems, so could someone give me a detailed list of the Jenkins server configurations? Thanks.

RE: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yu, Yucai
Hi Yash, how about checking the executor (YARN container) logs? Most of the time they show more detail. We are using CDH; the logs are at:

    [yucai@sr483 container_1457699919227_0094_01_14]$ pwd
    /mnt/DP_disk1/yucai/yarn/logs/application_1457699919227_0094/container_1457699919227_0094_01_14
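
If log aggregation is enabled on the cluster, the same logs can also be pulled after the application finishes with the standard YARN CLI (application ID taken from the path above):

    yarn logs -applicationId application_1457699919227_0094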

Re: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yash Sharma
Hi Yucai, thanks for the info. I have explored the container logs but did not get a lot of information from them. I have seen these errors for a few containers, but am not sure of their cause:

1. java.lang.NullPointerException (DiskBlockManager.scala:167)
2. java.lang.ClassCastException: RegisterExec

RE: Spark Sql on large number of files (~500Megs each) fails after couple of hours

2016-04-10 Thread Yu, Yucai
It is possibly not the first failure. Could you increase the setting below and rerun?

    spark.yarn.executor.memoryOverhead 4096

In my experience, Netty sometimes uses a lot of off-heap memory, which can push the executor past the container memory limit and get it killed by YARN's node manager. Thanks, Yucai
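
A sketch of how that setting is typically applied when building the context (it can equally be passed via --conf on spark-submit or set in spark-defaults.conf; the app name is illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Reserve 4 GB of off-heap headroom per executor so Netty buffers do not
    // push the container past YARN's memory limit.
    val conf = new SparkConf()
      .setAppName("large-scan") // hypothetical
      .set("spark.yarn.executor.memoryOverhead", "4096") // in MB
    val sc = new SparkContext(conf)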

Fwd: spark graphx storage RDD memory leak

2016-04-10 Thread zhang juntao
Thanks Ted for replying. Those three lines can't release the cache of the graph parameter; they only release g, i.e.

    graph.mapVertices((vid, vdata) => vprog(vid, vdata, initialMsg)).cache()

In ConnectedComponents.scala the graph parameter is cached as ccGraph and won't be released inside Pregel: def run[VD: ClassTag, ED: ClassTag
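
For reference, a condensed sketch of ConnectedComponents.run as of Spark 1.x (abridged and reproduced from memory of that code, so treat it as a paraphrase): the input graph is mapped into ccGraph, which Pregel then caches and iterates on, while the original graph is never unpersisted.

    def run[VD: ClassTag, ED: ClassTag](graph: Graph[VD, ED]): Graph[VertexId, ED] = {
      // The caller's `graph` stays cached; Pregel only ever unpersists
      // the intermediate graphs it creates from ccGraph.
      val ccGraph = graph.mapVertices { case (vid, _) => vid }
      def sendMessage(edge: EdgeTriplet[VertexId, ED]): Iterator[(VertexId, VertexId)] = {
        if (edge.srcAttr < edge.dstAttr) Iterator((edge.dstId, edge.srcAttr))
        else if (edge.srcAttr > edge.dstAttr) Iterator((edge.srcId, edge.dstAttr))
        else Iterator.empty
      }
      val initialMessage = Long.MaxValue
      Pregel(ccGraph, initialMessage, activeDirection = EdgeDirection.Either)(
        vprog = (id, attr, msg) => math.min(attr, msg),
        sendMsg = sendMessage,
        mergeMsg = (a, b) => math.min(a, b))
    }

A practical workaround at the call site is to unpersist the input graph yourself once the result is materialized:

    val cc = graph.connectedComponents()
    cc.vertices.count() // materialize before dropping the input
    graph.unpersistVertices(blocking = false)
    graph.edges.unpersist(blocking = false)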