Re: problem about broadcast variable in iteration

2014-05-29 Thread randylu
Hi Andrew Ash, thanks for your reply. In fact, I have already tried unpersist(), but it has no effect. One reason I chose version 1.0.0 is precisely that it provides the unpersist() interface.

Re: problem about broadcast variable in iteration

2014-05-25 Thread Andrew Ash
Hi Randy, in Spark 1.0 a lot of work was done to allow unpersisting data that is no longer needed. See the pull request below. Try running kvGlobal.unpersist() on line 11, before the re-broadcast of the next variable, to see if you can cut the dependency there. https://github.com/apache/spar
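A minimal sketch of the unpersist-before-re-broadcast pattern suggested above, assuming a local Spark setup. The names BroadcastLoopSketch, sumsPerIteration and the "offset" key are hypothetical stand-ins for the thread's doSomething and kv; only sc.broadcast, Broadcast.unpersist(), cache() and RDD.unpersist() are Spark calls. Whether this actually frees memory in a given job is not something the sketch demonstrates.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import scala.collection.mutable.ArrayBuffer

object BroadcastLoopSketch {
  // Hypothetical helper: re-broadcasts a small map on every iteration and
  // releases the executor-side copies once the iteration's action has run.
  def sumsPerIteration(sc: SparkContext, n: Int): Seq[Long] = {
    val rdd2 = sc.parallelize(1 to 10)   // stand-in for the thread's rdd2
    val totals = ArrayBuffer.empty[Long]

    for (i <- 0 until n) {
      val kv = Map("offset" -> i)        // stand-in for the thread's kv
      val kvGlobal = sc.broadcast(kv)    // broadcast kv for this iteration

      val rdd1 = rdd2.map { t =>
        (t + kvGlobal.value("offset")).toLong
      }.cache()

      // Run an action so rdd1 is computed while kvGlobal is still in use.
      totals += rdd1.reduce(_ + _)

      // Drop cached blocks and the broadcast's executor copies before the
      // next iteration creates a new broadcast.
      rdd1.unpersist()
      kvGlobal.unpersist()
    }
    totals.toSeq
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("broadcast-loop-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(sumsPerIteration(sc, 3)) finally sc.stop()
  }
}
```

Calling unpersist() only removes the copies held on executors; the driver keeps the value, so a broadcast that is used again later is simply re-sent.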

Re: problem about broadcast variable in iteration

2014-05-15 Thread randylu
But when I move the broadcast variable out of the for loop, it works well (setting aside the memory issue you pointed out):

 1  var rdd1 = ...
 2  var rdd2 = ...
 3  var kv = ...
 4  var kvGlobal = sc.broadcast(kv) // broadcast kv
 5  for (i <- 0 until n) {
 6    rdd1 = rdd2.ma

Re: problem about broadcast variable in iteration

2014-05-15 Thread randylu
rdd1 is cached, but it has no effect:

 1  var rdd1 = ...
 2  var rdd2 = ...
 3  var kv = ...
 4  for (i <- 0 until n) {
 5    var kvGlobal = sc.broadcast(kv) // broadcast kv
 6    rdd1 = rdd2.map {
 7      case t => doSomething(t, kvGlobal.value)
 8    }.cache()
 9    var tmp

Re: problem about broadcast variable in iteration

2014-05-15 Thread Earthson
Is the RDD not cached? Because recomputation may be required, every broadcast object is included in the dependencies of the RDDs. This may also cause a memory issue (in your case, when n and kv are too large).

Re: problem about broadcast variable in iteration

2014-05-10 Thread randylu
I am running Spark 1.0.0, the newest under-development version.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/problem-about-broadcast-variable-in-iteration-tp5479p5480.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.