Here, the Main object is not meant to be serialized. @transient ought
to be for fields inside an object that legitimately is supposed to be
serialized, but whose value can be recreated on deserialization. I feel
like marking objects that aren't logically Serializable as such is a
hack, and
As Mohit said, making Main extend Serializable should fix this example. In
general, it's also not a bad idea to mark the fields you don't want to
serialize (e.g., sc and conf in this case) as @transient, though that is not
the issue here.
Note that this problem would not have arisen in
Perhaps the closure ends up including the "main" object, which is not
defined as serializable... try making it a "case object" or "object Main
extends Serializable".
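A minimal sketch of how the two suggestions combine: the enclosing object
extends Serializable and the driver-only handles are marked @transient (the
loop body is borrowed from the failing example quoted later in this thread;
all names are illustrative):

import org.apache.spark.{SparkConf, SparkContext}

object Main extends Serializable {
  // Driver-only handles; @transient keeps them out of any serialized closure
  @transient lazy val conf = new SparkConf().setAppName("name")
  @transient lazy val sc = new SparkContext(conf)

  def main(args: Array[String]) {
    val accum = sc.accumulator(0)
    for (i <- 1 to 10) {
      sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
    }
    println(accum.value) // 100: the values 1..4 are added once per iteration
  }
}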
On Sat, Nov 22, 2014 at 4:16 PM, lordjoe wrote:
> I posted several examples in java at http://lordjoesoftware.blogspot.com/
>
> Gene
I posted several examples in java at http://lordjoesoftware.blogspot.com/
Generally code like this works and I show how to accumulate more complex
values.
// Make two accumulators using Statistics
final Accumulator totalLetters = ctx.accumulator(0L, "ttl");
JavaRDD lines = ..
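For comparison, a small Scala sketch of the same named-accumulator idea
(runnable in spark-shell, where sc already exists; the input data is made up):

// A named accumulator shows up under its name in the web UI
val totalLetters = sc.accumulator(0L, "ttl")
val lines = sc.parallelize(Seq("foo", "bar", "bazz"))
lines.foreach(line => totalLetters += line.length.toLong)
println(totalLetters.value) // 10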
Hi Sowen,
You're right, that example works, but here is an example that does not work
for me:
object Main {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("name")
    val sc = new SparkContext(conf)
    val accum = sc.accumulator(0)
    for (i <- 1 to 10) {
      va
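The message is cut off here, but judging from the replies further down the
thread (the while-loop rewrite and the "accumulator inside the for loop"
workaround), the missing loop body was presumably along these lines:

for (i <- 1 to 10) {
  // per the replies below, on a cluster this is where the
  // serialization error on Main shows up
  sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
}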
That seems to work fine. Add to your example
def foo(i: Int, a: Accumulator[Int]) = a += i
and add an action at the end to get the expression to evaluate:
sc.parallelize(Array(1, 2, 3, 4)).map(x => foo(x, accum)).foreach(println)
and it works, and you have accum with value 10 at the end.
The si
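Putting those pieces together, a self-contained version of the working
snippet (assuming a spark-shell session where sc is already defined):

import org.apache.spark.Accumulator

val accum = sc.accumulator(0)

def foo(i: Int, a: Accumulator[Int]) = a += i

// map is lazy; the trailing foreach(println) is the action that forces
// evaluation, which is when the accumulator updates actually happen
sc.parallelize(Array(1, 2, 3, 4)).map(x => foo(x, accum)).foreach(println)

println(accum.value) // 10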
One month later, the same problem. I think that someone (e.g. the inventors of
Spark) should show us a complete example of how to use accumulators. To start,
we need to see an example of the following form:
val accum = sc.accumulator(0)
sc.parallelize(Array(1, 2, 3, 4)).map(x => foo(x, accum))
I'm running into similar problems with accumulators failing to serialize
properly. Are there any examples of accumulators being used in more
complex environments than simply initializing them in the same class and
then using them in a .foreach() on an RDD referenced a few lines below?
From the a
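For what it is worth, here is a hypothetical sketch of that more complex
shape: the accumulator is created in the driver but updated through a helper
class defined elsewhere (all names here are made up):

import org.apache.spark.{Accumulator, SparkConf, SparkContext}
import org.apache.spark.SparkContext._

// Defined away from the driver code; it only holds the accumulator reference,
// so serializing it into the closure is cheap
class ErrorCounter(errors: Accumulator[Int]) extends Serializable {
  def check(line: String): String = {
    if (line.contains("ERROR")) errors += 1
    line
  }
}

object CounterApp {
  def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf().setAppName("counter-sketch"))
    val errors = sc.accumulator(0, "errors")
    val counter = new ErrorCounter(errors)
    sc.parallelize(Seq("ok", "ERROR boom", "ok")).map(counter.check).count()
    println(errors.value) // 1
    sc.stop()
  }
}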
This may be due in part to Scala allocating an anonymous inner class in
order to execute the for loop. I would expect that if you change it to a
while loop like
var i = 0
while (i < 10) {
  sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
  i += 1
}
then the problem may go away. I am not sup
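For reference, a Scala for loop without a yield compiles down to a foreach
call taking a function literal, so the loop body from the original example
ends up inside its own closure; roughly:

// for (i <- 1 to 10) { ... } becomes:
(1 to 10).foreach { i =>
  sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
}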
Could you provide the complete code that reproduces the bug? Here is my test
code:
import org.apache.spark._
import org.apache.spark.SparkContext._
object SimpleApp {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("SimpleApp")
    val sc = new SparkContext(conf
I am also using Spark 1.1.0 and I ran it on a cluster of nodes (it works if I
run it in local mode!). If I put the accumulator inside the for loop,
everything works fine. I guess the bug is that an accumulator can be applied
to just one RDD.
Still another undocumented 'feature' of Spark tha
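A sketch of the workaround being described, with a fresh accumulator created
inside the loop so that each RDD action gets its own (loop body taken from
earlier in the thread):

for (i <- 1 to 10) {
  val accum = sc.accumulator(0)
  sc.parallelize(Array(1, 2, 3, 4)).foreach(x => accum += x)
  println(accum.value) // 10 on each iteration
}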
Sorry, I forgot to say that this gives the above error just when run on a
cluster, not in local mode.
Works fine for me on Spark 1.1.0 in the REPL.
On Sat, Oct 25, 2014 at 1:41 PM, octavian.ganea
wrote:
> There is for sure a bug in the Accumulators code.
>
> More specifically, the following code works as expected:
>
> def main(args: Array[String]) {
> val conf = new SparkConf().setAppName("EL LBP SP