This still seems to be broken. In Spark 1.1.1, it errors immediately on this line
(from the above repro script):
liveTweets.map(t => noop(t)).print()
The stack trace is:
org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureClean
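For reference, the relevant part of the repro just maps a do-nothing function over the Twitter stream; it looks roughly like this (reconstructed from memory, so the exact StreamingContext setup may differ from the original script):

import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.twitter.TwitterUtils

// sc is the SparkContext that spark-shell provides
val ssc = new StreamingContext(sc, Seconds(5))
// liveTweets is the text of incoming tweets from the Twitter receiver
val liveTweets = TwitterUtils.createStream(ssc, None).map(_.getText)
// noop just returns its argument, yet the closure cleaner still rejects the task
def noop(a: Any): Any = a
liveTweets.map(t => noop(t)).print()
ssc.start()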
Yep, here goes!
Here are my environment vitals:
- Spark 1.0.0
- EC2 cluster with 1 slave spun up using spark-ec2
- twitter4j 3.0.3
- spark-shell called with the --jars argument to load
spark-streaming-twitter_2.10-1.0.0.jar as well as all the twitter4j
jars.
Now, while I’m in the S
I am very curious though. Can you post a concise code example which we can
run to reproduce this problem?
TD
On Tue, Jul 15, 2014 at 2:54 PM, Tathagata Das wrote:
I am not entirely sure off the top of my head. But a possible workaround (which usually
works) is to define the function as a val instead of a def. For example,
def func(i: Int): Boolean = { true }
can be written as
val func = (i: Int) => { true }
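In a streaming job the same change looks something like this (just a sketch; the socket stream and the predicate here are made up):

import org.apache.spark.streaming.{Seconds, StreamingContext}

// sc is the SparkContext from spark-shell
val ssc = new StreamingContext(sc, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999)

// A def is a method on the enclosing object, so calling it inside the closure
// captures that whole object. A val holds a standalone function value, so only
// the function itself is serialized and shipped to the executors.
val isError = (s: String) => s.contains("ERROR")

lines.filter(isError).print()
ssc.start()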
Hope this helps for now.
TD
On Tue, Jul 15, 2014 at 9
Hey Diana,
Did you ever figure this out?
I’m running into the same exception, except in my case the function I’m
calling is a KMeans model’s predict().
In regular Spark it works, and Spark Streaming without the call to
model.predict() also works, but when I put the two together I get this
serialization exception.
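Roughly what the combination looks like on my side (simplified; trainingData and logStream are placeholders for my actual RDD[Vector] and DStream[String]):

import org.apache.spark.mllib.clustering.KMeans
import org.apache.spark.mllib.linalg.Vectors

// Train offline on a static RDD; this part works fine on its own
val model = KMeans.train(trainingData, 10, 20)

// The streaming pipeline also works until the predict() call is added
logStream
  .map(line => Vectors.dense(line.split(",").map(_.toDouble)))
  .map(point => model.predict(point))   // this line triggers Task not serializable
  .print()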
Hey all, I'm trying to set up a pretty simple streaming app and getting some
weird behavior.
First, a non-streaming job that works fine: I'm trying to pull out lines
of a log file that match a regex, for which I've set up a function:
def getRequestDoc(s: String): String = { "KBDOC-[0-9]*".r.find