Hello,
I have a few Spark jobs that all perform the same aggregations, and I want to
factor the aggregation logic out into a trait.
When I run a job extending my trait (over YARN, in client mode), I get
a NotSerializableException (stack trace in attachment).
If I change my Trait to an Object
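For what it's worth, a minimal sketch of the shape that usually works (the
names are illustrative, not from this thread): make the shared trait extend
Serializable and keep non-serializable members out of it.

    import org.apache.spark.rdd.RDD

    // Shared aggregation logic, safe to mix into jobs: it extends
    // Serializable and holds no non-serializable state of its own.
    trait Aggregations extends Serializable {
      def sumByKey(rdd: RDD[(String, Long)]): RDD[(String, Long)] =
        rdd.reduceByKey(_ + _)
    }

    // Each job mixes the trait in; the closure inside sumByKey captures
    // nothing beyond the serializable trait instance.
    object DailyJob extends Aggregations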
Hi All!
I'm using Spark 1.6.1 and I'm trying to transform my DStream as follows:

myStream.transform { rdd =>
  val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
  import sqlContext.implicits._
  val j = rdd.toDS()
  j.map {
    case a => Some(...)
    case _ =>
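For comparison, a self-contained sketch of that pattern against the 1.6.x
API (the Event class and the filter are hypothetical stand-ins): defining
the transformation on a standalone object keeps any non-serializable
enclosing class out of the closure.

    import org.apache.spark.sql.SQLContext
    import org.apache.spark.streaming.dstream.DStream

    case class Event(id: Long, payload: String) // hypothetical record type

    object Transformations {
      def keepNonEmpty(stream: DStream[Event]): DStream[Event] =
        stream.transform { rdd =>
          // getOrCreate returns a per-JVM singleton SQLContext, so it is
          // cheap to call on every batch and is never serialized itself.
          val sqlContext = SQLContext.getOrCreate(rdd.sparkContext)
          import sqlContext.implicits._
          rdd.toDS().filter(_.payload.nonEmpty).rdd
        }
    }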
The class declaration is already marked Serializable ("with Serializable")
You can declare your class serializable, since Spark will want to serialize
the whole class.
at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40)
at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84)
at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301)
... 20 more
ogger.error("Failed to send
>>> pageview", e)
>>> }
>>> }
>>> }
>>>
>>> argonaut relys on a user implementation of a trait called
>>> `EncodeJson[T]`,
>>> which tells argonaut how to serialize and de
gt;> pageview", e)
>> }
>> }
>> }
>>
>> argonaut relys on a user implementation of a trait called `EncodeJson[T]`,
>> which tells argonaut how to serialize and deserialize the object.
>>
>> The problem is, that the trait `Encod
how to serialize and deserialize the object.
>
> The problem is, that the trait `EncodeJson[T]` is not serializable, thus
> throwing a NotSerializableException:
>
> Caused by: java.io.NotSerializableException: argonaut.EncodeJson$$anon$2
> Serialization stack:
> - ob
eJson[T]` is not serializable, thus
throwing a NotSerializableException:
Caused by: java.io.NotSerializableException: argonaut.EncodeJson$$anon$2
Serialization stack:
- object not serializable (class: argonaut.EncodeJson$$anon$2,
value: argonaut.EncodeJson$$anon$2@6415f61e)
This is obv
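A workaround that often applies here (a sketch only; `Encoder` below is a
hypothetical stand-in for argonaut's EncodeJson[T], and nothing in the
sketch depends on argonaut): ship a serializable factory function and
build the non-serializable instance on the executors.

    import org.apache.spark.rdd.RDD

    // Hypothetical stand-in for a non-serializable typeclass instance.
    trait Encoder[T] { def encode(t: T): String }

    object JsonOutput {
      // mkEncoder is a plain function value, which is serializable; the
      // encoder itself is created here, on the executor, per partition,
      // so it never has to cross the wire.
      def toJson[T](rdd: RDD[T])(mkEncoder: () => Encoder[T]): RDD[String] =
        rdd.mapPartitions { rows =>
          val enc = mkEncoder()
          rows.map(enc.encode)
        }
    }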
I also hit this bug. Have you resolved the issue, or could you give some
suggestions?

2014-07-28 18:33 GMT+08:00 Aniket Bhatnagar :
> I am trying to serialize objects contained in RDDs using runtime
> reflection via TypeTag. However, the Spark job keeps
> failing with java.io.NotSerializableException
I am trying to load a PMML file in a Spark job, instantiate it only once,
and pass it to the executors. But I get a NotSerializableException for
org.xml.sax.helpers.LocatorImpl, which is used inside jpmml.

I have this class Prediction.java:

public class Prediction implements Serializable {
    private RegressionModelEvaluator rme;

    public Pred
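One pattern that tends to work for this (a sketch in Scala; `Evaluator` and
the builder function are hypothetical stand-ins for the jpmml types, which
vary by version): serialize only the raw PMML bytes and rebuild the
evaluator lazily on each executor.

    // Keep only serializable state (the raw PMML bytes); the builder
    // must itself be serializable, which a plain function literal is.
    trait Evaluator { def score(row: Map[String, Double]): Double }

    class Prediction(pmmlBytes: Array[Byte],
                     buildEvaluator: Array[Byte] => Evaluator)
        extends Serializable {

      // @transient: excluded from serialization; lazy: reconstructed on
      // first use after deserialization, once per JVM.
      @transient private lazy val evaluator: Evaluator =
        buildEvaluator(pmmlBytes)

      def predict(row: Map[String, Double]): Double = evaluator.score(row)
    }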
Ah, cool. Thanks.

On Wed, Jul 15, 2015 at 5:58 PM, Tathagata Das wrote:
> Spark 1.4.1 just got released! So just download that. Yay for timing.

On Wed, Jul 15, 2015 at 2:47 PM, Ted Yu wrote:
> Should be this one:
>
> [SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of
> SerializationDebugger bugs and limitations
> ...
> Closes #6625 from tdas/SPARK-7180 and squashes the following commits:

On Wed, Jul 15, 2015 at 2:37 PM, Chen Song wrote:
> Thanks
>
> Can you point me to the patch to fix the serialization stack? Maybe I can
> pull it in and rerun my job.
>
> Chen
On Wed, Jul 15, 2015 at 4:40 PM, Tathagata Das wrote:
> Your streaming job may have been seemingly running OK, but the DStream
> checkpointing must have been failing in the background. It would have been
> visible in the log4j logs. In 1.4.0, we enabled fast failure for that, so
> checkpointing failures don't get hidden in the background.
>
> The fact that th
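For reference, a sketch of a checkpoint-friendly setup (the directory,
batch interval, and placeholder source below are assumptions, not from
this thread). Everything reachable from the DStream graph must be
serializable, because the graph itself is written to the checkpoint
directory.

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    object CheckpointedApp {
      val checkpointDir = "hdfs:///tmp/checkpoints" // placeholder path

      def createContext(): StreamingContext = {
        val conf = new SparkConf().setAppName("checkpointed-app")
        val ssc = new StreamingContext(conf, Seconds(10))
        ssc.checkpoint(checkpointDir)
        // Placeholder source and output; build the real graph here.
        ssc.socketTextStream("localhost", 9999).print()
        ssc
      }

      def main(args: Array[String]): Unit = {
        // Recovers from the checkpoint if one exists, else creates anew.
        val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
        ssc.start()
        ssc.awaitTermination()
      }
    }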
Can you show us your function(s)?

Thanks

On Wed, Jul 15, 2015 at 12:46 PM, Chen Song wrote:
> The streaming job has been running OK in 1.2 and 1.3. After I upgraded to
> 1.4, I started seeing the error below. It appears that it fails in the
> validate method in StreamingContext. Is there anything changed in 1.4.0
> w.r.t. DStream checkpointing?
>
> Detailed error from driver:
>
> 15/07/15 18:00:39 ERROR yarn
-Original Message-
From: dgoldenberg [mailto:dgoldenberg...@gmail.com]
Sent: Wednesday, February 18, 2015 1:54 PM
To: user@spark.apache.org
Subject: NotSerializableException:
org.apache.http.impl.client.DefaultHttpClient when trying to send documents
to Solr

I'm using SolrJ in a Spark program. When I try to send the docs to Solr, I
get the NotSerializableException on the DefaultHttpClient. Is there a
possible fix or workaround?

I'm using Spark 1.2.1 with Hadoop 2.4; SolrJ is version 4.0.0.

final HttpSolrServer solrServer = new Http
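A common workaround for non-serializable clients like this (a sketch; the
SolrJ calls are illustrative of the 4.x API, and the document fields are
made up): construct the client inside foreachPartition, so it is built on
each executor rather than shipped from the driver.

    import org.apache.solr.client.solrj.impl.HttpSolrServer
    import org.apache.solr.common.SolrInputDocument
    import org.apache.spark.rdd.RDD

    // The HttpSolrServer never crosses the wire; each partition builds
    // its own instance on the executor that processes it.
    def indexDocs(rdd: RDD[(String, String)], solrUrl: String): Unit =
      rdd.foreachPartition { docs =>
        val server = new HttpSolrServer(solrUrl) // built on the executor
        docs.foreach { case (id, text) =>
          val doc = new SolrInputDocument()
          doc.addField("id", id)
          doc.addField("text", text)
          server.add(doc)
        }
        server.commit()
        server.shutdown()
      }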
This still seems to be broken. In 1.1.1, it errors immediately on this line
(from the above repro script):

liveTweets.map(t => noop(t)).print()

The stack trace is:

org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureClean
[Edge] = {
  ...
}

And it takes an implicit argument of type GraphQueryExecutor. However, if
the function is defined at top level in the shell, and I declare an
implicit val for the SparkContext like this:

implicit val sparkContext = sc

then calling txnSentTo in map or mapPartitions will cause a
NotSerializableExce
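That is the expected failure mode: the implicit SparkContext is resolved at
the call site inside the closure, and SparkContext is not serializable. A
sketch of the usual fix (names hypothetical): resolve whatever the task
needs into plain serializable values before the closure.

    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    object TxnExample {
      // Takes only serializable inputs; nothing implicit resolves to
      // the SparkContext, so tasks serialize cleanly.
      def txnSentTo(v: Long, lookup: Map[Long, Seq[Long]]): Seq[Long] =
        lookup.getOrElse(v, Seq.empty)

      def run(sc: SparkContext): RDD[Seq[Long]] = {
        val lookup = Map(1L -> Seq(2L, 3L)) // driver-side, serializable
        sc.parallelize(Seq(1L, 2L, 3L)).map(v => txnSentTo(v, lookup))
      }
    }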
Erm, you are trying to do all the work in the create() method. This is
definitely not what you want to do: it is only supposed to build the
JavaStreamingContext. A further problem is that you're using
anonymous inner classes, which are non-static and contain a reference
to the outer class. The
Hi Sean,

Below is my Java code, using Spark 1.1.0. I'm still getting the same error.
The Bean class here is serializable; I'm not sure where exactly the problem
is. What am I doing wrong?

public class StreamingJson {
    public static void main(String[] args) throws Exception {
        final String HDFS_FILE_LOC
No, not the same thing then. This just means you accidentally have a
reference to the unserializable enclosing test class in your code.
Just make sure the reference is severed.
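In code terms, "severing the reference" usually looks like this sketch
(names hypothetical): copy the fields you need into local variables so the
closure captures those instead of `this`.

    import org.apache.spark.rdd.RDD

    // `MyTest` stands in for the unserializable enclosing test class.
    class MyTest {
      val prefix: String = "req-"

      def tag(rdd: RDD[String]): RDD[String] = {
        val p = prefix // local copy severs the reference to `this`
        rdd.map(line => p + line)
      }
    }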
On Thu, Nov 6, 2014 at 8:00 AM, Vasu C wrote:
> Thanks for pointing to the issue.
>
> Yes, I think it's the same issue. Below is the exception:
>
> ERROR OneForOneStrategy: TestCheckpointStreamingJson$1
> java.io.NotSerializableException: TestCheckpointStreamingJson
>     at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183)
>     at java.io.ObjectOutp
You didn't say what isn't serializable or where the exception occurs,
but is it the same as this issue?
https://issues.apache.org/jira/browse/SPARK-4196

On Thu, Nov 6, 2014 at 5:42 AM, Vasu C wrote:
> Dear All,
>
> I am getting java.io.NotSerializableException for the code below. If
> jssc.checkpoi
Dear All,

I am getting java.io.NotSerializableException for the code below. If
jssc.checkpoint(HDFS_CHECKPOINT_DIR) is removed, there is no exception.
Please help.

JavaStreamingContextFactory contextFactory = new
JavaStreamingContextFactory() {
    @Override
    public JavaStreamingContext create() {
        SparkC
Hello All,

I am trying to query a Hive table using Spark SQL from my Java code, but
I'm getting the following error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage
failure: Task not serializable: java.io.NotSerializableException:
org.apache.spark.sql.hive.api.java.JavaHiveCo
t;, "x7", "x8", "x9"))
Here the data field is org.apache.spark.rdd.RDD[LogLine] = MappedRDD[7] at
map at :45
How can I convert this to Serializable, or is this a different problem?
Please advise.
Regards,
lmk
Thanks for the help.

On Aug 21, 2014, at 10:56, Yin Huai wrote:
> Seems making functionRegistry transient can fix the error.
PR is https://github.com/apache/spark/pull/2074.

--
From: Yin Huai
Sent: 8/20/2014 10:56 PM
To: Vida Ha
Cc: tianyi; Fengyun RAO; user@spark.apache.org
Subject: Re: Got NotSerializableException when access broadcast variable
If you want to filter the table name, you can use:

hc.sql("show tables").filter(row => !"test".equals(row.getString(0)))

Seems making functionRegistry transient can fix the error.
On Wed, Aug 20, 2014 at 8:53 PM, Vida Ha wrote:

Hi,

I doubt the broadcast variable is your problem, since you are seeing:

org.apache.spark.SparkException: Task not serializable
Caused by: java.io.NotSerializableException:
org.apache.spark.sql.hive.HiveContext$$anon$3

We have a knowledgebase article that explains why this happens - it's a
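The usual shape of the fix (a sketch, not the knowledgebase article's text,
and assuming the pre-1.3 API where hc.sql(...) returns a SchemaRDD, i.e. an
RDD[Row]): use the HiveContext only on the driver, and make sure closures
shipped to executors touch only their own arguments.

    import org.apache.spark.SparkContext
    import org.apache.spark.sql.hive.HiveContext

    // The closure passed to filter reads only its Row argument, so
    // `hc` itself is never pulled into a task.
    def tablesExceptTest(sc: SparkContext): Array[String] = {
      val hc = new HiveContext(sc)
      hc.sql("show tables")
        .filter(row => !"test".equals(row.getString(0)))
        .collect()
        .map(_.getString(0))
    }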
Thanks for the help.

I ran the script again with:

bin/spark-shell --conf spark.serializer=org.apache.spark.serializer.KryoSerializer

In the console, I can see:

scala> sc.getConf.getAll.foreach(println)
(spark.tachyonStore.folderName,spark-eaabe986-03cb-41bd-bde5-993c7db3f048)
(spark.driver.host,1
Hi everyone!

I got an exception when I ran my script with spark-shell. I added

SPARK_JAVA_OPTS="-Dsun.io.serialization.extendedDebugInfo=true"

in spark-env.sh to show the following stack:

org.apache.spark.SparkException: Task not serializable
    at org.apache.spark.util.ClosureCleaner$.
Hi,

I took Avro 1.7.7 and recompiled my distribution to fix the issue with Avro
GenericRecord (I'm referring to AVRO-1476). That issue was resolved. I also
enabled Kryo registration in SparkConf.

That said, I am still seeing a NotSerializableExceptio
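For reference, a sketch of what enabling Kryo with registration looks like
(the Event class is a placeholder):

    import org.apache.spark.SparkConf

    case class Event(id: Long, payload: String) // placeholder class

    val conf = new SparkConf()
      .setAppName("kryo-example")
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[Event]))

Note that closures are serialized with Java serialization regardless of
spark.serializer, which is one reason a NotSerializableException can
survive the switch to Kryo: the offending object may be captured by a
closure rather than stored in the data itself.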
I am trying to serialize objects contained in RDDs using runtime reflection
via TypeTag. However, the Spark job keeps
failing with java.io.NotSerializableException on an instance of TypeCreator
(auto-generated by the compiler to enable TypeTags). Is there any workaround
for this without switching to Scala 2
Yep, here goes!

Here are my environment vitals:

- Spark 1.0.0
- EC2 cluster with 1 slave spun up using spark-ec2
- twitter4j 3.0.3
- spark-shell called with the --jars argument to load
  spark-streaming-twitter_2.10-1.0.0.jar as well as all the twitter4j jars

Now, while I’m in the S
I am very curious though. Can you post a concise code example that we can
run to reproduce this problem?

TD

On Tue, Jul 15, 2014 at 2:54 PM, Tathagata Das wrote:
> I am not entirely sure off the top of my head. But a possible (usually
> works) workaround is to define the function as a val inste
I am not entirely sure off the top of my head. But a possible (and usually
working) workaround is to define the function as a val instead of a def.
For example,

def func(i: Int): Boolean = { true }

can be written as

val func = (i: Int) => { true }

Hope this helps for now.

TD

On Tue, Jul 15, 2014 at 9
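The reason this workaround often helps, sketched out (class and names are
hypothetical): a def is a method on the enclosing object, so using it in a
closure becomes `i => this.funcDef(i)` and drags `this` into the task; a
val is read on the driver and only the small function object is shipped.

    import org.apache.spark.SparkContext

    class Pipeline(sc: SparkContext) { // not Serializable

      // Method: referencing it in a closure captures `this`, i.e. the
      // whole Pipeline, including the SparkContext.
      def funcDef(i: Int): Boolean = i > 0

      // Function value: a standalone object with no outer reference.
      val funcVal: Int => Boolean = i => i > 0

      def countPositive(): Long =
        sc.parallelize(1 to 100).filter(funcVal).count()
    }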
Hey Diana,

Did you ever figure this out?

I'm running into the same exception, except in my case the function I'm
calling is a KMeans model.predict().

In regular Spark it works, and Spark Streaming without the call to
model.predict() also works, but when put together I get this serialization
exception
Hi,

I am writing a standalone Spark program that gets its data from Cassandra.
I followed the examples and created the RDD via newAPIHadoopRDD() and
the ColumnFamilyInputFormat class.

The RDD is created, but I get a NotSerializableException when I call the
RDD's .groupByKey() method:
How is your RDD created? It might mean that something used in the process of
creating it was not serializable.

Matei

On Jun 3, 2014, at 10:11 PM, bluejoe2008 wrote:
> when I called KMeans.train(), an error happened:
>
> 14/06/04 13:02:29 INFO scheduler.DAGScheduler: Submitting Stage 3
> (Ma
When I called KMeans.train(), an error happened:

14/06/04 13:02:29 INFO scheduler.DAGScheduler: Submitting Stage 3
(MappedRDD[12] at map at KMeans.scala:123), which has no missing parents
14/06/04 13:02:29 INFO scheduler.DAGScheduler: Failed to run takeSample at
KMeans.scala:260
Exception in thr
Hey all, I'm trying to set up a pretty simple streaming app and getting
some weird behavior.

First, a non-streaming job that works fine: I'm trying to pull out lines
of a log file that match a regex, for which I've set up a function:

def getRequestDoc(s: String):
String = { "KBDOC-[0-9]*".r.find