NotSerializableException with Trait

2018-02-23 Thread Jean Rossier
Hello, I have a few Spark jobs that do the same aggregations, and I want to factor the aggregation logic out into a Trait. When I run a job extending my Trait (over YARN, in client mode), I get a NotSerializableException (attached). If I change my Trait to an Object
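A minimal, Spark-free sketch of why this can happen (all class names below are mine, not from the thread): Java serialization accepts an object only if its class is Serializable, directly or via an inherited interface. A Scala trait compiles down to an interface, so a trait that does not extend Serializable leaves its implementors unserializable, while a trait that does extend it fixes every job that mixes it in:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TraitDemo {
    // Shared aggregation logic as a plain mixin: NOT Serializable.
    interface AggLogic {
        default int agg(int x) { return 2 * x; }
    }

    // Same logic, but the mixin extends Serializable, so every
    // implementing class inherits serializability.
    interface SerializableAggLogic extends Serializable {
        default int agg(int x) { return 2 * x; }
    }

    static class JobA implements AggLogic {}
    static class JobB implements SerializableAggLogic {}

    // Returns true iff Java serialization accepts the object.
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {  // NotSerializableException is an IOException
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new JobA())); // false
        System.out.println(canSerialize(new JobB())); // true
    }
}
```

Switching the Trait to an Object also works because an object's methods behave like statics: closures that call them capture nothing that needs shipping.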

NotSerializableException in DStream.transform

2016-10-04 Thread Andrew A
Hi All! I'm using Spark 1.6.1 and I'm trying to transform my DStream as follows: myStream.transform { rdd => val sqlContext = SQLContext.getOrCreate(rdd.sparkContext) import sqlContext.implicits._ val j = rdd.toDS() j.map { case a => Some(...) case _ =

Re: Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-08 Thread jamborta

Re: Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-08 Thread mpawashe
The class declaration is already marked Serializable ("with Serializable") -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-NotSerializableException-Methods-Closures-tp26672p26718.html Sent from the Apache Spark User List mailing li

Re: Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-06 Thread jamborta
You can declare your class Serializable, as Spark will want to serialize the whole class. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-NotSerializableException-Methods-Closures-tp26672p26689.html
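An illustration of this suggestion (the class and helper names are hypothetical): once the enclosing class implements Serializable, a closure that captures `this` can be serialized.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.IntUnaryOperator;

public class MarkSerializableDemo {
    // A functional interface whose lambdas the compiler makes serializable.
    interface SerializableIntOp extends IntUnaryOperator, Serializable {}

    // The fix from the thread: declare the whole class Serializable.
    static class MyJob implements Serializable {
        int offset = 1;

        SerializableIntOp op() {
            return x -> x + offset; // captures `this`, but `this` now serializes
        }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(canSerialize(new MyJob().op())); // true
    }
}
```

The caveat, as mpawashe found below, is that every field of the class must then be serializable too; otherwise the exception just moves to the first field that isn't, and the whole object gets shipped with each task.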

Re: Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-05 Thread Mayur Pawashe
> >> at >> org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:40) >> at >> org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47) >> at >> org.apache.spar

Re: Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-04 Thread Ted Yu
cala:47) > at > > org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:84) > at > > org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301) > ... 20 more > > > > -- > View this message

Spark Streaming - NotSerializableException: Methods & Closures:

2016-04-04 Thread mpawashe
he-spark-user-list.1001560.n3.nabble.com/Spark-Streaming-NotSerializableException-Methods-Closures-tp26672.html

Re: Using a non serializable third party JSON serializable on a spark worker node throws NotSerializableException

2016-03-01 Thread Shixiong(Ryan) Zhu
logger.error("Failed to send pageview", e) } } } argonaut relies on a user implementation of a trait called `EncodeJson[T]`, which tells argonaut how to serialize and de

Re: Using a non serializable third party JSON serializable on a spark worker node throws NotSerializableException

2016-03-01 Thread Yuval Itzchakov
pageview", e) } } } argonaut relies on a user implementation of a trait called `EncodeJson[T]`, which tells argonaut how to serialize and deserialize the object. The problem is that the trait `Encod

Re: Using a non serializable third party JSON serializable on a spark worker node throws NotSerializableException

2016-03-01 Thread Shixiong(Ryan) Zhu
how to serialize and deserialize the object. The problem is that the trait `EncodeJson[T]` is not serializable, thus throwing a NotSerializableException: Caused by: java.io.NotSerializableException: argonaut.EncodeJson$$anon$2 Serialization stack: - ob

Using a non serializable third party JSON serializable on a spark worker node throws NotSerializableException

2016-03-01 Thread Yuval.Itzchakov
eJson[T]` is not serializable, thus throwing a NotSerializableException: Caused by: java.io.NotSerializableException: argonaut.EncodeJson$$anon$2 Serialization stack: - object not serializable (class: argonaut.EncodeJson$$anon$2, value: argonaut.EncodeJson$$anon$2@6415f61e) This is obv
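A common workaround for a non-serializable third-party type like `EncodeJson[T]` is to keep it out of the serialized object graph entirely and rebuild it on the worker after deserialization. The sketch below uses a stand-in `Encoder` class (argonaut itself is not involved); in Scala the same idea is usually written as `@transient lazy val`:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class TransientDemo {
    // Stand-in for a non-serializable third-party codec such as argonaut's EncodeJson.
    static class Encoder {
        String encode(String s) { return s.toUpperCase(); }
    }

    static class Wrapper implements Serializable {
        private transient Encoder enc; // skipped by serialization

        Encoder encoder() {            // rebuilt lazily on first use after deserialization
            if (enc == null) enc = new Encoder();
            return enc;
        }
    }

    // Serialize and deserialize, the way a task closure is shipped to an executor.
    static Wrapper roundTrip(Wrapper w) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(w);
        oos.flush();
        ObjectInputStream ois = new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (Wrapper) ois.readObject();
    }

    public static void main(String[] args) throws Exception {
        Wrapper shipped = roundTrip(new Wrapper());          // works: enc is transient
        System.out.println(shipped.encoder().encode("ok"));  // OK
    }
}
```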

Re: NotSerializableException exception while using TypeTag in Scala 2.10

2016-01-01 Thread Yanbo Liang
I also hit this bug. Have you resolved this issue, or could you give some suggestions? 2014-07-28 18:33 GMT+08:00 Aniket Bhatnagar: > I am trying to serialize objects contained in RDDs using runtime reflection via TypeTag. However, the Spark job keeps failing with java.io.NotSerializableException

Re: RegressionModelEvaluator (from jpmml) NotSerializableException when instantiated in the driver

2015-12-09 Thread Utkarsh Sengar
t to the executors. But I get a NotSerializableException for org.xml.sax.helpers.LocatorImpl which is used inside jpmml. I have this class Prediction.java: public class Prediction implements Serializable { private RegressionModelEvaluator rme; public Pred

RegressionModelEvaluator (from jpmml) NotSerializableException when instantiated in the driver

2015-12-09 Thread Utkarsh Sengar
I am trying to load a PMML file in a Spark job, instantiate it only once, and pass it to the executors. But I get a NotSerializableException for org.xml.sax.helpers.LocatorImpl, which is used inside jpmml. I have this class Prediction.java: public class Prediction implements Serializable

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Chen Song
Ah, cool. Thanks. On Wed, Jul 15, 2015 at 5:58 PM, Tathagata Das wrote: > Spark 1.4.1 just got released! So just download that. Yay for timing. > > On Wed, Jul 15, 2015 at 2:47 PM, Ted Yu wrote: > >> Should be this one: >> [SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of >> Serializat

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Tathagata Das
Spark 1.4.1 just got released! So just download that. Yay for timing. On Wed, Jul 15, 2015 at 2:47 PM, Ted Yu wrote: > Should be this one: > [SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of > SerializationDebugger bugs and limitations > ... > Closes #6625 from tdas/SPARK-7180 and s

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Ted Yu
Should be this one: [SPARK-7180] [SPARK-8090] [SPARK-8091] Fix a number of SerializationDebugger bugs and limitations ... Closes #6625 from tdas/SPARK-7180 and squashes the following commits: On Wed, Jul 15, 2015 at 2:37 PM, Chen Song wrote: > Thanks > > Can you point me to the patch to

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Chen Song
Thanks Can you point me to the patch to fix the serialization stack? Maybe I can pull it in and rerun my job. Chen On Wed, Jul 15, 2015 at 4:40 PM, Tathagata Das wrote: > Your streaming job may have been seemingly running ok, but the DStream > checkpointing must have been failing in the backgr

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Tathagata Das
Your streaming job may have been seemingly running ok, but the DStream checkpointing must have been failing in the background. It would have been visible in the log4j logs. In 1.4.0, we enabled fast-failure for that so that checkpointing failures don't get hidden in the background. The fact that th

Re: NotSerializableException in spark 1.4.0

2015-07-15 Thread Ted Yu
Can you show us your function(s) ? Thanks On Wed, Jul 15, 2015 at 12:46 PM, Chen Song wrote: > The streaming job has been running ok in 1.2 and 1.3. After I upgraded to > 1.4, I started seeing error as below. It appears that it fails in validate > method in StreamingContext. Is there anything c

NotSerializableException in spark 1.4.0

2015-07-15 Thread Chen Song
The streaming job has been running ok in 1.2 and 1.3. After I upgraded to 1.4, I started seeing the error below. It appears that it fails in the validate method in StreamingContext. Is there anything changed in 1.4.0 w.r.t. DStream checkpointing? Detailed error from driver: 15/07/15 18:00:39 ERROR yarn

Re: NotSerializableException: org.apache.http.impl.client.DefaultHttpClient when trying to send documents to Solr

2015-02-18 Thread Dmitry Goldenberg

RE: NotSerializableException: org.apache.http.impl.client.DefaultHttpClient when trying to send documents to Solr

2015-02-18 Thread Jose Fernandez
-Original Message- From: dgoldenberg [mailto:dgoldenberg...@gmail.com] Sent: Wednesday, February 18, 2015 1:54 PM To: user@spark.apache.org Subject: NotSerializableException: org.apache.http.impl.client.DefaultHttpClient

NotSerializableException: org.apache.http.impl.client.DefaultHttpClient when trying to send documents to Solr

2015-02-18 Thread dgoldenberg
I'm using SolrJ in a Spark program. When I try to send the docs to Solr, I get the NotSerializableException on the DefaultHttpClient. Is there a possible fix or workaround? I'm using Spark 1.2.1 with Hadoop 2.4; SolrJ is version 4.0.0. final HttpSolrServer solrServer = new Http
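The usual fix for clients like DefaultHttpClient is to construct them inside `foreachPartition`, so the object is born on the executor and never needs to be serialized. A Spark-free sketch of that shape (the `SolrishClient` stand-in and the nested lists simulating partitions are mine, not SolrJ's API):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerPartitionDemo {
    // Stand-in for a non-serializable network client (DefaultHttpClient, HttpSolrServer, ...).
    static class SolrishClient {
        String send(String doc) { return "indexed:" + doc; }
    }

    // Shape of rdd.foreachPartition { partition => ... }: the client is created
    // inside the per-partition function, so it never crosses the wire.
    static List<String> indexAll(List<List<String>> partitions) {
        List<String> acks = new ArrayList<>();
        for (List<String> partition : partitions) {
            SolrishClient client = new SolrishClient(); // one client per partition, built on the "worker"
            for (String doc : partition) {
                acks.add(client.send(doc));
            }
        }
        return acks;
    }

    public static void main(String[] args) {
        List<List<String>> partitions =
            Arrays.asList(Arrays.asList("doc1", "doc2"), Arrays.asList("doc3"));
        System.out.println(indexAll(partitions)); // [indexed:doc1, indexed:doc2, indexed:doc3]
    }
}
```

One client per partition (rather than per record) keeps connection setup cost amortized over the whole batch.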

Re: NotSerializableException in Spark Streaming

2014-12-15 Thread Nicholas Chammas
This still seems to be broken. In 1.1.1, it errors immediately on this line (from the above repro script): liveTweets.map(t => noop(t)).print() The stack trace is: org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureClean

NotSerializableException caused by Implicit sparkContext or sparkStreamingContext, why?

2014-11-18 Thread Jianshi Huang
[Edge] = { ... } And it takes an implicit argument of type GraphQueryExecutor. However, if the function is defined at top level in the shell, and I declare an implicit val for SparkContext like this: implicit val sparkContext = sc, then calling txnSentTo in map or mapPartitions causes a NotSerializableExce

Re: JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-06 Thread Sean Owen
Erm, you are trying to do all the work in the create() method. This is definitely not what you want to do; it is just supposed to make the JavaStreamingContext. A further problem is that you're using anonymous inner classes, which are non-static and contain a reference to the outer class. The
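Sean's point about anonymous inner classes can be shown without Spark: a non-static inner class carries a hidden reference to its enclosing instance, so serializing it drags the outer object along, while a static nested class does not. (Class names below are illustrative.)

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class InnerClassDemo {
    static class Outer {                               // NOT Serializable
        class Inner implements Serializable {}         // holds a hidden Outer.this reference
        static class Nested implements Serializable {} // no outer reference
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Outer outer = new Outer();
        System.out.println(canSerialize(outer.new Inner()));  // false: drags Outer along
        System.out.println(canSerialize(new Outer.Nested())); // true
    }
}
```

Anonymous classes behave like `Inner` here, which is why replacing them with static nested classes (or top-level classes) makes the exception go away.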

Re: JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-06 Thread Vasu C
Hi Sean, below is my Java code, using Spark 1.1.0. I'm still getting the same error. Here the Bean class is Serializable. Not sure where exactly the problem is. What am I doing wrong here? public class StreamingJson { public static void main(String[] args) throws Exception { final String HDFS_FILE_LOC

Re: JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-06 Thread Sean Owen
No, not the same thing then. This just means you accidentally have a reference to the unserializable enclosing test class in your code. Just make sure the reference is severed. On Thu, Nov 6, 2014 at 8:00 AM, Vasu C wrote: > Thanks for pointing to the issue. > > Yes I think its the same issue, be
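"Severing the reference" usually means copying the needed field into a local variable before building the closure, so the closure captures the copy instead of `this`. A Spark-free sketch (names are mine):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.IntUnaryOperator;

public class SeverDemo {
    // A functional interface whose lambdas the compiler makes serializable.
    interface SerializableIntOp extends IntUnaryOperator, Serializable {}

    static class Job {                  // NOT Serializable, like an enclosing test class
        int factor = 3;

        SerializableIntOp bad() {
            return x -> x * factor;     // reads a field, so it captures `this`
        }

        SerializableIntOp good() {
            int f = factor;             // sever the reference: copy to a local first
            return x -> x * f;          // captures only the int copy
        }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        System.out.println(canSerialize(job.bad()));  // false
        System.out.println(canSerialize(job.good())); // true
    }
}
```

The same trick works in Scala: `val f = this.factor` before the closure, then use `f` inside it.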

Re: JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-06 Thread Vasu C
Thanks for pointing to the issue. Yes, I think it's the same issue; below is the exception: ERROR OneForOneStrategy: TestCheckpointStreamingJson$1 java.io.NotSerializableException: TestCheckpointStreamingJson at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1183) at java.io.ObjectOutp

Re: JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-05 Thread Sean Owen
You didn't say what isn't serializable or where the exception occurs, but, is it the same as this issue? https://issues.apache.org/jira/browse/SPARK-4196 On Thu, Nov 6, 2014 at 5:42 AM, Vasu C wrote: > Dear All, > > I am getting java.io.NotSerializableException for below code. if > jssc.checkpoi

JavaStreamingContextFactory checkpoint directory NotSerializableException

2014-11-05 Thread Vasu C
Dear All, I am getting java.io.NotSerializableException for the code below. If jssc.checkpoint(HDFS_CHECKPOINT_DIR); is commented out, there is no exception. Please help. JavaStreamingContextFactory contextFactory = new JavaStreamingContextFactory() { @Override public JavaStreamingContext create() { SparkC

NotSerializableException: org.apache.spark.sql.hive.api.java.JavaHiveContext

2014-09-04 Thread Bijoy Deb
Hello All, I am trying to query a Hive table using Spark SQL from my Java code, but I am getting the following error: *Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: org.apache.spark.sql.hive.api.java.JavaHiveCo

NotSerializableException while doing rdd.saveToCassandra

2014-08-27 Thread lmk
", "x7", "x8", "x9")) Here the data field is org.apache.spark.rdd.RDD[LogLine] = MappedRDD[7] at map at :45 How can I convert this to Serializable, or is this a different problem? Please advise. Regards, lmk -- View this message in context: http://apache-spark-us

iterator cause NotSerializableException

2014-08-22 Thread Kevin Jung
text: http://apache-spark-user-list.1001560.n3.nabble.com/iterator-cause-NotSerializableException-tp12638.html

Re: Got NotSerializableException when access broadcast variable

2014-08-20 Thread tianyi
Thanks for help On Aug 21, 2014, at 10:56, Yin Huai wrote: > If you want to filter the table name, you can use > > hc.sql("show tables").filter(row => !"test".equals(row.getString(0 > > Seems making functionRegistry transient can fix the error. > > > On Wed, Aug 20, 2014 at 8:53 PM,

RE: Got NotSerializableException when access broadcast variable

2014-08-20 Thread Yin Huai
PR is https://github.com/apache/spark/pull/2074.

Re: Got NotSerializableException when access broadcast variable

2014-08-20 Thread Yin Huai
If you want to filter the table name, you can use hc.sql("show tables").filter(row => !"test".equals(row.getString(0 Seems making functionRegistry transient can fix the error. On Wed, Aug 20, 2014 at 8:53 PM, Vida Ha wrote: > Hi, > > I doubt the the broadcast variable is your problem, sin

Re: Got NotSerializableException when access broadcast variable

2014-08-20 Thread Vida Ha
Hi, I doubt the broadcast variable is your problem, since you are seeing: org.apache.spark.SparkException: Task not serializable Caused by: java.io.NotSerializableException: org.apache.spark.sql.hive.HiveContext$$anon$3 We have a knowledgebase article that explains why this happens - it's a

Re: Got NotSerializableException when access broadcast variable

2014-08-20 Thread tianyi
Thanks for the help. I ran this script again with "bin/spark-shell --conf spark.serializer=org.apache.spark.serializer.KryoSerializer" in the console. I can see: scala> sc.getConf.getAll.foreach(println) (spark.tachyonStore.folderName,spark-eaabe986-03cb-41bd-bde5-993c7db3f048) (spark.driver.host,1

Got NotSerializableException when access broadcast variable

2014-08-19 Thread 田毅
Hi everyone! I got an exception when I run my script with spark-shell. I added SPARK_JAVA_OPTS="-Dsun.io.serialization.extendedDebugInfo=true" in spark-env.sh to show the following stack: org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.

NotSerializableException

2014-07-30 Thread Ron Gonzalez
Hi, I took avro 1.7.7 and recompiled my distribution to be able to fix the issue when dealing with avro GenericRecord. The issue I got was resolved. I'm referring to AVRO-1476. I also enabled kryo registration in SparkConf. That said, I am still seeing a NotSerializableExceptio

NotSerializableException exception while using TypeTag in Scala 2.10

2014-07-28 Thread Aniket Bhatnagar
I am trying to serialize objects contained in RDDs using runtime reflection via TypeTag. However, the Spark job keeps failing with java.io.NotSerializableException on an instance of TypeCreator (auto-generated by the compiler to enable TypeTags). Is there any workaround for this without switching to Scala 2

Re: NotSerializableException in Spark Streaming

2014-07-24 Thread Nicholas Chammas
Yep, here goes! Here are my environment vitals: - Spark 1.0.0 - EC2 cluster with 1 slave spun up using spark-ec2 - twitter4j 3.0.3 - spark-shell called with --jars argument to load spark-streaming-twitter_2.10-1.0.0.jar as well as all the twitter4j jars. Now, while I’m in the S

Re: NotSerializableException in Spark Streaming

2014-07-15 Thread Tathagata Das
I am very curious though. Can you post a concise code example which we can run to reproduce this problem? TD On Tue, Jul 15, 2014 at 2:54 PM, Tathagata Das wrote: > I am not entire sure off the top of my head. But a possible (usually > works) workaround is to define the function as a val inste

Re: NotSerializableException in Spark Streaming

2014-07-15 Thread Tathagata Das
I am not entirely sure off the top of my head. But a possible (usually works) workaround is to define the function as a val instead of a def. For example, def func(i: Int): Boolean = { true } can be written as val func = (i: Int) => { true } Hope this helps for now. TD On Tue, Jul 15, 2014 at 9
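TD's val-instead-of-def trick works because a `def` used where a function value is expected gets wrapped in a closure over `this`, while a `val` function that touches no instance state captures nothing. The Java analogue, a bound method reference versus a stateless lambda, shows the same difference (names below are illustrative):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.IntPredicate;

public class MethodRefDemo {
    interface SerializableIntPred extends IntPredicate, Serializable {}

    static class Job {                       // NOT Serializable
        boolean positive(int i) { return i > 0; }

        SerializableIntPred asMethodRef() {
            return this::positive;           // bound to `this`, like a captured `def`
        }

        SerializableIntPred asLambda() {
            return i -> i > 0;               // touches no instance state: captures nothing
        }
    }

    static boolean canSerialize(Object o) {
        try (ObjectOutputStream oos = new ObjectOutputStream(new ByteArrayOutputStream())) {
            oos.writeObject(o);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Job job = new Job();
        System.out.println(canSerialize(job.asMethodRef())); // false
        System.out.println(canSerialize(job.asLambda()));    // true
    }
}
```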

Re: NotSerializableException in Spark Streaming

2014-07-15 Thread Nicholas Chammas
Hey Diana, Did you ever figure this out? I’m running into the same exception, except in my case the function I’m calling is a KMeans model.predict(). In regular Spark it works, and Spark Streaming without the call to model.predict() also works, but when put together I get this serialization exce

Spark and Cassandra - NotSerializableException

2014-06-25 Thread shaiw75
Hi, I am writing a standalone Spark program that gets its data from Cassandra. I followed the examples and created the RDD via the newAPIHadoopRDD() and the ColumnFamilyInputFormat class. The RDD is created, but I get a NotSerializableException when I call the RDD's .groupByKey() method: p

Re: KMeans.train() throws NotSerializableException

2014-06-03 Thread Matei Zaharia
How is your RDD created? It might mean that something used in the process of creating it was not serializable. Matei On Jun 3, 2014, at 10:11 PM, bluejoe2008 wrote: > when i called KMeans.train(), an error happened: > > 14/06/04 13:02:29 INFO scheduler.DAGScheduler: Submitting Stage 3 > (Ma

KMeans.train() throws NotSerializableException

2014-06-03 Thread bluejoe2008
When I called KMeans.train(), an error happened: 14/06/04 13:02:29 INFO scheduler.DAGScheduler: Submitting Stage 3 (MappedRDD[12] at map at KMeans.scala:123), which has no missing parents 14/06/04 13:02:29 INFO scheduler.DAGScheduler: Failed to run takeSample at KMeans.scala:260 Exception in thr

NotSerializableException in Spark Streaming

2014-05-14 Thread Diana Carroll
Hey all, trying to set up a pretty simple streaming app and getting some weird behavior. First, a non-streaming job that works fine: I'm trying to pull out lines of a log file that match a regex, for which I've set up a function: def getRequestDoc(s: String): String = { "KBDOC-[0-9]*".r.find
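Diana's snippet cuts off mid-function; a hypothetical completion in Java (her original is Scala, and the behavior on a non-matching line is my assumption, not shown in the thread) would look like:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GetRequestDocDemo {
    // Extract the first KBDOC id in a log line, or "" when there is none
    // (the no-match behavior is an assumption; the thread doesn't show it).
    static String getRequestDoc(String s) {
        Matcher m = Pattern.compile("KBDOC-[0-9]*").matcher(s);
        return m.find() ? m.group() : "";
    }

    public static void main(String[] args) {
        System.out.println(getRequestDoc("GET /kb/KBDOC-1234 HTTP/1.1")); // KBDOC-1234
    }
}
```

Written as a static pure function, it references no driver-side state, so a streaming closure calling it has nothing non-serializable to capture; per the rest of this thread, the exception usually comes from what the closure captures, not from the regex itself.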