Michael - it is already transient. This should probably be considered a bug in the Scala compiler, but we can easily work around it by removing the use of the destructuring binding.
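A minimal sketch of the kind of rewrite meant here (class and member names are simplified stand-ins, not Spark's actual code): bind the tuple to a single, explicitly @transient lazy val and project the two members out of it, so the annotation lands on the real field instead of being lost on the compiler-synthesized one:

```scala
// Illustrative sketch only -- AnyRef stands in for HiveConf/SessionState.
// With `@transient lazy val (hiveconf, sessionState) = ...`, scalac stores
// the whole tuple in a synthetic field (e.g. x$3) that the @transient
// annotation does not reach. Naming the tuple avoids that:
class Ctx extends Serializable {
  @transient private lazy val confAndState: (AnyRef, AnyRef) = {
    val conf  = new Object // stand-in for new HiveConf(classOf[SessionState])
    val state = new Object // stand-in for the SessionState built from it
    (conf, state)
  }

  // Projections keep the original public API; each is @transient itself.
  @transient lazy val hiveconf: AnyRef     = confAndState._1
  @transient lazy val sessionState: AnyRef = confAndState._2
}
```

Splitting the binding into two independent lazy vals (computing one from the other) works equally well; the essential point is that no untagged synthetic Tuple2 field remains on the class.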
On Mon, Feb 16, 2015 at 10:41 AM, Michael Armbrust <mich...@databricks.com> wrote:
> I'd suggest marking the HiveContext as @transient since it's not valid to
> use it on the slaves anyway.
>
> On Mon, Feb 16, 2015 at 4:27 AM, Haopu Wang <hw...@qilinsoft.com> wrote:
>
> > While investigating this issue (at the end of this email), I took a look
> > at HiveContext's code and found this change
> > (https://github.com/apache/spark/commit/64945f868443fbc59cb34b34c16d782dda0fb63d#diff-ff50aea397a607b79df9bec6f2a841db):
> >
> >   - @transient protected[hive] lazy val hiveconf = new HiveConf(classOf[SessionState])
> >   - @transient protected[hive] lazy val sessionState = {
> >   -   val ss = new SessionState(hiveconf)
> >   -   setConf(hiveconf.getAllProperties)  // Have SQLConf pick up the initial set of HiveConf.
> >   -   ss
> >   - }
> >   + @transient protected[hive] lazy val (hiveconf, sessionState) =
> >   +   Option(SessionState.get())
> >   +     .orElse {
> >
> > With the new change, the Scala compiler always generates a Tuple2 field
> > on HiveContext, as shown below:
> >
> >   private Tuple2 x$3;
> >   private transient OutputStream outputBuffer;
> >   private transient HiveConf hiveconf;
> >   private transient SessionState sessionState;
> >   private transient HiveMetastoreCatalog catalog;
> >
> > That "x$3" field's first element (_1) is a HiveConf object, which cannot
> > be serialized. Can you suggest how to resolve this issue? Thank you very
> > much!
> >
> > ================================
> >
> > I have a streaming application which registers a temp table on a
> > HiveContext for each batch duration.
> > The application runs well in Spark 1.1.0, but I get the error below from
> > 1.1.1.
> > Do you have any suggestions to resolve it? Thank you!
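The synthetic field can be observed directly with a small self-contained sketch (class names here are illustrative, not Spark's): reflecting over a class that uses a destructured lazy val shows the compiler-generated tuple field alongside the named members, and whether the transient modifier actually reached it.

```scala
import java.lang.reflect.Modifier

// A class using the same destructuring pattern as the HiveContext change.
class Destructured extends Serializable {
  @transient lazy val (first, second) = (new Object, "ok")
}

object Inspect extends App {
  // List every field scalac generated, with its transient flag, so the
  // synthetic tuple field (x$1-style name, type Tuple2) is visible.
  for (f <- classOf[Destructured].getDeclaredFields)
    println(s"${f.getName}: ${f.getType.getSimpleName}, " +
            s"transient = ${Modifier.isTransient(f.getModifiers)}")
}
```

Whether the synthetic Tuple2 field carries the transient modifier depends on the Scala compiler version, which is exactly why the thread treats this as a compiler bug rather than an application bug.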
> >
> > java.io.NotSerializableException: org.apache.hadoop.hive.conf.HiveConf
> >     - field (class "scala.Tuple2", name: "_1", type: "class java.lang.Object")
> >     - object (class "scala.Tuple2", (Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, org.apache.hadoop.hive.conf.LoopingByteArrayInputStream@2158ce23, org.apache.hadoop.hive.ql.session.SessionState@49b6eef9))
> >     - field (class "org.apache.spark.sql.hive.HiveContext", name: "x$3", type: "class scala.Tuple2")
> >     - object (class "org.apache.spark.sql.hive.HiveContext", org.apache.spark.sql.hive.HiveContext@4e6e66a4)
> >     - field (class "example.BaseQueryableDStream$$anonfun$registerTempTable$2", name: "sqlContext$1", type: "class org.apache.spark.sql.SQLContext")
> >     - object (class "example.BaseQueryableDStream$$anonfun$registerTempTable$2", <function1>)
> >     - field (class "org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1", name: "foreachFunc$1", type: "interface scala.Function1")
> >     - object (class "org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1", <function2>)
> >     - field (class "org.apache.spark.streaming.dstream.ForEachDStream", name: "org$apache$spark$streaming$dstream$ForEachDStream$$foreachFunc", type: "interface scala.Function2")
> >     - object (class "org.apache.spark.streaming.dstream.ForEachDStream", org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20)
> >     - element of array (index: 0)
> >     - array (class "[Ljava.lang.Object;", size: 16)
> >     - field (class "scala.collection.mutable.ArrayBuffer", name: "array", type: "class [Ljava.lang.Object;")
> >     - object (class "scala.collection.mutable.ArrayBuffer", ArrayBuffer(org.apache.spark.streaming.dstream.ForEachDStream@5ccbdc20))
> >     - field (class "org.apache.spark.streaming.DStreamGraph", name: "outputStreams", type: "class scala.collection.mutable.ArrayBuffer")
> >     - custom writeObject data (class "org.apache.spark.streaming.DStreamGraph")
> >     - object (class "org.apache.spark.streaming.DStreamGraph", org.apache.spark.streaming.DStreamGraph@776ae7da)
> >     - field (class "org.apache.spark.streaming.Checkpoint", name: "graph", type: "class org.apache.spark.streaming.DStreamGraph")
> >     - root object (class "org.apache.spark.streaming.Checkpoint", org.apache.spark.streaming.Checkpoint@5eade065)
> >     at java.io.ObjectOutputStream.writeObject0(Unknown Source)