I believe the error is related to an
org.apache.spark.sql.expressions.Aggregator where the buffer type (BUF) is
Array[Int].
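
For example, a minimal sketch of the shape I mean (hypothetical names, just to
illustrate the pattern; not anyone's actual code):

import org.apache.spark.sql.Encoder
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.expressions.Aggregator

// hypothetical aggregator whose buffer type (BUF) is Array[Int]
object ArrayIntAgg extends Aggregator[Int, Array[Int], Seq[Int]] {
  def zero: Array[Int] = Array.empty[Int]
  def reduce(buf: Array[Int], a: Int): Array[Int] = buf :+ a
  def merge(b1: Array[Int], b2: Array[Int]): Array[Int] = b1 ++ b2
  def finish(buf: Array[Int]): Seq[Int] = buf.toSeq
  // the encoder built here for the Array[Int] buffer is what seems to drag
  // the non-serializable scala.reflect types into the task closure
  def bufferEncoder: Encoder[Array[Int]] = ExpressionEncoder[Array[Int]]()
  def outputEncoder: Encoder[Seq[Int]] = ExpressionEncoder[Seq[Int]]()
}

Running something like ds.select(ArrayIntAgg.toColumn) on a Dataset[Int] is the
kind of usage that appears to hit the error in the stack trace below.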

On Wed, Apr 12, 2017 at 4:19 PM, Koert Kuipers <ko...@tresata.com> wrote:

> hey all,
> today I tried upgrading the Spark version we use internally by creating a
> new internal release from the Spark master branch. The last time I did this
> was March 7.
>
> With this updated Spark I am seeing some serialization errors in the unit
> tests for our own libraries. It looks like a Scala reflection type that is
> not serializable is getting sucked into serialization for the encoder?
> See below.
> best,
> koert
>
> [info]   org.apache.spark.SparkException: Task not serializable
> [info]   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
> [info]   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
> [info]   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
> [info]   at org.apache.spark.SparkContext.clean(SparkContext.scala:2284)
> [info]   at org.apache.spark.SparkContext.runJob(SparkContext.scala:2058)
> ...
> [info] Serialization stack:
> [info]     - object not serializable (class: scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq, value: BTS(Int,AnyVal,Any))
> [info]     - field (class: scala.reflect.internal.Types$TypeRef, name: baseTypeSeqCache, type: class scala.reflect.internal.BaseTypeSeqs$BaseTypeSeq)
> [info]     - object (class scala.reflect.internal.Types$ClassNoArgsTypeRef, Int)
> [info]     - field (class: org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, name: elementType$2, type: class scala.reflect.api.Types$TypeApi)
> [info]     - object (class org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$6, <function1>)
> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, name: function, type: interface scala.Function1)
> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.UnresolvedMapObjects, unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)))
> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.WrapOption, name: child, type: class org.apache.spark.sql.catalyst.expressions.Expression)
> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.WrapOption, wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq)))
> [info]     - writeObject data (class: scala.collection.immutable.List$SerializationProxy)
> [info]     - object (class scala.collection.immutable.List$SerializationProxy, scala.collection.immutable.List$SerializationProxy@69040c85)
> [info]     - writeReplace data (class: scala.collection.immutable.List$SerializationProxy)
> [info]     - object (class scala.collection.immutable.$colon$colon, List(wrapoption(unresolvedmapobjects(<function1>, getcolumnbyordinal(0, ArrayType(IntegerType,false)), Some(interface scala.collection.Seq)), ObjectType(interface scala.collection.Seq))))
> [info]     - field (class: org.apache.spark.sql.catalyst.expressions.objects.NewInstance, name: arguments, type: interface scala.collection.Seq)
> [info]     - object (class org.apache.spark.sql.catalyst.expressions.objects.NewInstance, newInstance(class scala.Tuple1))
> [info]     - field (class: org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, name: deserializer, type: class org.apache.spark.sql.catalyst.expressions.Expression)
> [info]     - object (class org.apache.spark.sql.catalyst.encoders.ExpressionEncoder, class[_1[0]: array<int>])
> ...
