Hi,

I am running into an issue with mapPartitions in the following piece of code:

    val sessions = sc
      .newAPIHadoopFile(
        "... path to an avro file ...",
        classOf[org.apache.avro.mapreduce.AvroKeyInputFormat[ByteBuffer]],
        classOf[AvroKey[ByteBuffer]],
        classOf[NullWritable],
        job.getConfiguration())
      .mapPartitions { valueIterator =>
        val config = job.getConfiguration()
                         .
                         .
                         .
      }
      .collect()

Why does calling job.getConfiguration() inside the function passed to
mapPartitions produce the following exception?

Cause: java.io.NotSerializableException: org.apache.hadoop.mapreduce.Job

If I take 'val config = job.getConfiguration()' out of the mapPartitions
function, the code works fine, even though job.getConfiguration() also
appears as an argument to newAPIHadoopFile().
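
For reference, here is a Spark-free sketch of what I suspect is going on. It assumes that the function passed to mapPartitions is Java-serialized before being shipped to the executors, whereas the argument to newAPIHadoopFile() is evaluated on the driver. FakeJob, ClosureCaptureSketch and trySerialize are hypothetical names made up for this illustration; FakeJob only stands in for org.apache.hadoop.mapreduce.Job, which is not Serializable.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

object ClosureCaptureSketch {
  // Hypothetical stand-in for org.apache.hadoop.mapreduce.Job:
  // deliberately NOT Serializable.
  class FakeJob {
    def getConfiguration(): String = "conf"
  }

  // Java-serialize an object, roughly what I assume happens to a task closure.
  // Returns the serialized size, or the exception message on failure.
  def trySerialize(obj: AnyRef): Either[String, Int] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try {
      out.writeObject(obj)
      out.flush()
      Right(bytes.size())
    } catch {
      case e: NotSerializableException => Left(e.getMessage)
    } finally {
      out.close()
    }
  }

  def main(args: Array[String]): Unit = {
    val job = new FakeJob

    // References `job` inside the function body, so the closure captures
    // the whole FakeJob instance.
    val capturesJob: Int => String = _ => job.getConfiguration()

    // Evaluates getConfiguration() up front; the closure captures only
    // the resulting String.
    val conf = job.getConfiguration()
    val capturesValue: Int => String = _ => conf

    println(trySerialize(capturesJob))   // expected to be a Left naming FakeJob
    println(trySerialize(capturesValue)) // expected to be a Right
  }
}
```

If that reading is right, the closure in my code drags the whole Job object along with it. Note that the String above is only a stand-in: Hadoop's Configuration is itself not java.io.Serializable, so simply copying job.getConfiguration() into a local val outside the closure would probably not be enough on its own.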

Ey-Chih Chow


