You cannot pass your jobConf object into any of the transformation functions
in Spark (like map, mapPartitions, etc.), since
org.apache.hadoop.mapreduce.Job is not Serializable. You can use
KryoSerializer (see this doc:
http://spark.apache.org/docs/latest/tuning.html#data-serialization) when you
need such objects within a closure.
You can refer to org.apache.spark.rdd.HadoopRDD; it handles a similar usage
scenario to yours.
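For illustration, a minimal sketch of that pattern, assuming Spark 1.x where
HadoopRDD broadcasts its configuration wrapped in
org.apache.spark.SerializableWritable; the Avro path and the configuration
key below are placeholders, not taken from the original code:

import java.nio.ByteBuffer

import org.apache.avro.mapred.AvroKey
import org.apache.avro.mapreduce.AvroKeyInputFormat
import org.apache.hadoop.io.NullWritable
import org.apache.spark.{SerializableWritable, SparkConf, SparkContext}

object ConfigInClosure {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("config-in-closure"))

    // Broadcast the Hadoop Configuration wrapped in SerializableWritable,
    // the same trick HadoopRDD uses, instead of capturing a Job/JobConf.
    val confBroadcast = sc.broadcast(new SerializableWritable(sc.hadoopConfiguration))

    val sessions = sc.newAPIHadoopFile(
      "... path to an avro file ...",
      classOf[AvroKeyInputFormat[ByteBuffer]],
      classOf[AvroKey[ByteBuffer]],
      classOf[NullWritable],
      sc.hadoopConfiguration)

    val sizes = sessions.mapPartitions { iter =>
      // Unwrap the Configuration inside the task; only the broadcast
      // reference is shipped with the closure, and it is serializable.
      val hadoopConf = confBroadcast.value.value
      val outputDir = hadoopConf.get("mapreduce.output.fileoutputformat.outputdir")
      iter.map { case (key, _) => (outputDir, key.datum().remaining()) }
    }

    println(sizes.count())
    sc.stop()
  }
}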
Thanks
Jerry.
From: Tobias Pfeiffer [mailto:t...@preferred.jp]
Sent: Friday, December 26, 2014 9:38 AM
To: ey-chih chow
Cc: user
Subject: Re: serialization issue with mapPartitions
Hi,
On Fri, Dec 26, 2014 at 10:13 AM, ey-chih chow wrote:
> I should rephrase my question as follows:
>
> How do I use the Hadoop Configuration that corresponds to a HadoopRDD when
> defining a function passed as an input parameter to mapPartitions?
>
Well, you could try to pull the `val confi
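A minimal sketch of that idea, assuming you only need a few values from the
configuration: read them into plain locals on the driver so the closure
captures only serializable values. The function name, the RDD element type,
and the config key below are illustrative:

import org.apache.hadoop.conf.Configuration
import org.apache.spark.rdd.RDD

// `records` stands for the RDD built from the Avro file and `conf` for the
// Hadoop Configuration it was created with; both are placeholders.
def tagWithOutputDir(records: RDD[String], conf: Configuration): RDD[(String, String)] = {
  // Read the plain values you need from the Configuration on the driver...
  val outputDir = conf.get("mapreduce.output.fileoutputformat.outputdir")

  // ...so the mapPartitions closure captures only a serializable String,
  // not the Configuration or the non-serializable Job.
  records.mapPartitions { iter =>
    iter.map(record => (outputDir, record))
  }
}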
I should rephrase my question as follows:
How do I use the Hadoop Configuration that corresponds to a HadoopRDD when
defining a function passed as an input parameter to mapPartitions?
Thanks.
Ey-Chih Chow
Hi,
On Fri, Dec 26, 2014 at 1:32 AM, ey-chih chow wrote:
>
> I ran into some issues with mapPartitions with the following piece of code:
>
> val sessions = sc
>   .newAPIHadoopFile(
>     "... path to an avro file ...",
>     classOf[org.apache.avro.mapreduce.AvroKeyInputFormat[ByteBuffer]],
>     classOf[org.apache.avro.mapred.AvroKey[ByteBuffer]],
>     classOf[org.apache.hadoop.io.NullWritable],
>     job.getConfiguration) // trailing arguments reconstructed from the
>                           // newAPIHadoopFile signature; `job` is assumed
>                           // from the jobConf discussed above