Hi, I am loading data from a Hive table into HBase after doing some manipulation.
I am getting a "Task not serializable" error. My code is below:

```java
public class HiveToHbaseLoader implements Serializable {

    public static void main(String[] args) throws Exception {
        String hbaseTableName = args[0];
        String hiveQuery = args[1];

        SparkConf conf = new SparkConf().setAppName("Hive to Hbase Loader")
                .setMaster("????");
        JavaSparkContext sc = new JavaSparkContext(conf);
        HiveContext hiveContext = new HiveContext(sc.sc());
        hiveContext.setConf("hive.metastore.uris", "?????");

        DataFrame dataFrame = hiveContext.sql(hiveQuery);

        dataFrame.foreachPartition(
                new AbstractFunction1<scala.collection.Iterator<Row>, BoxedUnit>() {
                    public BoxedUnit apply(scala.collection.Iterator<Row> rows) {
                        // Logic to load rows from Hive into HBase.
                        return BoxedUnit.UNIT;
                    }
                });
    }
}
```

I get the error below:

```
Exception in thread "main" org.apache.spark.SparkException: Task not serializable
	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166)
	at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158)
	at org.apache.spark.SparkContext.clean(SparkContext.scala:1623)
	at org.apache.spark.rdd.RDD.foreachPartition(RDD.scala:805)
	at org.apache.spark.sql.DataFrame.foreachPartition(DataFrame.scala:875)
	at com.philips.bda.HiveToHbaseLoader.main(HiveToHbaseLoader.java:46)
Caused by: java.io.NotSerializableException: com.philips.bda.HiveToHbaseLoader$1
Serialization stack:
	- object not serializable (class: com.philips.bda.HiveToHbaseLoader$1, value: <function1>)
	at org.apache.spark.serializer.SerializationDebugger$.improveException(SerializationDebugger.scala:38)
	at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:47)
	at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:80)
	at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164)
	... 5 more
```

--
Regards,
Nishant
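P.S. What I have noticed so far: the stack trace blames `com.philips.bda.HiveToHbaseLoader$1`, i.e. the anonymous `AbstractFunction1` subclass itself, which does not implement `java.io.Serializable` even though the enclosing `HiveToHbaseLoader` class does. A minimal plain-Java sketch (no Spark; the class names here are made up for illustration) that reproduces the same `NotSerializableException` behavior for an anonymous class versus a named static class:

```java
import java.io.ByteArrayOutputStream;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Illustrates the Java serialization rule behind "Task not serializable":
// the concrete class of the function object handed to foreachPartition must
// itself implement Serializable; it is not enough that the enclosing class does.
public class SerializationSketch {

    // A named static class that implements Serializable serializes fine.
    static class SerializableFn implements Serializable {
        public void apply(String row) {
            // placeholder for per-row HBase load logic
        }
    }

    // Returns true if Java serialization accepts the object, false if it
    // fails with NotSerializableException (the cause in the Spark trace).
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out =
                 new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        // Anonymous subclass: not Serializable, like the AbstractFunction1
        // subclass in the Spark job.
        Object anonymous = new Object() { };
        System.out.println(canSerialize(anonymous));            // false
        System.out.println(canSerialize(new SerializableFn())); // true
    }
}
```

So my current guess is to replace the anonymous class with a named static class that both extends `AbstractFunction1` and implements `Serializable`, though I have not yet verified this on my Spark version.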