Re: reuse hadoop code in Spark

2014-06-05 Thread Matei Zaharia

Re: reuse hadoop code in Spark

2014-06-05 Thread Wei Tan

Re: reuse hadoop code in Spark

2014-06-04 Thread Matei Zaharia
Yes, you can write some glue in Spark to call these. Some functions to look at:

- SparkContext.hadoopRDD lets you create an input RDD from an existing JobConf configured by Hadoop (including InputFormat, paths, etc.)
- RDD.mapPartitions lets you operate on all the values in one partition (block) at a time
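
For concreteness, here is a minimal, untested Scala sketch of that glue using the two calls above; the input path and app name are made up for illustration:

import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapred.{FileInputFormat, JobConf, TextInputFormat}
import org.apache.spark.{SparkConf, SparkContext}

object ReuseHadoopGlue {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("reuse-hadoop-glue"))

    // Configure the JobConf exactly as the existing Hadoop (old mapred API)
    // job does; the path here is hypothetical.
    val jobConf = new JobConf()
    FileInputFormat.setInputPaths(jobConf, "hdfs:///data/input")

    // SparkContext.hadoopRDD reads through the InputFormat configured above.
    val records = sc.hadoopRDD(
      jobConf,
      classOf[TextInputFormat],
      classOf[LongWritable],
      classOf[Text])

    // RDD.mapPartitions hands you all records of one partition (one input
    // split/block) at a time, a natural place to call per-block Hadoop logic.
    val lengths = records.mapPartitions { iter =>
      // The record reader reuses Text objects, so materialize each value
      // (toString) before the iterator advances.
      iter.map { case (_, line) => line.toString.length }
    }

    println("total chars: " + lengths.reduce(_ + _))
    sc.stop()
  }
}

The same pattern extends to reusing an existing Mapper or Reducer class: construct it once inside mapPartitions so you pay its setup cost per partition rather than per record.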