You can probably use LazyOutputFormat directly. If there's one for the
hadoop.mapred API, you can use it with PairRDDFunctions.saveAsHadoopFile()
today; otherwise there's going to be a version of that for the hadoop.mapreduce
API as well in Spark 1.0.
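With the hadoop.mapreduce API, the usual pattern is to register the real output format via LazyOutputFormat.setOutputFormatClass and then pass LazyOutputFormat itself to saveAsNewAPIHadoopFile. A rough sketch (the output path, key/value types, and the RDD contents here are placeholders, and this assumes Hadoop 2's Job.getInstance; on older Hadoop you'd construct the Job differently):

```scala
import org.apache.hadoop.io.Text
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.output.{LazyOutputFormat, TextOutputFormat}
import org.apache.spark.SparkContext

// Assumes an existing SparkContext `sc` provided by your application.
val sc: SparkContext = ???

// A pair RDD of Writable-compatible key/value pairs (illustrative data).
val rdd = sc.parallelize(Seq(("k1", "v1")))
  .map { case (k, v) => (new Text(k), new Text(v)) }

// Wrap the real output format so partitions with no records
// don't produce empty part-* files.
val job = Job.getInstance(sc.hadoopConfiguration)
LazyOutputFormat.setOutputFormatClass(job, classOf[TextOutputFormat[Text, Text]])

rdd.saveAsNewAPIHadoopFile(
  "/tmp/output",                          // hypothetical output path
  classOf[Text], classOf[Text],
  classOf[LazyOutputFormat[Text, Text]],
  job.getConfiguration)
```

LazyOutputFormat only creates each underlying output file on the first record written to it, which is what gives you the "no empty files" behavior.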
Matei
On Feb 28, 2014, at 5:18 PM, Mohit Si
Hi,
Is there an equivalent of LazyOutputFormat in Spark
(PySpark)?
http://hadoop.apache.org/docs/stable/api/org/apache/hadoop/mapreduce/lib/output/LazyOutputFormat.html
Basically, something where I only save files that have some data in them
rather than saving all the files as