Thanks, Marcelo. It works!
2014-07-31 5:37 GMT+08:00 Marcelo Vanzin <van...@cloudera.com>:

> Hi Fengyun,
>
> Have you tried to use saveAsHadoopFile() (or saveAsNewAPIHadoopFile())?
> You should be able to do something with that API by using
> AvroKeyValueOutputFormat.
>
> The API is defined here:
>
> http://spark.apache.org/docs/1.0.0/api/scala/#org.apache.spark.rdd.PairRDDFunctions
>
> Lots of RDD types include that functionality already.
>
>
> On Wed, Jul 30, 2014 at 2:14 AM, Fengyun RAO <raofeng...@gmail.com> wrote:
> > We used mapreduce for ETL and stored the results in Avro files, which
> > are loaded into hive/impala for querying.
> >
> > Now we are trying to migrate to spark, but didn't find a way to write
> > the resulting RDD to Avro files.
> >
> > I wonder if there is a way to do it, and if not, why spark doesn't
> > support Avro as well as mapreduce does? Are there any plans?
> >
> > Or what's the recommended way to output spark results with a schema?
> > I don't think plain text is a good choice.
>
> --
> Marcelo
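For anyone who finds this thread later, here is a rough sketch of the approach Marcelo describes: wrap the pair RDD's elements so they match AvroKeyValueOutputFormat, put the Avro schemas into a Hadoop Job configuration via AvroJob, and call saveAsNewAPIHadoopFile. The data, schemas, and output path below are made up for illustration, and the exact key/value classes accepted may vary by Avro version, so treat this as a starting point rather than a tested recipe:

```scala
import org.apache.avro.Schema
import org.apache.avro.mapred.{AvroKey, AvroValue}
import org.apache.avro.mapreduce.{AvroJob, AvroKeyValueOutputFormat}
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.{SparkConf, SparkContext}

object AvroOutputSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("avro-output-sketch"))

    // Hypothetical data: (host, count) pairs, wrapped in AvroKey/AvroValue
    // so they line up with AvroKeyValueOutputFormat's expectations.
    val pairs = sc.parallelize(Seq(("example.com", 42L)))
      .map { case (k, v) => (new AvroKey(k), new AvroValue(v)) }

    // A Job is used only as a carrier for the Configuration that holds
    // the Avro output schemas (here, primitive string/long schemas).
    val job = Job.getInstance()
    AvroJob.setOutputKeySchema(job, Schema.create(Schema.Type.STRING))
    AvroJob.setOutputValueSchema(job, Schema.create(Schema.Type.LONG))

    pairs.saveAsNewAPIHadoopFile(
      "/tmp/avro-out",                                   // hypothetical path
      classOf[AvroKey[String]],
      classOf[AvroValue[Long]],
      classOf[AvroKeyValueOutputFormat[String, Long]],
      job.getConfiguration)

    sc.stop()
  }
}
```

For records with multiple fields, the same pattern applies with a record schema parsed via new Schema.Parser().parse(...) and GenericData.Record values instead of primitives.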