Spark uses the Hadoop InputFormat and OutputFormat classes, so you can read and write Parquet by configuring a Hadoop Job and passing its Configuration to SparkContext.newAPIHadoopFile (Parquet's formats implement the newer mapreduce API, so newAPIHadoopFile is the right call here rather than hadoopFile with a JobConf). There are some examples of Parquet usage here: http://zenfractal.com/2013/08/21/a-powerful-big-data-trio/ and here: http://engineering.ooyala.com/blog/using-parquet-and-scrooge-spark.
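In case a concrete sketch helps, here is roughly what those posts do, adapted to the CSV-to-Parquet question using parquet-avro's read/write support classes. The schema, field names, and HDFS paths below are placeholders, and the package names (parquet.* here) have varied across parquet-mr releases, so treat this as a starting point rather than tested code:

import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericRecord}
import org.apache.hadoop.mapreduce.Job
import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._   // implicit pair-RDD functions
import parquet.avro.{AvroParquetOutputFormat, AvroReadSupport, AvroWriteSupport}
import parquet.hadoop.{ParquetInputFormat, ParquetOutputFormat}

object CsvToParquet {
  // Hypothetical two-column schema; adjust to match your CSV.
  val schemaJson = """{"type": "record", "name": "Person",
    "fields": [{"name": "name", "type": "string"},
               {"name": "age",  "type": "int"}]}"""

  def main(args: Array[String]) {
    val sc = new SparkContext("local", "CsvToParquet")

    // Parse the schema on the driver for job setup...
    val schema = new Schema.Parser().parse(schemaJson)

    // ...and re-parse it inside each partition, since Avro's Schema
    // class is not serializable.
    val records = sc.textFile("hdfs:///data/people.csv").mapPartitions { lines =>
      val s = new Schema.Parser().parse(schemaJson)
      lines.map { line =>
        val fields = line.split(",")
        val rec = new GenericData.Record(s)
        rec.put("name", fields(0))
        rec.put("age", fields(1).trim.toInt)
        (null, rec): (Void, GenericRecord)
      }
    }

    // Write as Parquet: ParquetOutputFormat is a new-API OutputFormat,
    // so configure a Job and use saveAsNewAPIHadoopFile.
    val writeJob = new Job()
    ParquetOutputFormat.setWriteSupportClass(writeJob, classOf[AvroWriteSupport])
    AvroParquetOutputFormat.setSchema(writeJob, schema)
    records.saveAsNewAPIHadoopFile("hdfs:///data/people.parquet",
      classOf[Void], classOf[GenericRecord],
      classOf[ParquetOutputFormat[GenericRecord]], writeJob.getConfiguration)

    // Read it back for further processing.
    val readJob = new Job()
    ParquetInputFormat.setReadSupportClass(readJob,
      classOf[AvroReadSupport[GenericRecord]])
    val loaded = sc.newAPIHadoopFile("hdfs:///data/people.parquet",
      classOf[ParquetInputFormat[GenericRecord]],
      classOf[Void], classOf[GenericRecord], readJob.getConfiguration)
    loaded.map(_._2.get("name")).take(5).foreach(println)
  }
}

Once the data is stored as Parquet, the read path at the end gives you an ordinary RDD of Avro GenericRecords to process like any other Spark RDD.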
Matei

On Apr 27, 2014, at 11:41 PM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:

> Hi All,
>
> I want to store a csv-text file in Parquet format in HDFS and then do some
> processing in Spark.
>
> Somehow my search to find a way to do this was futile. More help was
> available for Parquet with Impala.
>
> Any guidance here? Thanks!!