something like this should work:

    val df = sparkSession.read.csv("myfile.csv") // you may have to provide a schema if the inferred schema is not accurate
    df.write.parquet("myfile.parquet")
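If the inferred schema is off, you can supply one explicitly before reading. A rough sketch (the column names and types here are made up; replace them with your file's actual layout):

    import org.apache.spark.sql.types._

    // Hypothetical schema -- adjust fields to match your CSV.
    val schema = StructType(Seq(
      StructField("id", IntegerType, nullable = false),
      StructField("name", StringType, nullable = true),
      StructField("amount", DoubleType, nullable = true)
    ))

    val df = sparkSession.read
      .option("header", "true") // if the file has a header row
      .schema(schema)
      .csv("myfile.csv")

    df.write.parquet("myfile.parquet")

Passing a schema also skips the extra pass over the data that inference requires.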
Mohit Jaggi
Founder, Data Orchard LLC
www.dataorchardllc.com

> On Apr 27, 2014, at 11:41 PM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
>
> Hi All,
>
> I want to store a csv-text file in Parquet format in HDFS and then do some
> processing in Spark.
>
> Somehow my search to find the way to do was futile. More help was available
> for parquet with impala.
>
> Any guidance here? Thanks !!