SchemaRDDs, provided by Spark SQL, have a saveAsParquetFile command. You can turn a normal RDD into a SchemaRDD using the techniques described here: http://spark.apache.org/docs/latest/sql-programming-guide.html
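For example, here is a minimal sketch of that flow using the Spark 1.0-era API (the case class, paths, and sample data are just placeholders for illustration):

```scala
import org.apache.spark.sql.SQLContext

// Hypothetical record type describing your rows
case class Record(id: Int, name: String)

val sqlContext = new SQLContext(sc) // sc is your existing SparkContext

// Brings in the implicit conversion from RDD[Record] to SchemaRDD
import sqlContext.createSchemaRDD

val rdd = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))

// Writes the data out as a Parquet file that Impala can then read
rdd.saveAsParquetFile("/user/hive/warehouse/records.parquet")
```

On the Impala side you would point an external table at that Parquet location (e.g. via CREATE EXTERNAL TABLE ... STORED AS PARQUET) and refresh the metadata so Impala picks up the new files.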
This should work with Impala, but if you run into any issues please let me know.

On Sun, Jul 6, 2014 at 5:30 PM, Shaikh Riyaz <shaikh....@gmail.com> wrote:
> Hi,
>
> We are planning to use Spark to load data into Parquet, and this data will be
> queried by Impala for visualization through Tableau.
>
> Can we achieve this flow? How do we load data into Parquet from Spark? Will
> Impala be able to access the data loaded by Spark?
>
> I would greatly appreciate it if someone could help with an example to achieve
> this goal.
>
> Thanks in advance.
>
> --
> Regards,
>
> Riyaz