Re: pandas-like dataframe in spark

2014-09-04 Thread Mohit Jaggi
Thanks Matei. I will take a look at SchemaRDDs.

On Thu, Sep 4, 2014 at 11:24 AM, Matei Zaharia wrote:
> Hi Mohit,
>
> This looks pretty interesting, but just a note on the implementation -- it
> might be worthwhile to try doing this on top of Spark SQL SchemaRDDs. The
> reason is that SchemaRDD

Re: pandas-like dataframe in spark

2014-09-04 Thread Matei Zaharia
Hi Mohit,

This looks pretty interesting, but just a note on the implementation -- it might be worthwhile to try doing this on top of Spark SQL SchemaRDDs. The reason is that SchemaRDDs already have an efficient in-memory representation (columnar storage), and can be read from a variety of data