Caching in Spark

2016-01-22 Thread Sourabh Chandak
Hi, I have a Spark app that internally splits into two jobs because we write to two different Cassandra tables. The input data comes from the same Cassandra table, so after reading the data from Cassandra and applying a few transformations, I cache one of the RDDs and fork the program to compute both metrics. I
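
The pattern described above (cache one RDD, then run two jobs from it) might look roughly like the sketch below. It is not the poster's code: the Cassandra read and writes are replaced by an in-memory range and `collect`-style actions, it assumes Spark 2.x or later running in local mode, and all names (`CacheAndForkSketch`, `transformed`, and so on) are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object CacheAndForkSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only; a real app would configure a cluster master.
    val spark = SparkSession.builder()
      .appName("cache-and-fork-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Stand-in for the Cassandra read: a small in-memory RDD (assumption).
    val input = sc.parallelize(1 to 1000)

    // Shared transformations, marked for caching so that both jobs reuse the result.
    // cache() is lazy: the data is materialized by the first action below.
    val transformed = input.map(x => (x % 10, x.toLong)).cache()

    // Job 1: first metric (sum per key). This action populates the cache.
    val sumPerKey = transformed.reduceByKey(_ + _).collect()

    // Job 2: second metric (count per key). This action reads from the cache
    // instead of recomputing the map step.
    val countPerKey = transformed.countByKey()

    println(s"keys with sums: ${sumPerKey.length}, keys with counts: ${countPerKey.size}")

    // Release the cached blocks once both jobs are done.
    transformed.unpersist()
    spark.stop()
  }
}
```

Because `cache()` only marks the RDD, the first action pays the full cost of computing it; the saving shows up on the second job.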

Re: Caching in spark

2015-07-12 Thread Akhil Das
There was an earlier discussion on this; let me re-post it for you. For the following code: val df = sqlContext.parquetFile(path), df remains columnar (actually, it just reads from the columnar Parquet file on disk). For the following code: val cdf = df.cache(), cdf is

Re: Caching in spark

2015-07-12 Thread Ruslan Dautkhanov
Hi Akhil, I wonder whether RDDs are also stored internally in a columnar format, or whether an RDD is converted to columnar format only when it is cached in a SQL context. What about DataFrames? Thanks! -- Ruslan Dautkhanov

Re: Caching in spark

2015-07-10 Thread Akhil Das
https://spark.apache.org/docs/latest/sql-programming-guide.html#caching-data-in-memory Thanks, Best Regards
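
As a rough illustration of what the linked guide section covers, caching can be driven from SQL statements themselves. The sketch below assumes Spark 2.x or later (`SparkSession`); on the 1.x API current at the time of this thread, the guide describes `sqlContext.cacheTable("tableName")` for the same purpose. The view name `events` and the sample data are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object SqlCacheSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("sql-cache-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical data registered as a temporary view so SQL can refer to it.
    spark.range(0, 1000).toDF("id").createOrReplaceTempView("events")

    // Cache the view through a SQL statement; Spark SQL stores it in its
    // in-memory columnar format.
    spark.sql("CACHE TABLE events")

    // The catalog reports whether the view is currently cached.
    println(spark.catalog.isCached("events"))

    // Later queries against the view can be served from the cache.
    spark.sql("SELECT COUNT(*) AS n FROM events WHERE id % 2 = 0").show()

    // Drop the cached data when it is no longer needed.
    spark.sql("UNCACHE TABLE events")
    spark.stop()
  }
}
```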

Caching in spark

2015-07-09 Thread vinod kumar
Hi guys, can anyone please share how to use the caching feature of Spark via Spark SQL queries? -Vinod

Re: How to use caching in Spark Actions or Output operations?

2015-07-05 Thread Himanshu Mehra
k-user-list.1001560.n3.nabble.com/How-to-use-caching-in-Spark-Actions-or-Output-operations-tp23549p23641.html