Hi,
I have a Spark app that internally splits into two jobs because we write to
two different Cassandra tables. The input data comes from a single Cassandra
table, so after reading the data from Cassandra and applying a few
transformations, I cache one of the RDDs and fork the program to compute both
metrics.
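For reference, here is a minimal sketch of that shape of job. The keyspace,
table, and column names are made up, and it assumes the DataStax
spark-cassandra-connector is on the classpath:

// Minimal sketch; keyspace/table/column names here are hypothetical.
import com.datastax.spark.connector._
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setAppName("two-metrics")
  .set("spark.cassandra.connection.host", "127.0.0.1")
val sc = new SparkContext(conf)

// Read once, transform once, and cache so both branches below reuse the
// same in-memory data instead of re-reading Cassandra.
val events = sc.cassandraTable("ks", "events")
  .map(row => (row.getString("user_id"), row.getLong("value")))
  .cache()

// Branch 1: first metric, written to the first table (job 1).
events.reduceByKey(_ + _)
  .saveToCassandra("ks", "metric_sum", SomeColumns("user_id", "total"))

// Branch 2: second metric, written to the second table (job 2).
events.mapValues(_ => 1L).reduceByKey(_ + _)
  .saveToCassandra("ks", "metric_count", SomeColumns("user_id", "cnt"))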
There was a discussion about this earlier; let me re-post it for you.
For the following code:

val df = sqlContext.parquetFile(path)

df remains columnar (it actually just reads from the columnar Parquet file on
disk).

For the following code:

val cdf = df.cache()

cdf is also columnar: once materialized, the data is stored in Spark SQL's
in-memory columnar format.
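A quick way to see this end to end (the path below is just a placeholder, and
this uses the same Spark 1.x API as the snippet above):

val df = sqlContext.parquetFile("/data/events.parquet") // columnar on disk

val cdf = df.cache() // caching is lazy; nothing is materialized yet
cdf.count()          // the first action builds the in-memory columnar cache

// The in-memory columnar store is compressed by default; both knobs are
// described on the caching page of the SQL programming guide:
sqlContext.setConf("spark.sql.inMemoryColumnarStorage.compressed", "true")
sqlContext.setConf("spark.sql.inMemoryColumnarStorage.batchSize", "10000")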
Hi Akhil,
It would be interesting to know whether RDDs are stored internally in a
columnar format as well, or whether an RDD is only converted to columnar
format when it is cached in a SQL context. And what about DataFrames?
Thanks!
--
Ruslan Dautkhanov
On Fri, Jul 10, 2015 at 2:07 AM, Akhil Das wrote:
https://spark.apache.org/docs/latest/sql-programming-guide.html#caching-data-in-memory
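In short (a minimal sketch; "people" and the path are just example names):

val df = sqlContext.parquetFile("/data/people.parquet")
df.registerTempTable("people")

sqlContext.cacheTable("people")              // programmatic API
sqlContext.sql("CACHE TABLE people")         // the same thing as a SQL query
sqlContext.sql("SELECT count(*) FROM people").show() // now served from cache

sqlContext.uncacheTable("people")            // release the memory when done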
Thanks
Best Regards
On Fri, Jul 10, 2015 at 10:05 AM, vinod kumar wrote:
> Hi Guys,
>
> Can anyone please show me how to use the caching feature of Spark via
> Spark SQL queries?
>
> -Vinod
>