Re: sqlContext.cacheTable("tableName") vs dataFrame.cache()

2016-01-19 Thread Jerry Lam
Is cacheTable similar to asTempTable before?

Re: sqlContext.cacheTable("tableName") vs dataFrame.cache()

2016-01-19 Thread George Sigletos
Thanks Kevin for your reply. I was suspecting the same thing as well, although it still does not make much sense to me why you would need to do both:

myData.cache()
sqlContext.cacheTable("myData")

in case you are using both the sqlContext and DataFrames to execute queries, i.e. dataFrame.select(...) and sqlContext.sql(...).

Re: sqlContext.cacheTable("tableName") vs dataFrame.cache()

2016-01-15 Thread Kevin Mellott
Hi George, I believe that sqlContext.cacheTable("tableName") is to be used when you want to cache the data that is being used within a Spark SQL query. For example, take a look at the code below:

val myData = sqlContext.load("com.databricks.spark.csv", Map("path" -> "hdfs://somepath/file", "

sqlContext.cacheTable("tableName") vs dataFrame.cache()

2016-01-15 Thread George Sigletos
According to the documentation they are exactly the same, but in my queries dataFrame.cache() results in much faster execution times than doing sqlContext.cacheTable("tableName"). Is there any explanation for this? I am not caching the RDD prior to creating the DataFrame. Using PySpark on Spark