According to the documentation, they should be exactly the same, but in my queries

dataFrame.cache()

results in much faster execution times vs doing

sqlContext.cacheTable("tableName")

Is there any explanation for this? I am not caching the RDD before
creating the DataFrame. I am using PySpark on Spark 1.5.2.

Kind regards,
George
