According to the documentation, they should be exactly the same, but in my queries

dataFrame.cache()

results in much faster execution times vs doing

sqlContext.cacheTable("tableName")

Is there any explanation for this? I am not caching the RDD before
creating the DataFrame. I am using PySpark on Spark 1.5.2.

Kind regards,
George
