This method in CacheManager:

    private[sql] def lookupCachedData(plan: LogicalPlan): Option[CachedData] = readLock {
      cachedData.find(cd => plan.sameResult(cd.plan))
    }

led me to the following in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LogicalPlan.scala:

    def sameResult(plan: LogicalPlan): Boolean = {

There is a detailed comment above this method which should give you some idea.

Cheers

On Fri, Dec 18, 2015 at 9:21 AM, Sahil Sareen <sareen...@gmail.com> wrote:
> Thanks Ted!
>
> Yes, the schema might be different or the same.
> What would be the answer for each situation?
>
> On Fri, Dec 18, 2015 at 6:02 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> CacheManager#cacheQuery() is called where:
>>
>>     * Caches the data produced by the logical representation of the given
>>     * [[Queryable]].
>>     ...
>>     val planToCache = query.queryExecution.analyzed
>>     if (lookupCachedData(planToCache).nonEmpty) {
>>
>> Is the schema for dfNew different from that of dfOld?
>>
>> Cheers
>>
>> On Fri, Dec 18, 2015 at 3:33 AM, Sahil Sareen <sareen...@gmail.com> wrote:
>>
>>> Spark 1.5.2
>>>
>>>     dfOld.registerTempTable("oldTableName")
>>>     sqlContext.cacheTable("oldTableName")
>>>     // ...
>>>     // do something
>>>     // ...
>>>     dfNew.registerTempTable("oldTableName")
>>>     sqlContext.cacheTable("oldTableName")
>>>
>>> Now when I use the "oldTableName" table I do get the latest contents
>>> from dfNew, but do the contents of dfOld get removed from memory?
>>>
>>> Or is the right usage to do this:
>>>
>>>     dfOld.registerTempTable("oldTableName")
>>>     sqlContext.cacheTable("oldTableName")
>>>     // ...
>>>     // do something
>>>     // ...
>>>     dfNew.registerTempTable("oldTableName")
>>>     sqlContext.uncacheTable("oldTableName") // <== uncache the old contents first
>>>     sqlContext.cacheTable("oldTableName")
>>>
>>> -Sahil
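The lookup-by-plan behavior being discussed can be sketched without Spark. The following is a minimal, self-contained Scala model (not Spark's actual code; `SimplePlan`, `SimpleCacheManager`, and the `sameResult` stand-in are illustrative simplifications) showing why re-registering a different DataFrame under the same temp table name does not, by itself, evict the old cache entry: `CacheManager` matches cached entries by logical-plan equivalence, not by table name.

```scala
// Illustrative stand-in for LogicalPlan: two plans "produce the same
// result" here iff they read the same source and the same columns.
case class SimplePlan(output: Seq[String], source: String) {
  def sameResult(other: SimplePlan): Boolean =
    source == other.source && output == other.output
}

case class CachedData(plan: SimplePlan, tag: String)

// Simplified model of CacheManager: a list of cached entries,
// looked up by plan equivalence rather than by name.
class SimpleCacheManager {
  private var cachedData = List.empty[CachedData]

  def cacheQuery(plan: SimplePlan, tag: String): Unit =
    if (lookupCachedData(plan).isEmpty) cachedData ::= CachedData(plan, tag)

  def lookupCachedData(plan: SimplePlan): Option[CachedData] =
    cachedData.find(cd => plan.sameResult(cd.plan))
}

val mgr = new SimpleCacheManager
val dfOld = SimplePlan(Seq("a", "b"), "old-source") // cached as "oldTableName"
val dfNew = SimplePlan(Seq("a", "b"), "new-source") // re-registered under the same name

mgr.cacheQuery(dfOld, "old")
mgr.cacheQuery(dfNew, "new")

// The plans differ, so both entries coexist in the cache: the old
// data is not freed until it is explicitly uncached.
println(mgr.lookupCachedData(dfOld).map(_.tag)) // Some(old)
println(mgr.lookupCachedData(dfNew).map(_.tag)) // Some(new)
```

This is why the thread's second pattern (calling `uncacheTable` on the old name before caching the new contents) is the safe one: cache entries are keyed by the analyzed plan, so replacing the temp table registration alone leaves the old plan's cached data in memory.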