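I think a global temporary view may help in your case: https://spark.apache.org/docs/latest/sql-programming-guide.html#global-temporary-view
Note that a global temporary view is shared across the sessions of a single Spark application, so it only helps if both users' sessions run inside the same application.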
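
A minimal sketch of what I mean, assuming Spark 2.1+ and that both users get their sessions from the same application. The view name `shared_records`, the app name, the local master, and the `spark.range` stand-in for your database query are all made up for illustration:

```scala
import org.apache.spark.sql.SparkSession

object SharedViewSketch {
  def main(args: Array[String]): Unit = {
    // First user's session (local master only for the sake of the sketch).
    val spark = SparkSession.builder()
      .appName("shared-view-sketch")
      .master("local[*]")
      .getOrCreate()

    // Stand-in for the 1 million records fetched from the database.
    val df = spark.range(0, 1000000).toDF("id")

    // Keep the data in cluster memory and publish it under a global name.
    // "shared_records" is a hypothetical view name.
    df.cache()
    df.createGlobalTempView("shared_records")

    // Second user's session inside the same Spark application.
    val otherSession = spark.newSession()

    // Global temporary views live in the reserved `global_temp` database.
    val reused = otherSession.sql("SELECT * FROM global_temp.shared_records")
    println(reused.count())

    spark.stop()
  }
}
```

This does not match incoming queries against loaded dataframes automatically; the second user has to know the view name. It also will not work across separate Spark applications, since the view goes away when the application that created it terminates.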
Thanks,
Muthu

On Fri, Jan 20, 2017 at 6:27 AM, ☼ R Nair (रविशंकर नायर) <ravishankar.n...@gmail.com> wrote:

> Dear all,
>
> Here is a requirement I am thinking of implementing in Spark core. Please
> let me know if this is possible, and kindly provide your thoughts.
>
> A user executes a query to fetch 1 million records from, let's say, a
> database. We let the user store this as a dataframe, partitioned across
> the cluster.
>
> Another user executes the same query from another session. Is there
> any way that we can let the second user reuse the dataframe created by the
> first user?
>
> Can we have a master dataframe (or RDD) which stores the information about
> the current dataframes loaded and matches against any queries that are
> coming from other users?
>
> In this way, we will have a wonderful system which never allows the same
> query to be executed and loaded again into the cluster memory.
>
> Best, Ravion