I'm not 100% sure what you want to do, but how about caching the whole dataset first and then querying it? For example: yourRdd.cache().count()
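To make that concrete: cache() is lazy, so nothing is actually loaded until an action scans the data. A minimal sketch of forcing full materialization up front (the app name, master, and input path are placeholders for illustration):

    import org.apache.spark.{SparkConf, SparkContext}

    // Placeholder setup; in a real job the SparkContext and input path
    // would come from your application.
    val sc = new SparkContext(
      new SparkConf().setAppName("cache-demo").setMaster("local[*]"))
    val yourRdd = sc.textFile("hdfs:///path/to/input")

    // cache() is lazy: it only marks the RDD for caching.
    yourRdd.cache()

    // An action that touches every partition computes the RDD once and
    // fills the cache, so later queries read from memory instead of
    // recomputing from the source.
    yourRdd.count()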
On Fri, Mar 25, 2016 at 12:22 AM, Daniel Imberman <daniel.imber...@gmail.com> wrote:

> Hi Takeshi,
>
> Thank you for getting back to me. If this is not possible, then perhaps
> you can help me with the root problem that led me to ask this question.
>
> Basically I have a job where I'm loading/persisting an RDD and running
> queries against it. The problem I'm having is that even though there is
> plenty of space in memory, the RDD is not fully persisting. Once I run
> multiple queries against it the RDD fully persists, but this means that
> the first 4 or 5 queries I run are extremely slow.
>
> Is there any way I can make sure that the entire RDD ends up in memory
> the first time I load it?
>
> Thank you
>
> On Thu, Mar 24, 2016 at 1:21 AM Takeshi Yamamuro <linguin....@gmail.com> wrote:
>
>> just re-sent,
>>
>> ---------- Forwarded message ----------
>> From: Takeshi Yamamuro <linguin....@gmail.com>
>> Date: Thu, Mar 24, 2016 at 5:19 PM
>> Subject: Re: Forcing data from disk to memory
>> To: Daniel Imberman <daniel.imber...@gmail.com>
>>
>> Hi,
>>
>> There is no direct way to do this; you need to unpersist the cached
>> data, then re-cache it as MEMORY_ONLY.
>>
>> // maropu
>>
>> On Thu, Mar 24, 2016 at 8:22 AM, Daniel Imberman <daniel.imber...@gmail.com> wrote:
>>
>>> Hi all,
>>>
>>> I have a question about persistence. Let's say I have an RDD that's
>>> persisted MEMORY_AND_DISK, and I know that I now have enough memory
>>> cleared up that I could fit the on-disk data into memory. Is it
>>> possible to tell Spark to re-evaluate the RDD's storage and move the
>>> partitions on disk into memory?
>>>
>>> Thank you
>>
>> --
>> ---
>> Takeshi Yamamuro

--
---
Takeshi Yamamuro
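P.S. A rough sketch of the unpersist-then-re-cache step mentioned above, reusing the same placeholder yourRdd. The unpersist has to come first, because Spark will not change the storage level of an RDD that already has one assigned:

    import org.apache.spark.storage.StorageLevel

    // Drop the existing MEMORY_AND_DISK copy; blocking = true waits
    // until the cached blocks are actually removed.
    yourRdd.unpersist(blocking = true)

    // Re-register the RDD as memory-only, then force it to
    // re-materialize with an action.
    yourRdd.persist(StorageLevel.MEMORY_ONLY)
    yourRdd.count()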