By "process", do you mean two separate Spark jobs, or two stages within the same Spark code?

Thanks
Subash

On Wed, 28 Aug 2019 at 19:06, <em...@yeikel.com> wrote:

> Take a look at this article
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-rdd-caching.html
>
> *From:* Tzahi File <tzahi.f...@ironsrc.com>
> *Sent:* Wednesday, August 28, 2019 5:18 AM
> *To:* user <user@spark.apache.org>
> *Subject:* Caching tables in spark
>
> Hi,
>
> Looking for your knowledge with some question.
>
> I have 2 different processes that read from the same raw data table
> (around 1.5 TB).
>
> Is there a way to read this data once and cache it somehow and to use this
> data in both processes?
>
> Thanks
>
> --
> *Tzahi File*
> Data Engineer