What do you mean by "process"? Is it two separate Spark jobs, or two
stages within the same Spark code?

Thanks
Subash

On Wed, 28 Aug 2019 at 19:06, <em...@yeikel.com> wrote:

> Take a look at this article:
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/spark-rdd-caching.html
>
> *From:* Tzahi File <tzahi.f...@ironsrc.com>
> *Sent:* Wednesday, August 28, 2019 5:18 AM
> *To:* user <user@spark.apache.org>
> *Subject:* Caching tables in spark
>
> Hi,
>
> I would appreciate your help with a question.
>
> I have 2 different processes that read from the same raw data table
> (around 1.5 TB).
>
> Is there a way to read this data once, cache it, and use it in both
> processes?
>
> Thanks
>
> --
>
> *Tzahi File*
> Data Engineer
