Hi everyone, I have two Spark jobs inside a single Spark application that read from the same input file. They are executed in two threads.
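My current setup looks roughly like this (a minimal sketch; the file path and the job bodies are placeholders, not my real code):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.storage.StorageLevel

object SharedInputExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("shared-input")
      .master("local[*]")
      .getOrCreate()

    // Read the input once and cache it; subsequent actions reuse
    // the in-memory copy instead of re-reading the file.
    val input = spark.read.textFile("input.txt") // placeholder path
    input.persist(StorageLevel.MEMORY_ONLY)

    // Spark's scheduler is thread-safe, so the two jobs can be
    // submitted concurrently from separate threads.
    val job1 = new Thread(() => {
      val words = input.rdd.flatMap(_.split("\\s+")).count()
      println(s"job 1: $words words")
    })
    val job2 = new Thread(() => {
      val lines = input.count()
      println(s"job 2: $lines lines")
    })
    job1.start(); job2.start()
    job1.join(); job2.join()

    input.unpersist()
    spark.stop()
  }
}
```

One caveat I am aware of: the cache is only populated by the first action that runs, so if both threads start before it is materialized, some partitions may still be computed from the file more than once.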
Right now, I cache the input file in memory before executing the two jobs. Is there another way to share the same input with only a single read? I know there is a technique called Multiple Query Optimization, but I don't know whether it is applicable to Spark (or Spark SQL). Thank you. Quang-Nhat