Hi everyone,

I have two Spark jobs inside a single Spark application that read from the
same input file.
They are executed in two threads.

Right now, I cache the input file into memory before executing these two
jobs.
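For context, a minimal sketch of that cache-then-fork pattern (the file path, job bodies, and object name here are placeholders, not from the original setup):

```scala
import org.apache.spark.sql.SparkSession

object SharedInputSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shared-input").getOrCreate()

    // Read the file once and mark it for caching; "input.txt" is a placeholder path.
    val input = spark.read.textFile("input.txt").cache()

    // Materialize the cache up front so neither thread triggers a full re-read.
    input.count()

    // Two placeholder jobs sharing the cached input, one per thread.
    val t1 = new Thread(() => {
      val n = input.filter(_.contains("error")).count()
      println(s"job 1: $n")
    })
    val t2 = new Thread(() => {
      val n = input.distinct().count()
      println(s"job 2: $n")
    })
    t1.start(); t2.start()
    t1.join(); t2.join()

    spark.stop()
  }
}
```

Without the eager `count()`, whichever thread's job runs first would scan the file to populate the cache, and the other may partially re-read if it starts before caching finishes.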

Is there another way to share the same input between them with only one read?
I know there is something called Multiple Query Optimization, but I don't
know whether it is applicable to Spark (or Spark SQL).

Thank you.

Quang-Nhat
