Hi all,

I am using caching in my code. I have DataFrames like this:

val DF1 = read csv
val DF2 = DF1.groupBy().agg().select(.....)
val DF3 = read csv .join(DF1).join(DF2)
DF3.save

If I do not cache DF1 or DF2, the job takes longer. But I am running only one action, so why do I need to cache?

Thanks,
Amit
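
P.S. In case it helps, here is a fuller sketch of the job in Scala. The file paths, the column names (key, value, total), the Parquet output, and the local master are placeholders I made up for illustration, not my real ones. I am assuming the second CSV has a key column and no column named value or total. Note that DF1 feeds both the aggregation (DF2) and the first join, so it appears twice in the plan even though there is only one action.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.sum

object CacheSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("cache-sketch")
      .master("local[*]") // placeholder: local mode, for illustration only
      .getOrCreate()

    // Placeholder path; assumed columns: key, value.
    // cache() marks DF1 for reuse; it is materialized when the action runs.
    val df1 = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/tmp/input1.csv")
      .cache()

    // First use of DF1: aggregate per key.
    val df2 = df1.groupBy("key")
      .agg(sum("value").as("total"))
      .select("key", "total")

    // Second use of DF1: join it (and DF2) onto another CSV.
    // Placeholder path; assumed to have a key column.
    val df3 = spark.read
      .option("header", "true")
      .option("inferSchema", "true")
      .csv("/tmp/input2.csv")
      .join(df1, Seq("key"))
      .join(df2, Seq("key"))

    // The single action in the job.
    df3.write.mode("overwrite").parquet("/tmp/output")

    spark.stop()
  }
}
```

Removing the .cache() call gives the slower variant I described above.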