Thanks for the help so far. I tried caching, but the operation seems to be taking forever. Any tips on how I can speed it up?
Also, I am not sure a case class would work, since different files have different structures (I am parsing a 1 GB file right now, but there are a few other files that I also need to run this on).