Hi all,

We have a workflow that pulls in data from CSV files. The original setup
of the workflow was to parse the data as it comes in (turning each line
into an array of fields) and then store it. This resulted in
out-of-memory errors with larger files (as a result of increased GC
pressure?).

It turns out that if the data is stored as a string first and then
parsed, the issue does not occur.
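
Roughly, the two variants look like this (a simplified sketch in Scala
against the Spark Streaming API; the path and the comma-split parsing
are placeholders for our actual code):

import org.apache.spark.SparkConf
import org.apache.spark.storage.StorageLevel
import org.apache.spark.streaming.{Seconds, StreamingContext}

object CsvIngest {
  def main(args: Array[String]): Unit = {
    val ssc = new StreamingContext(
      new SparkConf().setAppName("CsvIngest"), Seconds(10))

    // Each record arrives as one raw CSV line (a string).
    val lines = ssc.textFileStream("hdfs:///incoming/csv")  // placeholder path

    // Variant 1 (original, OOMs on larger files): parse each line into
    // an array of fields up front, then cache the parsed arrays.
    val parsedFirst = lines.map(_.split(","))
    parsedFirst.persist(StorageLevel.MEMORY_ONLY)

    // Variant 2 (workaround, no OOM): cache the raw strings first and
    // only parse into arrays later, when the fields are needed.
    lines.persist(StorageLevel.MEMORY_ONLY)
    val parsedLater = lines.map(_.split(","))

    parsedLater.count().print()

    ssc.start()
    ssc.awaitTermination()
  }
}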

Why is that?

Thanks,


