Are all the csv files in the same directory?

Mohammed
Author: Big Data Analytics with Spark
<http://www.amazon.com/Big-Data-Analytics-Spark-Practitioners/dp/1484209656/>

From: saif.a.ell...@wellsfargo.com [mailto:saif.a.ell...@wellsfargo.com]
Sent: Monday, February 22, 2016 7:25 AM
To: user@spark.apache.org
Subject: Can we load csv partitioned data into one DF?

Hello all, I am facing a silly data question.

If I have 100+ CSV files that are all part of the same dataset, but each file
covers, for example, one year of a timeframe column (i.e. the data is
partitioned by year), what would you suggest instead of loading all those
files and joining them?

The final target would be Parquet. Is it possible, for example, to load the
files, store them as Parquet, and then read the Parquet back and treat it all
as one DataFrame?

Thanks for any suggestions,
Saif
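
For readers of this thread, here is a minimal sketch of the load-then-store
approach asked about above. It is not taken from either message. It assumes
Spark 2.0 or later, where spark.read.csv is built in (on Spark 1.6, current
when this thread was written, the separate spark-csv package would be needed),
that all the CSV files sit in one directory, and that they share the same
header and columns. The function names csv_glob and csvs_to_parquet are
hypothetical, as are any paths passed to them.

```python
def csv_glob(directory, pattern="*.csv"):
    """Build a glob path that matches every CSV file in one directory."""
    # Assumption: forward-slash paths, as used by HDFS and local POSIX paths.
    return directory.rstrip("/") + "/" + pattern


def csvs_to_parquet(spark, csv_dir, parquet_dir):
    """Read all CSVs in csv_dir as one DataFrame, write Parquet, read it back.

    `spark` is an existing SparkSession supplied by the caller.
    """
    # One read over a glob returns a single DataFrame covering every
    # matching file, so no per-file load and join/union step is needed.
    # Assumption: every file has a header row and the same columns.
    df = spark.read.csv(csv_glob(csv_dir), header=True, inferSchema=True)

    # Write the combined data as Parquet, replacing any earlier output.
    df.write.mode("overwrite").parquet(parquet_dir)

    # Reading the Parquet directory back again yields one DataFrame.
    return spark.read.parquet(parquet_dir)
```

With a SparkSession named spark, a call might look like
csvs_to_parquet(spark, "/data/csv", "/data/parquet"), where both paths are
placeholders. If the year column should also drive the on-disk layout,
df.write.partitionBy(...) with that column's name is one option to look at.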
