I am currently working on a project in which we are dealing with terabytes of CSV files generated every year. The main source of data is interval meter data; the others are customer, tariff, and site information. We may load data going back to 2010. There is one CSV file (at least 10 GB) per year, which can be joined with the customer, tariff, and site information.

For visualization we may use Power BI or Grafana, so we will be running interactive queries through these tools, for example displaying interval data at 30-minute resolution.

I know we can use Spark to process and clean the data and create a DataFrame or Dataset out of it, but since we are going to run interactive queries, the question is: where should the data land after the Spark job? There should be a landing stage for the data between Spark and Power BI, right?

Best Regards

.......................................................

Amin Mohebbi
PhD candidate in Software Engineering at University of Malaysia
Tel : +60 18 2040 017
E-Mail : tp025...@ex.apiit.edu.my
         amin_...@me.com