Hi All,

I am currently loading 3 billion events (about 20 GB) into my algorithm for 
processing. I read this data from a Postgres-XL cluster (1 coordinator + 4 
datanodes; each node has 8 CPUs, 61 GB RAM, and 200 GB disk), about 1 TB of 
space in total.

Loading the data takes almost 5 days before I can start running my algorithms.

The database is clearly the bottleneck right now. Can you please suggest the 
right technology for loading the data?

Should I move away from Postgres-XL? Which option is most suitable for loading 
data efficiently into R: a database, a flat file, or a Parquet file?
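For context, this is roughly the Parquet route I am considering, assuming the 
'arrow' package (the data frame and file name below are illustrative, not my 
real schema):

```r
# Minimal sketch of the Parquet option, assuming the 'arrow' package
# is installed. Column names and sizes are illustrative only.
library(arrow)

# Stand-in for the events table pulled once from the database.
events <- data.frame(id = 1:1e6, value = rnorm(1e6))

# One-time export: write the events to a columnar Parquet file.
write_parquet(events, "events.parquet")

# Subsequent loads read the file directly, rather than pulling
# billions of rows over a database connection each run.
events2 <- read_parquet("events.parquet")
```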

I look forward to your responses.

Thanks
Prerna

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.