I'm relatively new to R, and I'm trying to find a decent solution to my
current dilemma.

I am currently trying to parse one-second-resolution data from seven months of
CSV files. This is over 10 GB of data, and I've run into memory issues loading
it all into a single dataset to be plotted. If possible, I'd really like to
keep both the one-second resolution and all 100 or so columns intact to make
things easier on myself.

The problem is that the machine running this script only has 8 GB of RAM. I've
had issues parsing the files with lapply combined with various CSV readers. So
far I've tried read.csv, readr::read_table, and data.table::fread, and only
fread seems to do any sort of memory management (fread tends to crash on me,
however). The basic approach I am using is as follows:

# Get the data
library(data.table)
files <- list.files(pattern = "\\.csv$")  # pattern is a regex, not a glob
set <- lapply(files, function(x) fread(x, header = TRUE, sep = ","))
# replace fread with something that can parse csv data

# Handle the data (Do my plotting down here)
...
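
A per-file loop like the one below is the kind of fallback I've been
considering, although I'd prefer to keep everything in a single dataset.
plot_chunk() is just a placeholder for my plotting code, not a real function:

library(data.table)

files <- list.files(pattern = "\\.csv$")
for (f in files) {
  dt <- fread(f, header = TRUE, sep = ",")
  plot_chunk(dt)  # placeholder for whatever plotting/summarising I do per file
  rm(dt)
  gc()            # try to release memory before reading the next file
}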

These processes work with smaller data sets, but in a worst-case scenario I
would like to be able to parse through a full year of data, which would be
around 20 GB.
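
If it comes to that, I understand fread can read just a subset of columns via
its select argument, which might keep the one-year case within 8 GB. The
column names below are made up; I would substitute the ones I actually plot:

library(data.table)

files <- list.files(pattern = "\\.csv$")
keep  <- c("timestamp", "sensor_a", "sensor_b")  # hypothetical column names
set   <- rbindlist(lapply(files, fread, select = keep))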

Thank you for your time,
Robert Dupuis
