On 23/07/2010 12:10 PM, babyfoxlo...@sina.com wrote:
Hi there,
Sorry to bother those who are not interested in this problem.
I'm dealing with a large data set, a file of more than 6 GB, and running
regression tests on those data. I was wondering whether there are any more
efficient ways to read those data than just using read.table(). BTW, I'm
using a 64-bit desktop and a 64-bit version of R, and the desktop has enough
memory for my purposes.
Thanks.
You probably won't get much faster than read.table with all of the
colClasses specified. It will be a lot slower if you leave that at the
default NA setting, because then R needs to figure out the types by
reading them as character and examining all the values. If the file is
very consistently structured (e.g. the same number of characters in
every value in every row) you might be able to write a C function to
read it faster, but I'd guess the time spent writing that would be a lot
more than the time saved.
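For example, something like the following (a minimal sketch: the file name,
separator, and column types are placeholders, to be replaced with whatever
your data actually contain):

  ## Supplying colClasses up front lets read.table skip the pass where it
  ## guesses each column's type; comment.char = "" and nrows (if you know
  ## the approximate row count) are documented read.table arguments that
  ## can also help with speed and memory use.
  dat <- read.table("bigdata.txt", header = TRUE, sep = "\t",
                    colClasses = c("integer", "numeric", "numeric"),
                    comment.char = "", nrows = 1000000)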
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.