On 23/07/2010 12:10 PM, babyfoxlo...@sina.com wrote:
Hi there,
Sorry to bother those who are not interested in this problem.
I'm dealing with a large data set, a file of more than 6 GB, and running
regression tests on those data. I was wondering whether there are any more
efficient ways to read those data than just using read.table(). BTW, I'm
using a 64-bit desktop and a 64-bit version of R, and the desktop has enough
memory for my purposes.
Thanks.
You probably won't get much faster than read.table with all of the
colClasses specified. It will be a lot slower if you leave that at the
default NA setting, because then R needs to figure out the types by
reading them as character and examining all the values. If the file is
very consistently structured (e.g. the same number of characters in
every value in every row) you might be able to write a C function to
read it faster, but I'd guess the time spent writing that would be a lot
more than the time saved.
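For example, something like the following (a minimal sketch: the file name,
separator, and column types are placeholders, to be replaced with whatever
your data actually contain):

  ## Supplying colClasses up front lets read.table skip the pass where it
  ## guesses each column's type; comment.char = "" and nrows (if you know
  ## the approximate row count) are documented read.table arguments that
  ## can also help with speed and memory use.
  dat <- read.table("bigdata.txt", header = TRUE, sep = "\t",
                    colClasses = c("integer", "numeric", "numeric"),
                    comment.char = "", nrows = 1000000)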
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.