[R] ideas about how to reduce RAM & improve speed in trying to use lapply(strsplit())

Matthew Keller Sun, 29 May 2011 17:11:45 -0700

Hi all,

I'm full of questions today :). Thanks in advance for your help!


Here's the problem:
x <- c('18x.6','12x.9','302x.3')

I want to get a vector holding the part of each element before the
period, i.e. c('18x','12x','302x')

This is easily done using this code:

unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1]))

So far so good. The problem is that x is a vector of length 132e6
(132 million elements). When I run the above code, it runs for more
than 30 minutes and uses more than 23 GB of RAM (no kidding!).
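
A small-scale sketch of where the memory may be going: strsplit() returns a list with one character vector per input element, so the intermediate object holds both halves of every string plus per-vector overhead. The names pieces and first below are illustrative and not part of the original code, and the sizes printed for three elements only hint at what happens at 132e6.

```r
x <- c('18x.6', '12x.9', '302x.3')

# strsplit() builds a list: one character vector (both halves) per element
pieces <- strsplit(x, ".", fixed = TRUE)
str(pieces)

# Compare the footprint of the input with that of the intermediate list
print(object.size(x))
print(object.size(pieces))

# Same extraction as the original one-liner, written with vapply()
# so the result type is declared up front
first <- vapply(pieces, function(p) p[1], character(1))
print(first)
```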

Does anyone have ideas about how to speed up the code above and (more
importantly) reduce the RAM footprint? I'd prefer not to change the
file on disk using, e.g., awk, but I will do that as a last resort.
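
One direction that might be worth trying, offered as a sketch and not as a measured fix: sub() with a regular expression can drop everything from the first period onward without building the intermediate list. The name prefix is illustrative, and how this behaves at 132e6 elements has not been tested here.

```r
x <- c('18x.6', '12x.9', '302x.3')

# "\\..*$" matches a literal period and everything after it;
# replacing that match with "" leaves only the leading part.
# Elements with no period are returned unchanged.
prefix <- sub("\\..*$", "", x)
print(prefix)
```

Because sub() is vectorized, the only large object it creates is the result vector itself, so peak memory should stay much closer to the size of x.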

Best

Matt

-- 
Matthew C Keller
Asst. Professor of Psychology
University of Colorado at Boulder
www.matthewckeller.com

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
