Not a new approach, but here is some benchmark data (adding perl=TRUE speeds up Jim's suggestion):

> x <- c('18x.6','12x.9','302x.3')
> y <- rep(x,100000)
> system.time(temp <- unlist(lapply(strsplit(y,".",fixed=TRUE),function(x) x[1])))
   user  system elapsed
  1.203   0.018   1.222
> system.time(temp2 <- gsub("^(.*?)\\..*$","\\1",y, perl=TRUE))
   user  system elapsed
  0.176   0.001   0.176
> identical(temp2, temp)
[1] TRUE
> system.time(temp3 <- gsub("^(.*)\\..*", '\\1', y))
   user  system elapsed
  0.292   0.001   0.291
> identical(temp3, temp)
[1] TRUE
> system.time(temp3 <- gsub("^(.*)\\..*", '\\1', y, perl=TRUE))
   user  system elapsed
  0.160   0.001   0.161

On 5/29/11 7:40 PM, "jim holtman" <jholt...@gmail.com> wrote:

>Try this approach:
>
>> x <- c('18x.6','12x.9','302x.3')
>> gsub("^(.*)\\..*", '\\1', x)
>[1] "18x" "12x" "302x"
>
>
>On Sun, May 29, 2011 at 8:10 PM, Matthew Keller <mckellerc...@gmail.com>
>wrote:
>> hi all,
>>
>> I'm full of questions today :). Thanks in advance for your help!
>>
>> Here's the problem:
>> x <- c('18x.6','12x.9','302x.3')
>>
>> I want to get a vector that is c('18x','12x','302x')
>>
>> This is easily done using this code:
>>
>> unlist(lapply(strsplit(x,".",fixed=TRUE),function(x) x[1]))
>>
>> So far so good. The problem is that x is a vector of length 132e6.
>> When I run the above code, it runs for > 30 minutes, and it takes > 23
>> Gb RAM (no kidding!).
>>
>> Does anyone have ideas about how to speed up the code above and (more
>> importantly) reduce the RAM footprint? I'd prefer not to change the
>> file on disk using, e.g., awk, but I will do that as a last resort.
>>
>> Best
>>
>> Matt
>>
>> --
>> Matthew C Keller
>> Asst. Professor of Psychology
>> University of Colorado at Boulder
>> www.matthewckeller.com
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>--
>Jim Holtman
>Data Munger Guru
>
>What is the problem that you are trying to solve?
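
P.S. One caveat on the identical() checks in the benchmark: they pass because every test string contains exactly one dot. If some elements can contain several dots, the greedy and lazy patterns are probably not interchangeable. A minimal sketch of the difference, where the vector z is made up for illustration and is not from the thread:

```r
# Hypothetical input with more than one dot in the first element
z <- c("18x.6.1", "12x.9")

# strsplit keeps everything before the FIRST dot
first_split <- unlist(lapply(strsplit(z, ".", fixed = TRUE), function(p) p[1]))

# Lazy (.*?) with perl=TRUE also stops at the FIRST dot
first_lazy <- gsub("^(.*?)\\..*$", "\\1", z, perl = TRUE)

# Greedy (.*) only backtracks as far as the LAST dot
last_greedy <- gsub("^(.*)\\..*", "\\1", z)

print(first_split)   # "18x"   "12x"
print(first_lazy)    # "18x"   "12x"
print(last_greedy)   # "18x.6" "12x"
```

So if the goal is "everything before the first dot", the lazy perl=TRUE version matches the strsplit result, while the greedy version only does so when there is a single dot per element.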