Wacek Kusnierczyk wrote:
> Gabor Grothendieck wrote:
>
>> Here are a few more solutions. x is the input vector
>> of character strings.
>>
>> The first is a slightly shorter version of one of Wacek's.
>> The next three all create an anonymous grouping variable
>> (using sub, substr/gsub and strapply respectively)
>> whose components are "p" and "q" and then tapply
>> is used to separate out the corresponding components
>> of x according to the grouping:
>>
>> sapply(c(p = "^[^pq]*p", q = "^[^pq]*q"), grep, x = x, value = TRUE)
>>
>> tapply(x, sub("^[^pq]*(.).*", "\\1", x), c)
>>
>> tapply(x, substr(gsub("[^pq]", "", x), 1, 1), c)
>>
>> library(gsubfn)
>> tapply(x, strapply(x, "^[^pq]*(.)", simplify = c), c)
>>
>
> wow! cool stuff. if you're interested in comparing their efficiency,
> source the attached script.
>
using lapply for code that is run only for its side effects should
probably be considered bad practice, so i replaced lapply with a for
loop in the script. sorry.

vQ

# generate n random lowercase strings, each of length m
generate = function(n, m)
   replicate(n, paste(sample(letters, m, replace=TRUE), collapse=""))

tests = list(
   wacek = function(data) {
      # strings whose first p-or-q is a p; everything else goes to q
      p = grep("^[^pq]*p", data)
      list(p=data[p], q=data[-p]) },
   gabor1 = function(data)
      sapply(c(p="^[^pq]*p", q="^[^pq]*q"), grep, x=data, value=TRUE),
   # pattern as in Gabor's post (no literal p before the group)
   gabor2 = function(data)
      tapply(data, sub("^[^pq]*(.).*", "\\1", data), c),
   gabor3 = function(data)
      tapply(data, substr(gsub("[^pq]", "", data), 1, 1), c),
   gabor4 = {
      library(gsubfn)
      function(data)
         tapply(data, strapply(data, "^[^pq]*(.)", simplify=c), c) })

data = generate(1000,10)

# time 30 runs of each solution on the same data
for (name in names(tests)) {
   cat(name, ":\n", sep="")
   print(system.time(replicate(30,tests[[name]](data)))) }

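Before timing, it may help to check on a small input that the index-based split and the grouping-variable split give the same answer. The sketch below is only an illustration: the vector x is a made-up toy input (not from the thread) in which every string contains at least one "p" or "q", and grepl with a logical index is used in place of grep with negative indexing so that the split still behaves when nothing matches.

```r
# hypothetical toy input: every string contains at least one "p" or "q"
x <- c("apq", "bqp", "cp", "dq", "qqp")

# index-based split: TRUE where the first p-or-q in the string is a "p"
is_p <- grepl("^[^pq]*p", x)
by_index <- list(p = x[is_p], q = x[!is_p])

# grouping-variable split: sub() keeps only the first character that
# follows the leading run of non-p/q characters, then tapply groups by it
grp <- sub("^[^pq]*(.).*", "\\1", x)
by_group <- tapply(x, grp, c)

print(grp)
print(identical(by_index$p, by_group[["p"]]))
print(identical(by_index$q, by_group[["q"]]))
```

On inputs where some strings contain neither letter the two approaches can differ: the index-based split puts such strings under q, while the sub-based grouping variable takes whatever character the pattern happens to capture.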
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.