Re: [R] Regular Expression help

Wacek Kusnierczyk Sat, 08 Nov 2008 13:29:55 -0800

Wacek Kusnierczyk wrote:
> Gabor Grothendieck wrote:
>   
>> Here are a few more solutions.  x is the input vector
>> of character strings.
>>
>> The first is a slightly shorter version of one of Wacek's.
>> The next three all create an anonymous grouping variable
>> (using sub, substr/gsub and strapply respectively)
>> whose components are "p" and "q" and then tapply
>> is used to separate out the corresponding components
>> of x according to the grouping:
>>
>> sapply(c(p = "^[^pq]*p", q = "^[^pq]*q"), grep, x = x, value = TRUE)
>>
>> tapply(x, sub("^[^pq]*(.).*", "\\1", x), c)
>>
>> tapply(x, substr(gsub("[^pq]", "", x), 1, 1), c)
>>
>> library(gsubfn)
>> tapply(x, strapply(x, "^[^pq]*(.)", simplify = c), c)
>>   
>>     
>
> wow!  cool stuff.  if you're interested in comparing their efficiency,
> source the attached script.
>
>


using lapply with side-effecting code should probably be considered bad
practice, so lapply is replaced with a for loop in the script below.  sorry.
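
a minimal sketch of the difference, assuming a toy list of functions (the
names toy and results are made up for illustration and are not part of the
script):

```r
# toy stand-ins for the test functions; hypothetical, not from the script
toy = list(upper = toupper, lower = tolower)

# lapply used only for its side effect (printing): the list it builds
# is thrown away, which is what makes this questionable style
invisible(lapply(names(toy), function(name) cat(name, ":\n", sep="")))

# the same side effect with an explicit for loop
for (name in names(toy))
        cat(name, ":\n", sep="")

# lapply used for its value, which is what it is meant for
results = lapply(toy, function(f) f("Pq"))
```

both printing variants write the same lines; only the for loop makes it
obvious that nothing is being collected.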

vQ

generate = function(n, m) 
        replicate(n, paste(sample(letters, m, replace=TRUE), collapse=""))

tests = list(

        wacek =
        function(data) {
                p = grep("^[^pq]*p", data)
                list(p=data[p], q=data[-p])
        },
        
        gabor1 =
        function(data) 
                sapply(c(p="^[^pq]*p", q="^[^pq]*q"), grep, x=data, value=TRUE),
                
        gabor2 =
        function(data)
                tapply(data, sub("^[^pq]*(.).*", "\\1", data), c),
        
        gabor3 =
        function(data)
                tapply(data, substr(gsub("[^pq]", "", data), 1, 1), c),
        
        gabor4 =
        { library(gsubfn); function(data)
                tapply(data, strapply(data, "^[^pq]*(.)", simplify=c), c) }
)
        
data = generate(1000,10)
for (name in names(tests)) {
        cat(name, ":\n", sep="")
        print(system.time(replicate(30,tests[[name]](data)))) }
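
a small sanity check of the grouping idea behind gabor2 and gabor3, using
base R only.  the input vector x below is made up for illustration, and every
string in it contains a p or a q:

```r
# made-up input; each string contains at least one of p, q
x = c("apple", "quip", "xpq", "aqp")

# gabor2-style key: the first character that is p or q
key2 = sub("^[^pq]*(.).*", "\\1", x)

# gabor3-style key: drop everything but p and q, keep the first character
key3 = substr(gsub("[^pq]", "", x), 1, 1)

# group x by the key; the result is indexed by "p" and "q"
groups = tapply(x, key2, c)
print(groups[["p"]])
print(groups[["q"]])
```

on random data, some strings contain neither p nor q, and the solutions
treat those differently (wacek, for instance, puts them into q because it
takes data[-p]), so the timings compare slightly different computations.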

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
