Hi Stefan! :-)

From tools where negative lookbehind can involve variable lengths, one
would think this would work:

grep("(?<!(?:\\1|^))(.)\\1{1,}$", vec, perl=T)

But then R doesn't like it that much ...

It's really the PCRE library that doesn't like your regexp, not R. The problem is that negative lookbehind is only possible with a fixed-length expression, and since \1 may hold an arbitrary string, the PCRE library can't be sure it's just a single character. I'm also surprised that you're allowed to use \1 before defining it.
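
For contrast, here is a minimal sketch of a lookbehind that PCRE does accept, because the expression inside it has a fixed length of one character. The test vector x is made up for illustration and has nothing to do with your data:

```r
# Illustrative input only (not from the original question).
x <- c("ab", "cb")

# Fixed-length negative lookbehind: match a "b" that is not
# immediately preceded by an "a". Requires perl = TRUE.
res <- grepl("(?<!a)b", x, perl = TRUE)
print(res)
```

The first element should fail to match (its "b" follows an "a") and the second should match.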

But is there a one-line grep thingy to do this?

I can't think of a one-liner, but here is a three-line solution that you can easily wrap in a small function:

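vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")
idx.1 <- grep("(.)\\1$", vec)     # strings ending in a repeated character
idx.2 <- grep("^(.)\\1*$", vec)   # strings made of one character throughout
vec[setdiff(idx.1, idx.2)]        # ends in a repeat, but is not all one character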
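
Wrapped up as a function, this might look like the sketch below. The name final.repeat is just a placeholder I made up, and I'm assuming you want the matching strings back rather than their indices:

```r
# Hypothetical helper name; wraps the two grep() calls from the three-liner.
final.repeat <- function(x) {
  idx.1 <- grep("(.)\\1$", x)     # ends in a repeated character
  idx.2 <- grep("^(.)\\1*$", x)   # one character throughout
  x[setdiff(idx.1, idx.2)]        # keep the first set minus the second
}

vec <- c("aaaa", "baaa", "bbaa", "bbba", "baamm", "aa")
res <- final.repeat(vec)
print(res)
```

For the example vector this should return "baaa", "bbaa" and "baamm". Return setdiff(idx.1, idx.2) instead if you'd rather have the positions.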

Cheers,
Stefan



--
The wonders of Googleology (episode 1)

"from collectibles to cars"
        84,700,000 -- Google
        9,443,672 -- Google N-grams (Web 1T5)
        1 -- ukWaC

[ [EMAIL PROTECTED] | http://purl.org/stefan.evert ]

