Running R 3.1.1 on Windows 7, I want to flag as a case any record in a data frame that contains any of several keywords in any of several variables.
Example:

# create a dataframe with 4 variables and 10 records
v2 <- c("white bird", "blue bird", "green turtle", "quick brown fox",
        "big black dog", "waffle the hamster", "benny likes food a lot",
        "hello world", "yellow giraffe with a long neck", "black bear")
v3 <- c("harry potter", "hermione grainger", "ronald weasley",
        "ginny weasley", "dudley dursley", "red sparks", "blue sparks",
        "white dress robes", "gandalf the white", "gandalf the grey")
zz <- data.frame(v1=rnorm(10), v2=v2, v3=v3, v4=rpois(10, lambda=2),
                 stringsAsFactors=FALSE)
str(zz)
zz

# here are the keywords
alarm.words <- c("red", "green", "turtle", "gandalf")

# For each row/record, I want to test whether the string in v2 or
# the string in v3 contains any of the strings in alarm.words. And
# then if so, set zz$v5=TRUE for that record.

# I'm thinking the str_detect function in the stringr package ought to
# be able to help, perhaps with some use of apply over the rows, but I
# obviously misunderstand something about how str_detect works

library(stringr)

str_detect(zz[,2:3], alarm.words)    # error: the target of the search
                                     # must be a vector, not multiple
                                     # columns

str_detect(zz[1:4,2:3], alarm.words) # same error

str_detect(zz[,2], alarm.words)      # error, length of alarm.words
                                     # is less than the number of
                                     # rows I am using for the
                                     # comparison

str_detect(zz[1:4,2], alarm.words)   # works as hoped when length(alarm.words)
                                     # confining nrows
                                     # to the length of alarm.words

str_detect(zz, alarm.words)          # obviously not right

# maybe I need apply() ?
my.f <- function(x){str_detect(x, alarm.words)}

apply(zz[,2], 1, my.f)     # again, a mismatch in lengths
                           # between alarm.words and that
                           # in which I am searching for
                           # matching strings

apply(zz, 2, my.f)         # now I'm getting somewhere
apply(zz[1:4,], 2, my.f)   # but still only works with 4
                           # rows of the dataframe

# perhaps %in% could do the job?

Appreciate any advice.
--Chris Ryan
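One possible approach, offered as a minimal sketch rather than a definitive answer: collapse the keywords into a single regular expression with "|" and test each text column against it. The sketch below uses base R's grepl(), so stringr is not needed. The names alarm.pattern and hits are introduced here for illustration and are not from the original post; v1 and v4 are left out because they play no part in the matching.

```r
# Text columns and keywords as in the example above
zz <- data.frame(
  v2 = c("white bird", "blue bird", "green turtle", "quick brown fox",
         "big black dog", "waffle the hamster", "benny likes food a lot",
         "hello world", "yellow giraffe with a long neck", "black bear"),
  v3 = c("harry potter", "hermione grainger", "ronald weasley",
         "ginny weasley", "dudley dursley", "red sparks", "blue sparks",
         "white dress robes", "gandalf the white", "gandalf the grey"),
  stringsAsFactors = FALSE
)
alarm.words <- c("red", "green", "turtle", "gandalf")

# Build one alternation pattern: "red|green|turtle|gandalf"
# (hypothetical helper name; assumes the keywords contain no regex metacharacters)
alarm.pattern <- paste(alarm.words, collapse = "|")

# grepl() is vectorised over the strings being searched, so apply it to each
# column in turn; the result is a logical matrix with one column per variable
hits <- sapply(zz[, c("v2", "v3")], grepl, pattern = alarm.pattern)

# A record is a case if any of its searched columns matched
zz$v5 <- rowSums(hits) > 0

which(zz$v5)
```

Note that this matches substrings, so "red" would also flag a string such as "hundred"; wrapping each keyword in "\\b" word boundaries is one way to restrict it to whole words, if that is what is wanted. The same collapsed pattern should also work as the pattern argument of str_detect(), which expects the strings to search as its first argument and a pattern as its second.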