FWIW: I think Jim makes an excellent point -- regexes really aren't the right tool for this sort of thing (imho); exact matching is.
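A small sketch to make that concrete. It is not from the thread: the value "MLH10" is invented purely for illustration and is not in Kai's data, and the vector name `genes` is likewise just a placeholder. It compares exact matching via %in% with the regex alternation quoted further down:

```r
# Exact matching vs. regex matching on a toy vector.
# "MLH10" is a hypothetical value, added only to show where the two differ.
match_strings <- c("MLH1", "MSH2")
genes <- c("MLH1", "MSL1", "MSH2", "MLH10")

exact <- genes %in% match_strings      # TRUE only when the whole string is equal
regex <- grepl("MLH1|MSH2", genes)     # TRUE when the pattern occurs anywhere in the string

print(data.frame(genes, exact, regex))
```

Here `exact` is FALSE for "MLH10" while `regex` is TRUE, because an unanchored pattern matches substrings. One could anchor the pattern as "^(MLH1|MSH2)$" to get the same answer as %in%, but the matching version states the intent more directly.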
Note also that if one is willing to live with a logical response (better, again imho), then the ifelse() can of course be dispensed with:

> CRC$MMR.gene<-CRC$gene.all %in% match_strings
> CRC$MMR.gene
[1]  TRUE FALSE  TRUE FALSE

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Thu, May 27, 2021 at 8:35 PM Jim Lemon <drjimle...@gmail.com> wrote:

> Hi Kai,
> You may find %in% easier than grep when multiple matches are needed:
>
> match_strings<-c("MLH1","MSH2")
> CRC<-data.frame(gene.all=c("MLH1","MSL1","MSH2","MCC3"))
> CRC$MMR.gene<-ifelse(CRC$gene.all %in% match_strings,"Yes","No")
>
> Composing your match strings before applying %in% may be more flexible
> if you have more than one selection to make.
>
> On Fri, May 28, 2021 at 1:57 AM Marc Schwartz via R-help
> <r-help@r-project.org> wrote:
> >
> > Hi,
> >
> > A quick clarification:
> >
> > The regular expression is a single quoted character vector, not a
> > character vector on either side of the | operator:
> >
> > "MLH1|MSH2"
> >
> > not:
> >
> > "MLH1"|"MSH2"
> >
> > The | is treated as a special character within the regular expression.
> > See ?regex.
> >
> > grep(), when value = FALSE, returns the index of the match within the
> > source vector, while when value = TRUE, returns the found character
> > entries themselves.
> >
> > Thus, you need to be sure that your ifelse() incantation is matching the
> > correct values.
> >
> > In the case of grepl(), it returns TRUE or FALSE, as Rui noted, thus:
> >
> > CRC$MMR.gene <- ifelse(grepl("MLH1|MSH2",CRC$gene.all), "Yes", "No")
> >
> > should work.
> >
> > Regards,
> >
> > Marc Schwartz
> >
> >
> > Kai Yang via R-help wrote on 5/27/21 11:23 AM:
> > > Hi Rui, thank you for your suggestion.
> > > But when I tried the solution, I got the message below:
> > >
> > > Error in "MLH1" | "MSH2" : operations are possible only for numeric, logical or complex types
> > >
> > > Does this mean grepl cannot work on a character field?
> > > Thanks, Kai
> > >
> > > On Thursday, May 27, 2021, 01:37:58 AM PDT, Rui Barradas <ruipbarra...@sapo.pt> wrote:
> > >
> > > Hello,
> > >
> > > ifelse needs a logical condition, not the value. Try grepl.
> > >
> > >
> > > CRC$MMR.gene <- ifelse(grepl("MLH1"|"MSH2",CRC$gene.all), "Yes", "No")
> > >
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > > At 05:29 on 27/05/21, Kai Yang via R-help wrote:
> > >> Hi List,
> > >> I wrote the code to create a new variable:
> > >>
> > >> CRC$MMR.gene<-ifelse(grep("MLH1"|"MSH2",CRC$gene.all,value=T),"Yes","No")
> > >>
> > >>
> > >> I need to create an MMR.gene column in the CRC data frame: if the gene.all column
> > >> contains MLH1 or MSH2, then MMR.gene=Yes; if not, MMR.gene=No.
> > >>
> > >> But the code doesn't work for me. Can anyone tell me how to fix the code?
> > >>
> > >> Thank you,
> > >>
> > >> Kai
> >
> > ______________________________________________
> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.