Duncan Murdoch wrote:
> On 08/11/2008 11:03 AM, Gabor Grothendieck wrote:
>> On Sat, Nov 8, 2008 at 9:41 AM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
>>> On 08/11/2008 7:20 AM, John Wiedenhoeft wrote:
>>>> Hi there,
>>>>
>>>> I rejoiced when I realized that you can use Perl regex from within R.
>>>> However, as the FAQ states "Some functions, particularly those involving
>>>> regular expression matching, themselves use metacharacters, which may need
>>>> to be escaped by the backslash mechanism. In those cases you may need a
>>>> quadruple backslash to represent a single literal one."
>>>>
>>>> I was wondering if that is really necessary for perl=TRUE? wouldn't it be
>>>> possible to parse a string differently in a regex context, e.g.
>>>> automatically insert \\ for each \ , such that you can use the perl syntax
>>>> directly? For example, if you want to input a newline as a character, you
>>>> would use \n anyway. At the moment one says \\n to make it clear to R that
>>>> you mean \n to make clear that you mean newline... this is pretty annoying.
>>>> How likely is it that you want to pass a real newline character to PCRE
>>>> directly?
>>> No, that's not possible. At the level where the parsing takes place R has
>>> no idea of its eventual use, so it can't tell that some strings are going to
>>> be interpreted as Perl, and others not.

Here's a quick hack to achieve the impossible:

mygrep = function(pattern, text, perl=FALSE, ...) {
   # double every backslash so that PCRE reads each one as a literal backslash
   if (perl) pattern = gsub("\\\\", "\\\\\\\\", pattern)
   grep(pattern, text, perl=perl, ...)
}

(text = "lemme \\ it")
# [1] "lemme \\ it"
nchar(text)
# [1] 10

(pattern = "\\")
# [1] "\\"
nchar(pattern)
# [1] 1

grep(pattern, text, perl=TRUE)
# can't go, impossible!

mygrep(pattern, text, perl=TRUE, value=TRUE)
# [1] "lemme \\ it"

vQ
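
For readers following along, the two layers of escaping discussed in this thread can be sketched separately. This is only an illustration using base R; the variable names (one_backslash, regex_for_backslash, hit_regex, hit_fixed, miss) are made up for the example and appear nowhere in the thread.

```r
# Layer 1: the R parser turns the source literal "\\" into ONE backslash character.
one_backslash <- "\\"
nchar(one_backslash)

# Layer 2: the regex engine wants an escaped backslash (two characters) in order
# to match one literal backslash, hence four backslashes in the source literal.
regex_for_backslash <- "\\\\"
nchar(regex_for_backslash)

text <- "lemme \\ it"

# Matching through the regex engine needs the two-character pattern.
hit_regex <- grepl(regex_for_backslash, text, perl = TRUE)

# fixed = TRUE bypasses the regex engine, so the one-character string suffices.
hit_fixed <- grepl(one_backslash, text, fixed = TRUE)

# Sanity check: a string without a backslash does not match.
miss <- grepl(regex_for_backslash, "no backslash here", perl = TRUE)
```

Note that a wrapper which doubles every backslash, like mygrep above, also turns regex escapes into literals: a pattern typed as "\\d" would then look for a backslash followed by "d" rather than a digit. If the only goal is to match literal text, fixed = TRUE is probably the simpler route.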