On Sat, Nov 8, 2008 at 2:05 PM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
> On 08/11/2008 11:03 AM, Gabor Grothendieck wrote:
>>
>> On Sat, Nov 8, 2008 at 9:41 AM, Duncan Murdoch <[EMAIL PROTECTED]>
>> wrote:
>>>
>>> On 08/11/2008 7:20 AM, John Wiedenhoeft wrote:
>>>>
>>>> Hi there,
>>>>
>>>> I rejoiced when I realized that you can use Perl regex from within R.
>>>> However, as the FAQ states "Some functions, particularly those
>>>> involving regular expression matching, themselves use metacharacters,
>>>> which may need to be escaped by the backslash mechanism. In those
>>>> cases you may need a quadruple backslash to represent a single
>>>> literal one."
>>>>
>>>> I was wondering if that is really necessary for perl=TRUE? Wouldn't
>>>> it be possible to parse a string differently in a regex context, e.g.
>>>> automatically insert \\ for each \ , such that you can use the perl
>>>> syntax directly? For example, if you want to input a newline as a
>>>> character, you would use \n anyway. At the moment one says \\n to
>>>> make it clear to R that you mean \n to make clear that you mean
>>>> newline... this is pretty annoying. How likely is it that you want to
>>>> pass a real newline character to PCRE directly?
>>>
>>> No, that's not possible. At the level where the parsing takes place R
>>> has no idea of its eventual use, so it can't tell that some strings
>>> are going to be interpreted as Perl, and others not.
>>>
>>> As Gabor mentioned, there have been various discussions of adding a
>>> new syntax for strings that are parsed literally, without processing
>>> any escapes, but no consensus on the right syntax to use.
>>>
>>> There are currently some fragile tricks that let you avoid escapes,
>>> e.g. using scan() to read a line:
>>>
>>> > re <- scan(what="", n=1)
>>> 1: [^\\]
>>> Read 1 item
>>> > re
>>> [1] "[^\\\\]"
>>>
>>> (I call this fragile because it works in scripts processed at console
>>> level, but not if you type the same thing into a function.)
>>>
>>> So I agree, it would be nice to have new syntax to allow this. Last
>>> time this came up, I argued for something like \verb in LaTeX where
>>> the delimiter could be specified differently in each use. Duncan TL
>>> suggested triple quotes, as in Python. I think now that triple quotes
>>> would be better than the particular form I suggested.
>>
>> Ruby's quoting method looks quite flexible:
>>
>> http://en.wikibooks.org/wiki/Ruby_Programming/Alternate_quotes
>
> Thanks, I didn't know about those. I would have preferred Ruby's option
> to the one I made up when we last had this discussion, but it also
> suffers from the same flaw: it won't work in Rd files. There the % sign
> is a comment marker. Saying that sometimes it's not just makes
> everything more complicated.
>
> So right now I'd have to say that Python-style quotes would be my
> choice. If you want to put '''""" into your string, you'll be stuck
> using regular quotes and escapes, but I could live with that.
>
> Duncan Murdoch
>
One could use a different character.
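For anyone following the thread, here is a minimal sketch of the two escaping levels being discussed: the R parser processes the string literal first, and the regex engine then interprets whatever characters are left. The path string and the variable names are made up for illustration; only base R functions (nchar, cat, gsub, grepl) are used.

```r
# One literal backslash: two characters in the source, one in the string.
one <- "\\"
stopifnot(nchar(one) == 1)

# A regex matching one literal backslash must itself contain two
# backslashes, so the R source needs four (the FAQ's "quadruple backslash").
re <- "\\\\"
stopifnot(nchar(re) == 2)

# Hypothetical example string; cat() shows its actual characters.
path <- "C:\\temp\\file.txt"
cat(path, "\n")   # prints C:\temp\file.txt

# Replace each backslash with a forward slash, using PCRE.
fwd <- gsub(re, "/", path, perl = TRUE)
cat(fwd, "\n")    # prints C:/temp/file.txt

# fixed = TRUE skips regex interpretation, which removes one escaping level.
fwd2 <- gsub("\\", "/", path, fixed = TRUE)
stopifnot(identical(fwd, fwd2))

# "\\n" reaches PCRE as backslash + n; "\n" reaches it as a real newline
# character. Both patterns match a newline in the subject string.
stopifnot(grepl("\\n", "a\nb", perl = TRUE),
          grepl("\n", "a\nb", perl = TRUE))
```

The last two checks bear on John's question: a real newline character passed straight to PCRE already works, so the doubled form is needed only for escapes that the R parser does not understand itself.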