On 08/11/2008 11:03 AM, Gabor Grothendieck wrote:
On Sat, Nov 8, 2008 at 9:41 AM, Duncan Murdoch <[EMAIL PROTECTED]> wrote:
On 08/11/2008 7:20 AM, John Wiedenhoeft wrote:
Hi there,

I rejoiced when I realized that you can use Perl regex from within R.
However, as the FAQ states "Some functions, particularly those involving
regular expression matching, themselves use metacharacters, which may need
to be escaped by the backslash mechanism. In those cases you may need a
quadruple backslash to represent a single literal one. "

I was wondering if that is really necessary for perl=TRUE? Wouldn't it be
possible to parse a string differently in a regex context, e.g.
automatically insert \\ for each \ , such that you can use the perl syntax
directly? For example, if you want to input a newline as a character, you
would use \n anyway. At the moment one writes \\n to make it clear to R that
you mean \n, which in turn tells PCRE that you mean newline... this is pretty
annoying.
How likely is it that you want to pass a real newline character to PCRE
directly?

No, that's not possible.  At the level where the parsing takes place R has
no idea of its eventual use, so it can't tell that some strings are going to
be interpreted as Perl, and others not.
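
To make the two levels concrete, here is a minimal sketch of what happens to a
pattern on its way from source code to PCRE. The variable names are only for
illustration; the calls are plain base R (nchar, grepl).

```r
# Level 1: the R parser. Four backslashes in source become a
# two-character string, because each \\ is one literal backslash.
pat <- "\\\\"
n_chars <- nchar(pat)

# Level 2: the regex engine. PCRE reads those two characters as one
# escaped, literal backslash.
x <- "a\\b"                       # the three characters a, \, b
has_backslash <- grepl(pat, x, perl = TRUE)

# "\\n" reaches PCRE as backslash + n, which PCRE interprets as newline.
# "\n" reaches PCRE as an actual newline character.
# Both patterns match a string containing a newline.
m_escaped <- grepl("\\n", "line1\nline2", perl = TRUE)
m_literal <- grepl("\n", "line1\nline2", perl = TRUE)
```

The parser has finished its work before grepl() is ever called, which is why
it cannot treat the argument differently depending on perl=TRUE.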

As Gabor mentioned, there have been various discussions of adding a new
syntax for strings that are parsed literally, without processing any
escapes, but no consensus on the right syntax to use.

There are currently some fragile tricks that let you avoid escapes, e.g.
using scan() to read a line:

> re <- scan(what="", n=1)
1: [^\\]
Read 1 item
> re
[1] "[^\\\\]"

(I call this fragile because it works in scripts processed at console level,
but not if you type the same thing into a function.)
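
A less fragile variant of the same idea, sketched here as an assumption rather
than a recommendation, is to keep the pattern in a text file and read it with
readLines(), which does not process escapes. The file name pattern_file is
hypothetical, and the writeLines() call only stands in for creating the file
in an editor (so it still needs R escapes here).

```r
# Stand-in for a file written by hand that contains the single line: [^\\]
pattern_file <- tempfile(fileext = ".txt")
writeLines("[^\\\\]", pattern_file)

# readLines() returns the line as-is: five characters, no escape processing.
re <- readLines(pattern_file, n = 1)
re_length <- nchar(re)

# PCRE sees [^\\], i.e. "any character except a backslash".
matches_letter    <- grepl(re, "a",  perl = TRUE)
matches_backslash <- grepl(re, "\\", perl = TRUE)   # subject is one backslash

unlink(pattern_file)
```

Unlike the scan() trick, this works the same way inside a function, because
the pattern comes from a file rather than from the lines following the call.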

So I agree, it would be nice to have new syntax to allow this.  Last time
this came up, I argued for something like \verb in LaTeX where the delimiter
could be specified differently in each use.  Duncan TL suggested triple
quotes, as in Python.  I think now that triple quotes would be better
than the particular form I suggested.

Ruby's quoting method looks quite flexible:

http://en.wikibooks.org/wiki/Ruby_Programming/Alternate_quotes

Thanks, I didn't know about those. I would have preferred Ruby's option to the one I made up when we last had this discussion, but it also suffers from the same flaw: it won't work in Rd files. There the % sign is a comment marker. Saying that sometimes it's not just makes everything more complicated.

So right now I'd have to say that Python-style quotes would be my choice. If you want to put '''""" into your string, you'll be stuck using regular quotes and escapes, but I could live with that.

Duncan Murdoch

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
