> -----Original Message-----
> From: r-help-boun...@r-project.org
> [mailto:r-help-boun...@r-project.org] On Behalf Of Waverley @
> Palo Alto
> Sent: Sunday, August 22, 2010 3:51 PM
> To: r-help
> Subject: Re: [R] how to implement string pattern extraction in R
>
> Thanks for the reply pointing me to the grep functions.
>
> I checked the help page
> http://pbil.univ-lyon1.fr/library/base/html/grep.html before I sent
> the help request.
>
> I just don't know how to extract a substring matching a pattern out of a
> string. Can someone give me example code, similar to that in Perl,
> to extract the prefix out of the string?

The S language pattern matching functions are vectorized, so let's
compare the S way to the vectorized version of your Perl code.
I think the following is idiomatic Perl:

   @x=qw(AAAA.txt BBBB.qaz CCCC.txt);
   @prefixes=map { if($_ =~ /(.*?)\.txt/) { $1 ; } else { "<not txt file>"; } } @x ;
   print( join(", ", @prefixes), "\n") ;
   ^Z # or ^D on Unix
   AAAA, <not txt file>, CCCC

The S equivalent to the @x=qw(...) would be

   > x <- c("AAAA.txt", "BBBB.qaz", "CCCC.txt")

and to get the part before the ".txt", if there is a ".txt"
at the end, you could do one of

   > ifelse(grepl("\\.txt$", x), sub(pattern="\\.txt$",replacement="",x), "<not txt file>")
   [1] "AAAA"           "<not txt file>" "CCCC"

or

   > ifelse((r <- regexpr("\\.txt$", x))>0, substring(x, 1, r - 1), "<not txt file>")
   [1] "AAAA"           "<not txt file>" "CCCC"

Perl's =~ has a return value that says whether there was a match,
and it stores the details of the match in the magic variables
$1, $2, ... (and $', $`, and $&). S language functions don't use
magic variables but can store the extra information as attributes
of the return value: regexpr() returns the starting position of
the match (or -1 if there is none) and stores its length in the
"match.length" attribute. In the second version the prefix is
everything before that starting position, hence r - 1.

(The above use core R or S+ functions. The gsubfn package
offers more possibilities.)

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

>
> Thanks much.
>
> On Sun, Aug 22, 2010 at 3:05 PM, Waverley @ Palo Alto
> <waverley.paloa...@gmail.com> wrote:
> > Hi,
> >
> > In perl, to get a substring matching a particular pattern can be
> > implemented like the following example:
> >
> > $x = "AAAA.txt";
> > if ($x=~ /(.*?)\.txt/){
> >   $prefix = $1;
> > }
> >
> > So how to do the same thing in R?
> >
> > Can someone provide me the code sample?
> >
> > Thanks much in advance.
> >
> > --
> > Waverley @ Palo Alto
> >
>
> --
> Waverley @ Palo Alto
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
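
A follow-up to the point in the reply about Perl's $1: a minimal sketch of
pulling out a parenthesized group directly, assuming a version of R that
provides regexec() and regmatches() (both were added to base R after this
thread was written). The helper name first_group is made up for this
illustration and is not a base R function.

```r
x <- c("AAAA.txt", "BBBB.qaz", "CCCC.txt")

# regexec() records, for each element, the position of the whole match
# followed by the position of each parenthesized group; regmatches()
# turns those positions into the matched text.
m <- regmatches(x, regexec("^(.*)\\.txt$", x))

# Each element of m is c(<whole match>, <group 1>), or character(0)
# when the pattern did not match.
# first_group is a hypothetical helper for this example only.
first_group <- function(parts) {
  if (length(parts) >= 2) parts[2] else "<not txt file>"
}

prefixes <- vapply(m, first_group, character(1))
print(prefixes)
```

Here the group positions travel in the return value of regexec(), in the
same spirit as the "match.length" attribute that regexpr() attaches to
its result.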