Hi Marc,

For question 1: I know that in Perl, captured groups stay saved until they are overwritten. \\1 is the capture variable in your R examples.
So the 2nd regular expression does not match, but \\1 still holds the 1980 captured by the previous expression, hence the result. Maybe if you restart R and try your 2nd expression first, \\1 will be empty or you will get a no-match result. Just speculation :)

John

On 9 Aug 2018 08:58, "Marc Girondot via R-help" <r-help@r-project.org> wrote:

> Hi everybody,
>
> I have some questions about the way that sub is working. I hope that
> someone has the answer:
>
> 1/ Why the second example does not return an empty string ? There is no
> match.
>
> subtext <- "-1980-"
> sub(".*(1980).*", "\\1", subtext) # return 1980
> sub(".*(1981).*", "\\1", subtext) # return -1980-
>
> 2/ Based on sub documentation, it replaces the first occurence of a
> pattern: why it does not return 1980 ?
>
> subtext <- " 1980 1981 "
> sub(".*(198[01]).*", "\\1", subtext) # return 1981
>
> 3/ I want extract year from text; I use:
>
> subtext <- "bla 1980 bla"
> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 1980
> subtext <- "bla 2010 bla"
> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 2010
>
> but
>
> subtext <- "bla 1010 bla"
> sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext) # return 1010
>
> I would like exclude the case 1010 and other like this.
>
> The solution would be:
>
> 18[0-9][0-9] or 19[0-9][0-9] or 200[0-9] or 201[0-9]
>
> Is there a solution to write such a pattern in grep ?
>
> Thanks a lot
>
> Marc
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
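
A quick way to test the speculation about question 1, sketched in base R. As far as the documentation for sub() goes, elements that are not substituted are returned unchanged, and \\1 refers only to the current call, so the "-1980-" result would be the input itself rather than a leftover capture. The names extract_first() and year_pattern are hypothetical, introduced here only for illustration. The helper returns "" when nothing matches, and the alternation follows the 18xx/19xx/200x/201x list from question 3 without checking the surrounding delimiters.

```r
subtext <- "-1980-"

# Question 1: with no match, sub() hands back its input unchanged.
no_match <- sub(".*(1981).*", "\\1", subtext)
print(no_match)  # "-1980-"

# Hypothetical helper: return the first match of `pattern` in the
# single string `x`, or "" when there is no match at all.
extract_first <- function(pattern, x) {
  m <- regmatches(x, regexpr(pattern, x))
  if (length(m) == 0) "" else m
}

print(extract_first("1981", subtext))  # ""

# Question 2: the leading ".*" in the original pattern is greedy, so the
# capture group ends up on the last year. regexpr() reports the leftmost
# match instead.
print(extract_first("198[01]", " 1980 1981 "))  # "1980"

# Question 3: alternation restricted to 1800-1999 and 2000-2019.
# Assumption: delimiters around the year are not checked in this sketch.
year_pattern <- "(18|19)[0-9]{2}|20[01][0-9]"
print(extract_first(year_pattern, "bla 1980 bla"))  # "1980"
print(extract_first(year_pattern, "bla 2010 bla"))  # "2010"
print(extract_first(year_pattern, "bla 1010 bla"))  # ""
```

If the surrounding delimiters matter, the same alternation could be placed inside the capture group of the original sub() pattern. That variant is not tested here.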