I can answer my own third question:
This pattern works better for extracting a year:
pattern.year <- ".*\\b(18|19|20)([0-9][0-9])\\b.*"
subtext <- "bla 1880 bla"
sub(pattern.year, "\\1\\2", subtext) # return 1880
subtext <- "bla 1980 bla"
sub(pattern.year, "\\1\\2", subtext) # return 1980
subtext <- "bla 2010 bla"
sub(pattern.year, "\\1\\2", subtext) # return 2010
subtext <- "bla 1010 bla"
sub(pattern.year, "\\1\\2", subtext) # return bla 1010 bla
subtext <- "bla 3010 bla"
sub(pattern.year, "\\1\\2", subtext) # return bla 3010 bla
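As an aside, when there is no match, sub() hands back the whole input, which is easy to mistake for a result. A minimal sketch of an alternative, assuming base R's regexpr() and regmatches() are acceptable here: it pulls out the match itself and gives NA when no year is found. The names pattern.year.only and extract_year are hypothetical, introduced only for this illustration.

```r
# Hypothetical names, for illustration only.
# Same year rule as pattern.year, but without the surrounding ".*".
pattern.year.only <- "\\b(18|19|20)[0-9][0-9]\\b"

extract_year <- function(x) {
  m <- regexpr(pattern.year.only, x, perl = TRUE)  # -1 where nothing matches
  out <- rep(NA_character_, length(x))
  hit <- as.integer(m) > 0
  out[hit] <- regmatches(x, m)                     # only the matched pieces
  out
}

years <- extract_year(c("bla 1880 bla", "bla 2010 bla", "bla 1010 bla"))
years  # "1880" "2010" NA
```

With this form a non-matching text such as "bla 1010 bla" gives NA instead of the original string.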
Marc
On 09/08/2018 at 09:57, Marc Girondot via R-help wrote:
Hi everybody,
I have some questions about how sub() works. I hope that
someone has the answer:
1/ Why does the second example not return an empty string? There is
no match.
subtext <- "-1980-"
sub(".*(1980).*", "\\1", subtext) # return 1980
sub(".*(1981).*", "\\1", subtext) # return -1980-
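For what it is worth, ?sub states that elements which are not substituted are returned unchanged, which is why the second call gives back "-1980-". A minimal sketch of one way to get an empty string instead, assuming a prior test with grepl() is acceptable; sub_or_empty is a hypothetical helper, not a base R function.

```r
# Hypothetical helper: returns "" when the pattern does not match at all.
sub_or_empty <- function(pattern, replacement, x) {
  ifelse(grepl(pattern, x), sub(pattern, replacement, x), "")
}

subtext <- "-1980-"
found   <- sub_or_empty(".*(1980).*", "\\1", subtext)  # "1980"
missing <- sub_or_empty(".*(1981).*", "\\1", subtext)  # ""
```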
2/ According to the sub documentation, it replaces the first occurrence of a
pattern. Why does it not return 1980?
subtext <- " 1980 1981 "
sub(".*(198[01]).*", "\\1", subtext) # return 1981
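A likely explanation is that the pattern is matched only once, but the leading ".*" is greedy: it takes as much text as it can, so the group ends up on the last candidate, 1981. A small sketch, assuming a lazy quantifier with perl = TRUE is acceptable:

```r
subtext <- " 1980 1981 "

# Greedy: the leading ".*" runs as far right as it can, so the group captures 1981.
greedy <- sub(".*(198[01]).*", "\\1", subtext)

# Lazy: ".*?" stops as early as it can, so the group captures 1980.
lazy <- sub(".*?(198[01]).*", "\\1", subtext, perl = TRUE)

greedy  # "1981"
lazy    # "1980"
```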
3/ I want to extract the year from text; I use:
subtext <- "bla 1980 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext)
# return 1980
subtext <- "bla 2010 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext)
# return 2010
but
subtext <- "bla 1010 bla"
sub(".*[ \\.\\(-]([12][01289][0-9][0-9])[ \\.\\)-].*", "\\1", subtext)
# return 1010
I would like to exclude the case 1010 and others like it.
The solution would be:
18[0-9][0-9] or 19[0-9][0-9] or 200[0-9] or 201[0-9]
Is there a way to write such a pattern in grep?
Thanks a lot
Marc
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.