On Fri, Aug 26, 2011 at 7:27 AM, Jeff Newmiller <jdnew...@dcn.davis.ca.us> wrote:
> ".*" is greedy... might want regex "number[^0-9]*([0-9]{4})" to avoid
> getting 1999 from "I want the number 2000, not the number 1999."

If such inputs are possible, we could also do the following. Adding a ? after the * makes the repetition non-greedy, so the match stops at the first four-digit run that follows "number". strapply (from the gsubfn package) would otherwise match and return all occurrences, so simplify = unlist flattens the result and the trailing [1] keeps only the first match:

  strapply(mytext, "number.*?([0-9]{4})", as.numeric, simplify = unlist)[1]
  # 2000

--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com
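
For anyone who wants to compare the three patterns from this thread without gsubfn installed, here is a minimal base R sketch. The helper name first_capture is made up for illustration, and perl = TRUE is an assumption chosen so that the lazy quantifier .*? is handled by PCRE; it is not meant to describe how strapply works internally.

```r
# Sample text taken from the example sentence quoted in this thread.
mytext <- "I want the number 2000, not the number 1999."

# Hypothetical helper (not part of gsubfn): return the first capture group
# of the first match, or NA if the pattern does not match at all.
first_capture <- function(pattern, text) {
  m <- regmatches(text, regexec(pattern, text, perl = TRUE))
  m[[1]][2]  # element 1 is the whole match, element 2 the first group
}

# Greedy ".*" runs as far right as it can before backtracking.
greedy  <- first_capture("number.*([0-9]{4})", mytext)       # "1999"

# Lazy ".*?" stops at the first four-digit run after "number".
lazy    <- first_capture("number.*?([0-9]{4})", mytext)      # "2000"

# Negated class: skip non-digits only, then require four digits.
negated <- first_capture("number[^0-9]*([0-9]{4})", mytext)  # "2000"

as.numeric(c(greedy, lazy, negated))
```

The lazy and negated-class patterns agree on this input but are not interchangeable. In a string such as "number 12 then 2000", [^0-9]* cannot step over the digits in "12", so the negated-class pattern finds no match, while the lazy pattern still returns 2000.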