On Wed, Feb 15, 2012 at 06:27:01AM -0800, Martin Morgan wrote:
> On 02/14/2012 11:45 PM, Petr Savicky wrote:
> >On Wed, Feb 15, 2012 at 02:17:35PM +1000, Redding, Matthew wrote:
> >>Hi All,
> >>
> >>I've been trawling through the documentation and listserv archives on
> >>this topic -- but as yet have not found a solution. I'm sure this is
> >>pretty simple with R, but I cannot work out how without resorting to
> >>ugly nested loops.
> >>
> >>As far as I can tell, grep, match, and %in% are not the correct tools.
> >>
> >>Question:
> >>given these vectors --
> >>patrn<- c(1,2,3,4)
> >>exmpl<- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
> >>
> >>how do I get the desired answer by finding the occurrence of the pattern
> >>and returning the starting indices:
> >>6, 13, 23
>
> match(exmpl, patrn) returns indexes that differ by 1 if the sequence
> patrn occurs
>
>   n = length(patrn)
>   r = rle(diff(match(exmpl, patrn)) == 1)
>
> we're looking for a run of TRUE's of length 3, and can find their ends
> (of the runs of diffs) as cumsum(r$length)
>
>   cumsum(r$length)[r$values & r$length == (n - 1)] - (n - 2)
>
> Seems like there could be edge cases that I'm missing...
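For reference, the steps quoted above can be wrapped in a small helper and run on the example from the original post. This is only a sketch: the name find_pattern_rle is made up for illustration, it assumes "patrn" has at least two elements and no duplicates, and it spells out r$lengths (the quoted r$length reaches that component through partial matching).

```r
# Hypothetical helper (the name is not from the thread): start indices of
# 'patrn' in 'exmpl', assuming 'patrn' has >= 2 elements and no duplicates.
find_pattern_rle <- function(exmpl, patrn) {
  n <- length(patrn)
  # Position of each element of exmpl within patrn (NA when absent).
  m <- match(exmpl, patrn)
  # TRUE where consecutive positions increase by exactly 1.
  r <- rle(diff(m) == 1)
  # Index (within the diff vector) at which each run ends.
  ends <- cumsum(r$lengths)
  # Keep runs of exactly n - 1 TRUEs; %in% TRUE maps NA to FALSE.
  hit <- (r$values %in% TRUE) & (r$lengths == n - 1)
  # Convert the end of each run of diffs to the start index in exmpl.
  ends[hit] - (n - 2)
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3, 3, 4, 2, 3, 1, 2, 3, 4, 8, 8, 23, 1, 2, 3, 4, 4, 34, 4, 3, 2, 1, 1, 2, 3, 4)
find_pattern_rle(exmpl, patrn)   # returns 6 13 23
```

The runs of TRUE end at positions 8, 15 and 25 of the diff vector, and subtracting n - 2 = 2 gives the starting indices in exmpl.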
Hi Martin:

This is a nice solution. In my opinion, it works whenever "patrn" does not
contain duplicates.

Petr.
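To illustrate the duplicates caveat: match() reports only the first position of a repeated value, so the matched positions can no longer step by 1 all the way through the pattern. One possible alternative, which is not from the thread, is to compare every window of length(patrn) against the pattern directly. The name find_pattern_window is made up for illustration, and the sketch assumes the inputs contain no NA values.

```r
# Hypothetical helper: compare each length-n window of 'exmpl' with 'patrn'.
# Works when 'patrn' contains duplicates; assumes no NA values in the inputs.
find_pattern_window <- function(exmpl, patrn) {
  n <- length(patrn)
  N <- length(exmpl)
  if (n == 0 || n > N) return(integer(0))
  # Every index at which a full window of length n still fits.
  starts <- seq_len(N - n + 1)
  # A start survives only if every pattern element matches at its offset.
  ok <- rep(TRUE, length(starts))
  for (j in seq_len(n)) {
    ok <- ok & (exmpl[starts + j - 1] == patrn[j])
  }
  starts[ok]
}

dup_patrn <- c(1, 2, 1)
dup_exmpl <- c(5, 1, 2, 1, 2, 1, 7)
match(dup_exmpl, dup_patrn)                 # NA 1 2 1 2 1 NA: no element maps to 3
find_pattern_window(dup_exmpl, dup_patrn)   # returns 2 4
```

The loop runs over the elements of the pattern only, and each comparison is vectorized over "exmpl", so this avoids nested loops over both vectors.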