On 15-02-2012, at 15:27, Martin Morgan wrote:

> On 02/14/2012 11:45 PM, Petr Savicky wrote:
>> On Wed, Feb 15, 2012 at 02:17:35PM +1000, Redding, Matthew wrote:
>>> Hi All,
>>> 
>>> 
>>> I've been trawling through the documentation and listserv archives on this
>>> topic -- but as yet have not found a solution.  I'm sure this is pretty
>>> simple with R, but I cannot work out how without resorting to ugly nested
>>> loops.
>>> 
>>> As far as I can tell, grep, match, and %in% are not the correct tools.
>>> 
>>> Question:
>>> given these vectors --
>>> patrn<- c(1,2,3,4)
>>> exmpl<- c(3,3,4,2,3,1,2,3,4,8,8,23,1,2,3,4,4,34,4,3,2,1,1,2,3,4)
>>> 
>>> how do I get the desired answer by finding the occurrences of the pattern
>>> and returning the starting indices:
>>> 6, 13, 23
> 
> match(exmpl, patrn) returns indexes that increase by 1 at each step wherever
> the sequence patrn occurs:
> 
>  n = length(patrn)
>  r = rle(diff(match(exmpl, patrn)) == 1)
> 
> we're looking for runs of TRUEs of length n - 1 (here 3), and can find their
> ends (in the vector of diffs) as cumsum(r$lengths)
>
>  cumsum(r$lengths)[r$values & r$lengths == (n - 1)] - (n - 2)
> 
> Seems like there could be edge cases that I'm missing...
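
To make the quoted steps easier to try, here is a minimal sketch that wraps them in a function. The name find_runs() is hypothetical and not part of the original post, and the sketch spells out the rle() component as `lengths`. The second call shows one edge case of the kind mentioned above: when the pattern contains repeated values, match() always reports the first matching position, so the differences no longer step up by 1.

```r
# find_runs() is a hypothetical name; the body follows the quoted steps.
find_runs <- function(x, pattern) {
  n <- length(pattern)
  # TRUE where consecutive match() positions step up by exactly 1
  r <- rle(diff(match(x, pattern)) == 1)
  # ends of the TRUE runs of length n - 1, shifted back to the start index
  cumsum(r$lengths)[r$values & r$lengths == (n - 1)] - (n - 2)
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3, 3, 4, 2, 3, 1, 2, 3, 4, 8, 8, 23, 1, 2, 3, 4, 4, 34,
           4, 3, 2, 1, 1, 2, 3, 4)
find_runs(exmpl, patrn)            # 6 13 23

# Repeated values in the pattern: the occurrence at index 1 is missed
find_runs(c(1, 1, 2), c(1, 1, 2))  # numeric(0)
```

So the approach appears to rely on the pattern values being distinct.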

Clever.
However it is quite slow.
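
One alternative, as a minimal sketch and not a benchmarked solution: compare the vector against the pattern one pattern element at a time, over all candidate start positions at once. The function name find_pattern() is hypothetical, and the sketch assumes the pattern itself contains no NA values. No timing claim is made for it here.

```r
# find_pattern() is a hypothetical helper: start indices where `pattern`
# occurs as a contiguous run in `x`.
find_pattern <- function(x, pattern) {
  n <- length(pattern)
  if (n == 0L || n > length(x)) return(integer(0))
  starts <- seq_len(length(x) - n + 1L)   # every possible start position
  ok <- rep(TRUE, length(starts))
  for (k in seq_len(n)) {
    # keep only the starts whose k-th element equals the k-th pattern value
    ok <- ok & (x[starts + k - 1L] == pattern[k])
  }
  starts[which(ok)]                       # which() drops any NA comparisons
}

patrn <- c(1, 2, 3, 4)
exmpl <- c(3, 3, 4, 2, 3, 1, 2, 3, 4, 8, 8, 23, 1, 2, 3, 4, 4, 34,
           4, 3, 2, 1, 1, 2, 3, 4)
find_pattern(exmpl, patrn)                         # 6 13 23
find_pattern(c(1, 1, 2, 1, 1, 2), c(1, 1, 2))      # 1 4
```

The loop runs over the pattern only, not over the data, so it avoids nested loops and does not depend on the pattern values being distinct.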

Berend

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
