Barry Rowlingson wrote on 09/27/2011 12:06:21 PM:
> 
> On Tue, Sep 27, 2011 at 5:51 PM, Marcelo Araya <marcelo...@gmail.com> wrote:
> > Hi all
> >
> > I am analyzing bird song element sequences. I would like to know how I can
> > count how many times a given subsequence occurs in a single string sequence.
> >
> > For example, if I have this single sequence:
> >
> > ABCABAABABABCAB
> >
> > and I am looking for the subsequence "ABC", what I need to get is that the
> > subsequence is found twice.
> >
> > Any idea how I can do this?
> >
> 
>  gregexpr will return the position and length of multiple matches. And
> you can feed it a vector. So:
> 
> 
>  > songs=c("ABCABAABABABCAB","ABACAB","ABABCABCBC")
>  > gregexpr("ABC",songs)
> [[1]]
> [1]  1 11
> attr(,"match.length")
> [1] 3 3
> 
> [[2]]
> [1] -1
> attr(,"match.length")
> [1] -1
> 
> [[3]]
> [1] 3 6
> attr(,"match.length")
> [1] 3 3
> 
>  - in the first item, it was found at positions 1 and 11
>  - in the second, it wasn't found at all (gregexpr returns -1)
>  - in the third, it was found at positions 3 and 6
> 
>  so just do some apply-ing to the returned list and count the positive
> positions in each element (a no-match element is a single -1, so its
> length is 1, not 0). Job done!
> 
> Barry
> 
> PS bonus points for spotting the hidden prog-rock song title.


For example,

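songs <- c("ABCABAABABABCAB", "ABACAB", "ABABCABCBC")
counts <- gregexpr("ABC", songs)
# gregexpr() returns -1 for "no match", so count positive positions
# rather than taking length(), which would report 1 for "ABACAB"
sapply(counts, function(x) sum(x > 0))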
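
If this is needed more than once, it could be wrapped up roughly as below. This is only a sketch: the names count_subseq and count_subseq_overlap are made up for illustration. The first helper counts non-overlapping literal matches. The second assumes the pattern contains only plain letters (it is pasted into a regular expression) and uses a perl-style lookahead so that overlapping occurrences are counted too, which may matter for subsequences like "ABA".

```r
# Hypothetical helper: count non-overlapping occurrences of `pattern`
# in each element of the character vector `x`.
count_subseq <- function(pattern, x) {
  # fixed = TRUE treats `pattern` as a literal string, not a regex
  hits <- gregexpr(pattern, x, fixed = TRUE)
  # gregexpr() marks "no match" with a single -1, so count only
  # the positive start positions
  vapply(hits, function(h) sum(h > 0), integer(1))
}

# Hypothetical variant: count overlapping occurrences as well.
# Assumes `pattern` has no regex metacharacters.
count_subseq_overlap <- function(pattern, x) {
  # A zero-width lookahead matches at every position where `pattern` starts
  hits <- gregexpr(paste0("(?=", pattern, ")"), x, perl = TRUE)
  vapply(hits, function(h) sum(h > 0), integer(1))
}

songs <- c("ABCABAABABABCAB", "ABACAB", "ABABCABCBC")
count_subseq("ABC", songs)   # 2 0 2
count_subseq_overlap("ABA", "ABABA")
```

For "ABC" the two helpers agree, since that subsequence cannot overlap with itself.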

Jean

P.S.  1981 Genesis album!

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
