Barry Rowlingson wrote on 09/27/2011 12:06:21 PM:

> On Tue, Sep 27, 2011 at 5:51 PM, Marcelo Araya <marcelo...@gmail.com> wrote:
> > Hi all
> >
> > I am analyzing bird song element sequences. I would like to know how I can
> > count how many times a given subsequence is found in a single string sequence.
> >
> > For example:
> >
> > If I have this single sequence:
> >
> > ABCABAABABABCAB
> >
> > I am looking for the subsequence "ABC". What I need to get here is that the
> > subsequence is found twice.
> >
> > Any idea how I can do this?
>
> gregexpr will return the position and length of multiple matches. And
> you can feed it a vector. So:
>
> > songs=c("ABCABAABABABCAB","ABACAB","ABABCABCBC")
> > gregexpr("ABC",songs)
> [[1]]
> [1]  1 11
> attr(,"match.length")
> [1] 3 3
>
> [[2]]
> [1] -1
> attr(,"match.length")
> [1] -1
>
> [[3]]
> [1] 3 6
> attr(,"match.length")
> [1] 3 3
>
> - in the first item, it was found at posn 1 and 11
> - in the second it wasn't found at all
> - in the third, it was found at posn 3 and 6
>
> so just do some apply-ing to the returned list and count the matches in
> each element. Job done!
>
> Barry
>
> PS bonus points for spotting the hidden prog-rock song title.

For example,

songs <- c("ABCABAABABABCAB", "ABACAB", "ABABCABCBC")
counts <- gregexpr("ABC", songs)
# gregexpr returns -1 (a vector of length one) when nothing matches,
# so count the positive positions instead of taking the length
sapply(counts, function(x) sum(x > 0))

Jean

P.S. 1981 Genesis album!
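
One thing to watch when counting from gregexpr output: an element with no match comes back as the single value -1, so its length is 1, not 0. A minimal sketch of one way around that, assuming the subsequence should be matched as a literal string (hence fixed = TRUE) and that matches do not overlap. The helper name count_matches is made up for illustration and is not from the thread:

```r
# Hypothetical helper (not from the thread): count non-overlapping,
# literal occurrences of `pattern` in each element of `x`.
count_matches <- function(pattern, x) {
  # fixed = TRUE treats the pattern as a plain string, not a regex
  hits <- gregexpr(pattern, x, fixed = TRUE)
  # regmatches() gives character(0) where gregexpr() reported -1,
  # so lengths() yields 0 for elements with no match
  lengths(regmatches(x, hits))
}

songs <- c("ABCABAABABABCAB", "ABACAB", "ABABCABCBC")
count_matches("ABC", songs)
```

With the three songs above this gives 2, 0 and 2. Note that lengths() needs R 3.2.0 or later; on older versions sapply(regmatches(x, hits), length) should do the same job.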