Hi,
Maybe the Biostrings package from Bioconductor will help you.

source("http://bioconductor.org/biocLite.R")
biocLite("Biostrings")
# On current Bioconductor releases, use BiocManager::install("Biostrings") instead.
library(Biostrings)
?matchPattern
?letterFrequency

vec1 <- "ababbbassdaa"
# Every letter in vec1 happens to be a valid IUPAC code, so DNAString() accepts it.
alphabetFrequency(DNAString(vec1))
#A C G T M R W S Y K V H D B N - +
#5 0 0 0 0 0 0 2 0 0 0 0 1 4 0 0 0

letterFrequency(DNAStringSet(vec1), letters="AC", OR=0)
#     A C
#[1,] 5 0

vec2 <- "addffggssbbsbbs"
# Longest run of consecutive b's in each string
longestConsecutive(c(vec1, vec2), "b")
#[1] 3 2

matchPattern(DNAString("AB"), DNAString(vec1))
#  Views on a 12-letter DNAString subject
#subject: ABABBBASSDAA
#views:
#    start end width
#[1]     1   2     2 [AB]
#[2]     3   4     2 [AB]

Also, with seqinr:

library(seqinr)
# Count non-overlapping words of length i, for i = 1, 2, ...
lapply(seq_along(s2c(vec2)), function(i) table(splitseq(s2c(vec2), word=i)))
#[[1]]
#
#a b d f g s
#1 4 2 2 2 4
#
#[[2]]
#
#ad bb bs df fg gs sb
# 1  1  1  1  1  1  1

---------------------------------------
A.K.

----- Original Message -----
From: ben1983 <ben_thomp...@talk21.com>
To: r-help@r-project.org
Cc:
Sent: Friday, April 19, 2013 7:21 AM
Subject: [R] Sequence analysis

Hiya,

I am trying to look at the similarities between a number of sequences. For
example, I am trying to see how similar "ababbbassdaa" is to
"addffggssbbsbbs".

I was wondering whether there is some way for me to see how similar they are
in terms of, for example, the number of a's, the number of b's, how often a
and ab are consecutive, how often abab occurs together, etc.

Any advice would be really useful. Any kind of shove in the right direction
would be amazing! I've tried doing basic alignments, but I think this is
losing quite a lot of information.

Many thanks,
Ben

--
View this message in context: http://r.789695.n4.nabble.com/Sequence-analysis-tp4664693.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.