It is pretty enough for me. Thanks

----- Original Message ----
From: Martin Morgan <[EMAIL PROTECTED]>
To: joseph <[EMAIL PROTECTED]>
Cc: r-help@r-project.org
Sent: Friday, February 22, 2008 6:41:41 PM
Subject: Re: [R] counting sequence mismatches

One kind of ugly solution

> d.f=data.frame(seq1, seq2, stringsAsFactors=FALSE)
> d.f[["nMismatch"]] <- with(d.f, {
+     m <- mapply("!=", strsplit(seq1, ""), strsplit(seq2, ""))
+     colSums(m)
+ })

Check out the Bioconductor Biostrings package, especially the version available with the development version of R, for DNA string algorithms.

Martin

joseph wrote:
> Hello
> I have 2 columns of short sequences that I would like to compare and count the number of mismatches and record the number of mismatches in a new column. The sequences are part of a data frame that looks like this:
> seq1=c("CGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC","CGGTGGTCAGTCTGGGACCTGGGCAGCAGGCT", "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")
> seq2=c("AGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC","CAGTGGTCAGTCTGGGACCTGGGCATCAGGCT", "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")
> d.f=data.frame(seq1, seq2)
> thank you for your help
> Joseph

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
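
A note on the mapply()/colSums() solution quoted in this thread: it works because every sequence in the example has the same length, so mapply() simplifies its result to a logical matrix with one column per pair. A minimal sketch of a variant that sums inside the mapply() call, and so does not depend on that simplification, might look like the following. The helper name count_mismatches and the equal-length check are assumptions made for illustration; they are not part of the original posts.

```r
# Example data from the original question
seq1 <- c("CGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC",
          "CGGTGGTCAGTCTGGGACCTGGGCAGCAGGCT",
          "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")
seq2 <- c("AGGTGTAGAGGAAAAAAAGGAAACAGGAGTTC",
          "CAGTGGTCAGTCTGGGACCTGGGCATCAGGCT",
          "CGGGCCTCTCGGCCTGCAGCCCCCAACAGCCA")

# Hypothetical helper (not from the thread): counts position-wise
# mismatches between corresponding elements of two character vectors.
count_mismatches <- function(a, b) {
  # Assumption: each pair is compared position by position, so the
  # two sequences in a pair must have the same number of characters.
  stopifnot(length(a) == length(b), all(nchar(a) == nchar(b)))
  # Split each string into single characters, compare, and sum the
  # TRUE values for each pair; the result is one integer per pair.
  mapply(function(x, y) sum(x != y),
         strsplit(a, ""), strsplit(b, ""),
         USE.NAMES = FALSE)
}

d.f <- data.frame(seq1, seq2, stringsAsFactors = FALSE)
d.f[["nMismatch"]] <- count_mismatches(d.f$seq1, d.f$seq2)
print(d.f[["nMismatch"]])
```

For the three example pairs this gives 1, 2 and 0 mismatches. Sequences of unequal length are rejected by the check rather than silently recycled; aligning such sequences is a different problem, for which the Biostrings package mentioned above is the better starting point.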