Dear Peter,

thank you for the suggestion. Unfortunately the star did not help. Did it work for you? For me it seems incomplete somehow.

Laetitia
________________________________________
From: Peter Ehlers [ehl...@ucalgary.ca]
Sent: Tuesday, January 12, 2010 09:54 AM
To: Laetitia Schmid
Cc: Steve Lianoglou; r-help@r-project.org
Subject: Re: [R] apply a function down each column

See inline below.

Laetitia Schmid wrote:
> Dear Steve,
> my solution looks like it would work, but it does not.
> I attached a text file with an extract of my data. Maybe you can try it
> yourself. I want to compare C1 with M1, C2 with M2, C3 with M3,,, for
> each column.
> I do not really know what the problem is. R complains about a syntax error.
> The function I am applying counts the common strings between the two.
> Greg Hirson helped me to write it.
>
> lettermatch <- function(a, b) {
>   tb <- merge(as.data.frame(table(strsplit(a, ""))),
>               as.data.frame(table(strsplit(b, ""))), by="Var1")
>   sum(apply(tb[-1], 1, min))
> }
>
> For example for the second column I tried:
>
> for (x in 1:(nrow(dat)-1)) {
>   a <- as.character(dat[(2x-1),1])

Shouldn't that be 2*x-1??

-Peter Ehlers

>   b <- as.character(dat[(2x),1])
>   lettermatch(a,b)
> }
>
> or
>
> a <- as.character(dat[seq(1, nrow(dat), by=2),2])
> b <- as.character(dat[seq(2, nrow(dat), by=2), 2])
> all.results <- lettermatch(a,b)
>
> With "dat<-read.delim("data_lgs.txt",stringsAsFactors=FALSE)" I can
> leave the "as.character" away in the formula above.
>
> Laetitia
>
> Individuals Seq1 Seq2 Seq3 Seq4
> C1          GGGG AATT CCGG CTTT
> M1          GGGG AAAA GGGG GGGG
> C2          GGGG AATT CCGG CTTT
> M2          AGGG AACT CCGG CGTT
> C3          AGGG AACT CCGG CGTT
> M3          AGGG AACT CCGG CGTT
> C4          GGGG AATT CCGG CCTT
> M4          GGGG AAAT CGGG CTTT
> C5          AGGG ACTT CCCG CTTT
> M5          AGGG CTTT CCCC CCTT
> C6          AGGG CTTT CCCC CCTT
> M6          AAAG CCTT CCCC CTTT
> C7          AAAG ACCC CCCG GTTT
> M7          AAGG AACC CCGG TTTT
> C8          GGGG AATT CCGG CCTT
> M8          GGGG AATT CCGG CCTT
> C9          GGGG AAAA GGGG TTTT
> M9          GGGG AAAA GGGG TTTT
> C11         AGGG AAAC CGGG GGTT
> M11         GGGG AATT CCGG CCTT
>
>
>
> On 11.01.2010 at 15:18, Steve Lianoglou wrote:
>
>> Hi,
>>
>> On Mon, Jan 11, 2010 at 8:41 AM, Laetitia Schmid <laeti...@gmt.su.se>
>> wrote:
>>> Hello World,
>>> I have a function that makes pairwise comparisons between two
>>> strings. I would like to apply this function to my data (which
>>> consists of columns with different strings) in the way that it
>>> compares the first with the second entry, and then the third with the
>>> fourth, and then the fifth with the sixth, and so on down each column...
>>> So (2x-1) and (2x) would be the different entries to be compared!
>>>
>>> dat= my data:
>>>
>>> for the first column: compare dat[(2x-1),1] with dat[(2x),1] and x
>>> would be 1:i, i=length(dat[,1])
>>>
>>> I think the best way to do that is a loop:
>>>
>>> a <- as.character(dat[(2x-1),1])
>>> b <- as.character(dat[(2x),1])
>>>
>>> for (i in 1:length(dat[,1]) my_function(a, b))
>>>
>>> Can somebody help me to apply a function with a loop in the way I
>>> want to a column?
>>
>> It seems as if you got it already, don't you?
>>
>> for (x in 1:(nrow(dat)-1)) {
>>   a <- dat[(2x-1),1]
>>   b <- dat[(2x), 1]
>>   my_function(a,b)
>> }
>>
>>> Is there a specification of "tapply" for that?
>>
>> I don't think so, but depending on what you want to do, the size of
>> your data, and the amount of RAM you have, it might be faster to
>> compare everything "at once" (assuming `my_function` can be
>> vectorized), for instance:
>>
>> a <- dat[seq(1, nrow(dat), by=2),1]
>> b <- dat[seq(2, nrow(dat), by=2), 1]
>> all.results <- my_function(a,b)
>>
>> Also, as an aside, I see you keep calling "as.character" on your data
>> when you extract it from your data.frame. Is your data being converted
>> to factors? You can look to set stringsAsFactors=FALSE if this is the
>> case and you are reading in data using read.table/delim/etc (see:
>> ?read.table)
>>
>> Hope that helps,
>>
>> -steve
>>
>> --
>> Steve Lianoglou
>> Graduate Student: Computational Systems Biology
>> | Memorial Sloan-Kettering Cancer Center
>> | Weill Medical College of Cornell University
>> Contact Info: http://cbio.mskcc.org/~lianos/contact
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>

--
Peter Ehlers
University of Calgary
403.202.3921
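
Pulling the suggestions in this thread together, here is one possible sketch (an illustration, not code posted by anyone above). Besides writing the multiplication explicitly as 2*x-1, the loop probably has to run over nrow(dat)/2 pairs rather than nrow(dat)-1, and the value of lettermatch() has to be stored, because a bare call inside a for loop prints nothing. Since lettermatch() as posted handles one pair of strings at a time, the seq()-based version is wrapped in mapply() here instead of being called on whole vectors. The small data frame is just the first four rows of the posted extract typed in by hand, with the sequences assumed to sit in columns 2 onwards; the names n_pairs, res_seq2, odd, even and all_results are made up for this example.

```r
# lettermatch() as posted in the thread
lettermatch <- function(a, b) {
  tb <- merge(as.data.frame(table(strsplit(a, ""))),
              as.data.frame(table(strsplit(b, ""))), by = "Var1")
  sum(apply(tb[-1], 1, min))
}

# First four rows of the posted extract (three sequence columns), typed by hand
dat <- data.frame(
  Individuals = c("C1", "M1", "C2", "M2"),
  Seq1 = c("GGGG", "GGGG", "GGGG", "AGGG"),
  Seq2 = c("AATT", "AAAA", "AATT", "AACT"),
  Seq3 = c("CCGG", "GGGG", "CCGG", "CCGG"),
  stringsAsFactors = FALSE
)

# Assumption: rows alternate C, M, C, M, ... so there are nrow(dat)/2 pairs
n_pairs <- nrow(dat) / 2

# Loop version for one column: explicit `*`, one iteration per pair,
# and each result is stored instead of being thrown away
res_seq2 <- numeric(n_pairs)
for (x in seq_len(n_pairs)) {
  a <- dat[2 * x - 1, "Seq2"]
  b <- dat[2 * x,     "Seq2"]
  res_seq2[x] <- lettermatch(a, b)
}

# All sequence columns at once: mapply() calls lettermatch() once per
# (C, M) pair, and sapply() repeats that for every column except the first
odd  <- seq(1, nrow(dat), by = 2)
even <- seq(2, nrow(dat), by = 2)
all_results <- sapply(dat[-1], function(col) {
  mapply(lettermatch, col[odd], col[even], USE.NAMES = FALSE)
})
rownames(all_results) <- paste(dat$Individuals[odd], dat$Individuals[even],
                               sep = "/")
print(all_results)
```

With the full file read via read.delim(..., stringsAsFactors=FALSE), the same sapply()/mapply() call should work unchanged as long as the number of rows is even and the first column holds the individual labels.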