Most of the time in your loop is spent indexing the data frame. Pull the columns you need out into plain vectors first and then process those:

> x <- read.table(textConnection("  Row.ID AgilentProbe GeneSymbol GeneID Exons AgilentStart first.geneid first.exon last.geneid last.exon
+ 8    1348 A_23_P116898      A2M  2 34   9112685  TRUE  TRUE  TRUE  TRUE
+ 62  19410  A_23_P95594     NAT1  9  4  18124656  TRUE  TRUE  TRUE  TRUE
+ 39  10323  A_23_P31798     NAT2 10  2  18302422  TRUE  TRUE  TRUE  TRUE
+ 21   5353 A_23_P162918 SERPINA3 12  5  94150936  TRUE  TRUE FALSE FALSE
+ 22   9999 A_23_P162913 SERPINA3 12  5  94150800 FALSE FALSE FALSE FALSE
+ 98  29990 A_32_P151937 SERPINA3 12  5  94150720 FALSE FALSE FALSE  TRUE
+ 33   9516   A_23_P2920 SERPINA3 12  7  94158435 FALSE  TRUE FALSE  TRUE
+ 96  29595 A_32_P124727 SERPINA3 12  8  94160018 FALSE  TRUE  TRUE  TRUE
+ 57  18176  A_23_P80570    AADAC 13  5 153028473  TRUE  TRUE  TRUE  TRUE
+ 46  16139  A_23_P56529     AAMP 14  9 218838396  TRUE  TRUE  TRUE  TRUE
+ 18   4438 A_23_P152527    AANAT 15  7  71976911  TRUE  TRUE  TRUE  TRUE
+ 69  21321 A_24_P172990     AARS 16 18  68845436  TRUE  TRUE  TRUE  TRUE
+ 82  24747 A_24_P330684     ABAT 18 17   8780872  TRUE  TRUE FALSE  TRUE"), header=TRUE, as.is=TRUE)
> closeAllConnections()
> # create character string: copy the columns out of the data frame once
> p1 <- x$AgilentProbe
> test <- x$first.exon
> # a row that is not a first exon is appended to the previous (accumulated) entry
> for (i in seq_along(p1)){
+     if (!test[i]) p1[i] <- paste(p1[i-1], p1[i], sep=',')
+ }
> cbind(p1)
      p1
 [1,] "A_23_P116898"
 [2,] "A_23_P95594"
 [3,] "A_23_P31798"
 [4,] "A_23_P162918"
 [5,] "A_23_P162918,A_23_P162913"
 [6,] "A_23_P162918,A_23_P162913,A_32_P151937"
 [7,] "A_23_P2920"
 [8,] "A_32_P124727"
 [9,] "A_23_P80570"
[10,] "A_23_P56529"
[11,] "A_23_P152527"
[12,] "A_24_P172990"
[13,] "A_24_P330684"
>

On Tue, May 11, 2010 at 12:33 PM, jim holtman <jholt...@gmail.com> wrote:

> Instead of looping on each row, try the following
>
> p1 <- as.character(aga2$AP)
> # skew by one on the paste
> p1 <- ifelse(aga2$first.exon, p1, paste(c("", head(p1, -1)), aga2$AP,
> sep=','))
>
> ags <- as.character(aga2$AS)
> ags <- ifelse(aga2$first.exon, ags, paste(c("", head(ags, -1)), aga2$AS,
> sep=','))
>
> On Tue, May 11, 2010 at 12:17 PM, Mark Lamias <mlam...@yahoo.com> wrote:
>
>> R-users,
>>
>> I have the following piece of code which I am trying to run on a dataframe
>> (aga2) with about a half million records. While the code works, it is
>> extremely slow. I've read some of the help archives indicating that I
>> should allocate space to the p1 and ags vectors, which I have done, but
>> this doesn't seem to improve speed much. Would anyone be able to provide me
>> with advice on how I might be able to speed this up?
>>
>>
>> p1 <- character(dim(aga2)[1])
>> ags <- character(dim(aga2)[1])
>> for (i in 1:dim(aga2)[1])
>> {
>>   if (aga2$first.exon[i]==TRUE)
>>   {
>>     p1[i]<-as.character(aga2[i, "AP"])
>>     ags[i]<-as.character(aga2[i, "AS"])
>>
>>   }
>>   else
>>   {
>>     p1[i]<-paste(p1[i-1], aga2[i, "AP"], sep=",")
>>     ags[i]<-paste(ags[i-1], aga2[i, "AS"], sep=",")
>>   }
>> }
>>
>> Thanks.
>>
>> --Mark Lamias
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
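
For what it is worth, the same accumulation can probably be written without an explicit for loop over rows. The following is only a sketch under assumptions: the toy data frame is made up for illustration (only the column names AP and first.exon come from the original question), cum_paste is a hypothetical helper, and I have not timed this against the loop on half a million rows, so it may or may not be faster. The idea is that every TRUE in first.exon starts a new run, cumsum() labels the runs, and Reduce(..., accumulate = TRUE) builds the growing string inside each run.

```r
# Toy data, invented for illustration; AP and first.exon mirror the question.
aga2 <- data.frame(
  AP         = c("a", "b", "c", "d", "e"),
  first.exon = c(TRUE, TRUE, FALSE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

# Copy the columns out of the data frame once, as plain vectors.
ap    <- aga2$AP
first <- aga2$first.exon

# Each TRUE starts a new run, so the cumulative sum gives a run label per row.
grp <- cumsum(first)

# Hypothetical helper: cumulative comma-separated paste within one run.
cum_paste <- function(v) {
  Reduce(function(a, b) paste(a, b, sep = ","), v, accumulate = TRUE)
}

# Apply the helper run by run and put the pieces back in the original row order.
p1 <- unsplit(lapply(split(ap, grp), cum_paste), grp)

# p1 is now: "a" "b" "b,c" "b,c,d" "e"
```

The same call with the AS column in place of AP would give ags. Note that the strings still grow with the length of each run, so very long runs will be expensive whichever way they are built.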