Dear Jim and all, Allow me to ask your expert opinion.

Using the data (16Mb) downloadable from here:
http://drop.io/gundalav/asset/test-data-zip

It took this long on a 1994.070 MHz CPU under Linux, using "write.table":

> proc.time() - ptm1
     user    system   elapsed
16581.833  5787.228 21386.064

__MYCODE__
args <- commandArgs(trailingOnly=FALSE)
fname <- args[3]
dat <- read.delim(fname, header=FALSE);

output <- file('output_writetable.txt', 'w')

ptm1 <- proc.time()
for (i in 1:nrow(dat)) {
    #cat(dat$V1[i]," ", as.character(dat$V2[i]),"\n", sep="")
    # one write.table() call per row
    write.table(cbind(dat$V1[i], as.character(dat$V2[i])), file=output,
                sep="\t", quote=FALSE, col.names=FALSE, row.names=FALSE)
}
close(output)
proc.time() - ptm1
__END__

Perhaps I misunderstood you, but this seems truly slow.
Is there a way I can speed it up?

On Jan 8, 11:11 pm, "jim holtman" <jholt...@gmail.com> wrote:
> Here is one way of doing it. To write out 1 million rows on my system
> took 21 seconds.
>
> > # create some data
> > dataSize <- 1e6
> > foo <- runif(dataSize)
> > bar <- runif(dataSize)
> > n <- 1000 # number of items to write out each time
> > output <- file('/output.txt', 'w')
> > # now split the indices into groups of 'n'
> > index <- split(seq(length(foo)), cut(seq(length(foo)), length(foo) / n, labels=FALSE))
> > my.stats(reset=TRUE)
> stats (1) - Rgui : <0.0 0.0> 73738.9 : 185.1MB
> > for (i in index){
> +     write.table(cbind(foo[i], bar[i]), file=output, sep='\t',
> +         col.names=FALSE, row.names=FALSE)
> + }
> > close(output)
> > my.stats('done')
> done (1) - Rgui : <20.7 20.7> 73759.6 : 124.6MB
>
> On Thu, Jan 8, 2009 at 8:26 AM, GundalaViswanath <gunda...@gmail.com> wrote:
> > Dear Jim and Henrik,
> >
> >> What exactly is the problem you are trying to solve.
> >> Is it going to be read by some other program?
> >
> > I simply want to print the data out. Surely, this data
> > will be manipulated (with Excel or other
> > programming languages) by other people to suit their purpose.
> >
> > Typically the print out from the loop looks like this:
> >
> > ATCGATCGATCGGGGGGGGGGGGGGGTTTGCGGG 10 11.992
> > CCCCCCCCGGGCCATCGGTCAGGGAATTGACGGAA 2 0.222
> > .....
> >
> > up to ~16 million lines.
> >
> >> How much physical memory do you have on your machine?
> >
> > 6GB
> >
> >> Is there paging occurring due to the size of the objects?
> >
> > I don't quite understand what you mean by that.
> > So sorry for my lack of knowledge in R.
> >
> >> Have you considered creating a structure with 10,000 of the variables
> >> each time through the loop and then writing them out?
> >
> > Never thought about that. Can you be specific about how this can be achieved?
> >
> > -GundalaViswanath
> > Jakarta - Indonesia
> >
> > On Thu, Jan 8, 2009 at 10:10 PM, jim holtman <jholt...@gmail.com> wrote:
> >> What exactly is the problem you are trying to solve. What is going to
> >> be done with the data? Is it going to be read by some other program?
> >> How much physical memory do you have on your machine? Is there paging
> >> occurring due to the size of the objects? Have you considered creating a
> >> structure with 10,000 of the variables each time through the loop and
> >> then writing them out? A lot will depend on how much free memory you
> >> have. I will also ask one of my favorite questions; "tell me what you
> >> want to do, not how you want to do it".
> >>
> >> On Thu, Jan 8, 2009 at 6:12 AM, GundalaViswanath <gunda...@gmail.com> wrote:
> >>> Dear all,
> >>>
> >>> I found that printing with 'cat' is very slow.
> >>>
> >>> For example in my machine this snippet
> >>>
> >>> __BEGIN__
> >>>
> >>> # I need to resort to this type of loop,
> >>> # because using write(), I need to create a matrix which
> >>> # consumes so much memory. Note that "foo, bar, qux" object
> >>> # is already very large (>2Gb)
> >>>
> >>> for ( s in 1:length(x) ) {
> >>>     cat(as.character(foo[s]),"\t",bar[s],"\t", qux[s],"\n")
> >>> }
> >>> __END__
> >>>
> >>> for "x" of size ~1.5million, takes more than 10 hours to print.
> >>> On my Linux 1994 MHz AMD processor.
> >>>
> >>> Are there any faster alternatives to "cat"?
> >>>
> >>> -GundalaViswanath
> >>> Jakarta - Indonesia
> >>>
> >>> ______________________________________________
> >>> r-h...@r-project.org mailing list
> >>> https://stat.ethz.ch/mailman/listinfo/r-help
> >>> PLEASE do read the posting guide
> >>> http://www.R-project.org/posting-guide.html
> >>> and provide commented, minimal, self-contained, reproducible code.
> >>
> >> --
> >> Jim Holtman
> >> Cincinnati, OH
> >> +1 513 646 9390
> >>
> >> What is the problem that you are trying to solve?
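
For comparison with the loop in __MYCODE__, which calls write.table() once per row, here is a minimal sketch of the two alternatives the thread points toward: one write.table() call on the whole data frame, and the chunked variant Jim describes. The tiny data frame, the temporary output path, the helper name write_in_chunks and the default chunk size are all assumptions made for illustration; they are not from the original posts, and no timings are claimed for the real 16Mb file.

```r
# Hypothetical stand-in for the poster's 'dat': V1 numeric, V2 sequence strings.
dat <- data.frame(V1 = c(10, 2),
                  V2 = c("ATCGATCG", "CCCCGGGC"),
                  stringsAsFactors = FALSE)

out_path <- tempfile(fileext = ".txt")  # hypothetical output location

# Alternative 1: a single vectorized call writes every row, no per-row loop.
write.table(dat[, c("V1", "V2")], file = out_path, sep = "\t",
            quote = FALSE, col.names = FALSE, row.names = FALSE)

# Alternative 2 (hypothetical helper): write blocks of rows to one open
# connection, so only 'chunk_size' rows are formatted at a time.
write_in_chunks <- function(df, path, chunk_size = 100000L) {
  con <- file(path, "w")
  on.exit(close(con))               # connection is closed when the function exits
  if (nrow(df) == 0L) return(invisible(0L))
  starts <- seq(1L, nrow(df), by = chunk_size)
  for (s in starts) {
    e <- min(s + chunk_size - 1L, nrow(df))
    write.table(df[s:e, , drop = FALSE], file = con, sep = "\t",
                quote = FALSE, col.names = FALSE, row.names = FALSE)
  }
  invisible(length(starts))         # number of chunks written
}
```

With the real data, the call would look something like write_in_chunks(dat, 'output_writetable.txt'). The chunked form is only worth the extra code if formatting the whole data frame in one call runs into memory limits; otherwise the single write.table() call is the simpler choice.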