Hi Jim,

Thanks for responding. Here is the information I should have included
before. I should be able to access 4 GB of memory.

> str(myData)
'data.frame':   53860857 obs. of  4 variables:
 $ V1: chr  "200003" "200006" "200047" "200050" ...
 $ V2: chr  "cv0001" "cv0001" "cv0001" "cv0001" ...
 $ V3: chr  "A" "A" "A" "B" ...
 $ V4: chr  "B" "B" "A" "B" ...
> sessionInfo()
R version 2.11.0 (2010-04-22)
x86_64-unknown-linux-gnu

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=C              LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

On Mon, Jul 12, 2010 at 7:54 AM, jim holtman <jholt...@gmail.com> wrote:
> What is the configuration you are running on (OS, memory, etc.)?  What
> does your object consist of?  Is it numeric, factors, etc.?  Provide a
> 'str' of it.  If it is numeric, then the size of the object is
> probably about 1.8GB.  Doing the long to wide you will probably need
> at least that much additional memory to hold the copy, if not more.
> This would be impossible on a 32-bit version of R.
>
> On Mon, Jul 12, 2010 at 1:25 AM, Juliet Hannah <juliet.han...@gmail.com>
> wrote:
>> I have a data set that has 4 columns and 53860858 rows. I was able to
>> read this into R with:
>>
>> cc <- rep("character",4)
>> myData <-
>> read.table("myData.csv",header=FALSE,skip=1,colClasses=cc,nrow=53860858,sep=",")
>>
>>
>> I need to reshape this data from long to wide. On a small data set the
>> following lines work. But on the real data set, it didn't finish even
>> when I took a sample of two (rows in new data). I didn't receive an
>> error. I just stopped it because it was taking too long. Any
>> suggestions for improvements? Thanks.
>>
>> # start example
>> # i have commented out the write.table statement below
>>
>> testData <- read.table(textConnection("rs9999853,cv0084,A,A
>> rs999986,cv0084,C,B
>> rs9999883,cv0084,E,F
>> rs9999853,cv0085,G,H
>> rs999986,cv0085,I,J
>> rs9999883,cv0085,K,L"),header=FALSE,sep=",")
>> closeAllConnections()
>>
>> mysamples <- unique(testData$V2)
>>
>> for (one_ind in mysamples) {
>>    one_sample <- testData[testData$V2==one_ind,]
>>    mywide <- reshape(one_sample, timevar = "V1", idvar = "V2",direction = "wide")
>> #    write.table(mywide,file ="newdata.txt",append=TRUE,row.names=FALSE,col.names=FALSE,quote=FALSE)
>> }
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem that you are trying to solve?
>
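
One possible alternative to the per-sample loop quoted above, as a rough
sketch only. The loop rescans the whole V2 column for every sample, so this
version instead looks up each row's sample and marker position once with
match() and fills a character matrix in a single pass. It is not from the
thread and has been tried only on the toy data. It assumes each
(sample, marker) pair occurs at most once and that the wide matrix fits in
memory. The names wide, row_idx, col_idx and col_names are made up for
illustration.

```r
# Toy data from the quoted example, read as character like the real data.
con <- textConnection("rs9999853,cv0084,A,A
rs999986,cv0084,C,B
rs9999883,cv0084,E,F
rs9999853,cv0085,G,H
rs999986,cv0085,I,J
rs9999883,cv0085,K,L")
testData <- read.table(con, header = FALSE, sep = ",",
                       colClasses = rep("character", 4))
close(con)

markers <- unique(testData$V1)   # one pair of columns per marker
samples <- unique(testData$V2)   # one row per sample

# Column names follow the "V3.<marker>", "V4.<marker>" pattern that
# reshape(direction = "wide") produces.
col_names <- paste(rep(c("V3", "V4"), times = length(markers)),
                   rep(markers, each = 2), sep = ".")

wide <- matrix(NA_character_,
               nrow = length(samples), ncol = length(col_names),
               dimnames = list(samples, col_names))

# Position of every long-format row in the wide matrix, computed once.
row_idx <- match(testData$V2, samples)
col_idx <- match(testData$V1, markers)

# Two-column matrix indexing fills all cells without a loop.
wide[cbind(row_idx, 2 * col_idx - 1)] <- testData$V3
wide[cbind(row_idx, 2 * col_idx)]     <- testData$V4
```

Here the sample IDs end up as row names rather than in a V2 column, so
writing the result with write.table(wide, ..., col.names = FALSE,
quote = FALSE) and the default row.names = TRUE would put the sample ID
first on each line. If a (sample, marker) pair were duplicated, the last
occurrence would silently overwrite the earlier one.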