In the solution below, what is the advantage of using "0L"?

M0 <- read.csv("M1.csv", nrows = 1)[0L, ]

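For context, here is a minimal sketch of what that expression evaluates to. The data frame `df` is a made-up stand-in for the one-row result of read.csv, not the real file. As far as I can tell, `0L` is simply R's integer literal for zero, while a plain `0` is a double; used as a row index, either one selects zero rows and keeps the column names and classes, so the two results should be the same.

```r
# Hypothetical stand-in for read.csv("M1.csv", nrows = 1)
df <- data.frame(a = 1.5, b = 2L, c = "x", stringsAsFactors = FALSE)

# Indexing rows with zero selects no rows but keeps the columns and their classes
empty_int <- df[0L, ]   # 0L is an integer literal
empty_dbl <- df[0, ]    # 0 is a double, coerced to integer when used as an index

typeof(0L)                # "integer"
typeof(0)                 # "double"
nrow(empty_int)           # 0
names(empty_int)          # "a" "b" "c"
sapply(empty_int, class)  # classes are preserved in the empty data frame

# Compare the two results directly
identical(empty_int, empty_dbl)
```

So the zero-row data frame serves as an empty template with the right column names and types; writing the index as `0L` rather than `0` looks like a matter of being explicit about the integer type rather than a change in behaviour.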
Thanks!

2011/11/8 Gabor Grothendieck <ggrothendi...@gmail.com>:
> 2011/11/8 Sergio René Araujo Enciso <araujo.enc...@gmail.com>:
>> Dear all:
>>
>> I have two large files with 2000 columns. For each file I am
>> performing a loop to extract the "i"th element of each file and create
>> a data frame with both "i"th elements in order to perform further
>> analysis. I am not extracting all the "i"th elements but only certain
>> ones, which I am indicating in a vector called "d".
>>
>> See an example of my code below
>>
>> ### generate an example for the CSV files, the original files contain
>> ### more than 2000 columns, here for the sake of simplicity they have only
>> ### 10 columns
>> M1<-matrix(rnorm(1000), nrow=100, ncol=10,
>> dimnames=list(seq(1:100),letters[1:10]))
>> M2<-matrix(rnorm(1000), nrow=100, ncol=10,
>> dimnames=list(seq(1:100),letters[1:10]))
>> write.table(M1, file="M1.csv", sep=",")
>> write.table(M2, file="M2.csv", sep=",")
>>
>> ### the vector containing the "i" elements to be read
>> d<-c(1,4,7,8)
>> P1<-read.table("M1.csv", header=TRUE, sep=",")
>> P2<-read.table("M2.csv", header=TRUE, sep=",")
>> for (i in d) {
>> M<-data.frame(P1[i],P2[i])
>> rm(list=setdiff(ls(),c("d","P1","P2")))
>> }
>>
>> As the files are quite large, I want to include "read.table" within
>> the loop so that it only reads the "i"th element. I know that there is
>> the option "colClasses" for which I have to create a vector with zeros
>> for all the columns I do not want to load. Nonetheless I have no idea
>> how to make this vector change in the loop, so that the only element
>> with no zeros is the "i"th element following the vector "d". Any ideas
>> how to do this? Or is there any other approach to load only a
>> specific element?
>>
>
> It's a bit messy if there are row names, so let's generate M1.csv like this:
>
> write.csv(M1, file = "M1.csv", row.names = FALSE)
>
> Then we can do this:
>
> nc <- ncol(read.csv("M1.csv", nrows = 1))
> colClasses <- replace(rep("NULL", nc), d, NA)
> M1.subset <- read.csv("M1.csv", colClasses = colClasses)
>
> or, using the same M1.csv that we just generated, try this, which uses
> sqldf with the H2 backend:
>
> library(sqldf)
> library(RH2)
>
> M0 <- read.csv("M1.csv", nrows = 1)[0L, ]
> M1.subset.h2 <- sqldf(c("insert into M0 (select * from csvread('M1.csv'))",
> "select a, d, g, h from M0"))
>
> This is referred to as Alternative 3 in FAQ#10 Example 6a on the sqldf
> home page:
> http://sqldf.googlecode.com
> Alternative 1 and Alternative 2 listed there could also be tried.
>
> (Note that although sqldf has a read.csv.sql command we did not use it
> here since that command only works with the sqlite back end and the
> RSQLite driver has a max of 999 columns.)
>
> --
> Statistics & Software Consulting
> GKX Group, GKX Associates Inc.
> tel: 1-877-GKX-GROUP
> email: ggrothendieck at gmail.com
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
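
On the original question of making the colClasses vector change inside the loop, one possible sketch follows. It reuses the colClasses idea from the quoted reply but rebuilds the vector for each "i", so that only one column per file is loaded at a time. This assumes the CSV files are written without row names, as suggested in the reply; the temporary file paths, the helper name read_one_column and the `results` list are made up for illustration and are not part of the thread.

```r
# Example data written without row names, as suggested in the quoted reply
set.seed(1)
M1 <- matrix(rnorm(1000), nrow = 100, ncol = 10, dimnames = list(NULL, letters[1:10]))
M2 <- matrix(rnorm(1000), nrow = 100, ncol = 10, dimnames = list(NULL, letters[1:10]))
f1 <- tempfile(fileext = ".csv")  # stand-ins for M1.csv and M2.csv
f2 <- tempfile(fileext = ".csv")
write.csv(M1, file = f1, row.names = FALSE)
write.csv(M2, file = f2, row.names = FALSE)

# The vector containing the "i" elements to be read
d <- c(1, 4, 7, 8)

# Hypothetical helper: read only column i of a CSV file.
# "NULL" in colClasses skips a column; NA lets read.csv guess its class.
read_one_column <- function(file, i, nc) {
  colClasses <- replace(rep("NULL", nc), i, NA)
  read.csv(file, colClasses = colClasses)
}

# Number of columns, determined from the first data row only
nc <- ncol(read.csv(f1, nrows = 1))

results <- list()
for (i in d) {
  # Data frame holding the "i"th column of each file
  M <- data.frame(read_one_column(f1, i, nc), read_one_column(f2, i, nc))
  results[[as.character(i)]] <- dim(M)  # placeholder for the further analysis
}
```

Each call still scans the whole file even though it keeps a single column, so if memory allows, reading all the columns in "d" with one read.csv call, as in the quoted reply, is probably cheaper than re-reading inside the loop.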