On Tue, Mar 29, 2011 at 06:58:59PM -0400, Dimitri Liakhovitski wrote:
> I have a tab-delimited .txt file (size 800MB) with about 3.4 million
> rows and 41 columns. About 15 columns contain strings.
> Tried to read it in in R 2.12.2 on a laptop that has Windows XP:
> mydata<-read.delim(file="FileName.TXT",sep="\t")
> R did not complain (!) and I got: dim(mydata) 1692063 41.

My guess would be that there are (unexpected) quotes and/or double
quotes in your file, so R treats rather large blocks of the file as
single, very long strings. This routinely happens in situations like
this:

ID   x     description
1    0.4   my first measurement
2    1.6   Normal 5" object
3    0.4   Some measurement
4    0.7   A 4" long sample

R thinks that the description in row 2 ends in row 4, and you lose
data: the quote after the 5 opens a string that is only closed by the
quote after the 4, so everything in between ends up in one field.
Try read.delim(..., quote="").

cu
	Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
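
A minimal sketch of the effect, assuming a small hypothetical sample file that mirrors the table above. The temporary file and the variable names (tmp, with_quotes, no_quotes) are illustrative only and not taken from the original data. It reads the same file twice, once with the default quoting and once with quote="", and compares the row counts against the number of physical lines.

```r
# Hypothetical tab-delimited sample mirroring the table above.
sample_lines <- c(
  "ID\tx\tdescription",
  "1\t0.4\tmy first measurement",
  "2\t1.6\tNormal 5\" object",
  "3\t0.4\tSome measurement",
  "4\t0.7\tA 4\" long sample"
)
tmp <- tempfile(fileext = ".txt")
writeLines(sample_lines, tmp)

# Default quoting: the " in row 2 opens a string that is only closed
# by the " in row 4, so the lines in between are read as one field.
with_quotes <- read.delim(tmp)

# Quoting disabled: every physical line is read as its own row.
no_quotes <- read.delim(tmp, quote = "")

# Compare against the number of data lines in the file (header excluded).
n_data_lines <- length(readLines(tmp)) - 1

cat("data lines in file:     ", n_data_lines, "\n")
cat("rows with default quote:", nrow(with_quotes), "\n")
cat("rows with quote = \"\":   ", nrow(no_quotes), "\n")
```

If the row count with quote="" matches the number of data lines while the default read falls short, stray quotes are the likely cause. On the real 800MB file, reading all lines just to count them may be slow, so treat that step as optional.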