On Feb 24, 2012, at 06:58 , Sam Steingold wrote:

> Hi,
>
> batch is a vector of lines returned by readLines from a
> NL-line-terminated file, here is the relevant section:
> =========================================================
> AA BB CC DD EE FF
> GG H
>
> H JJ KK LL MM
> =========================================================
> as you can see, a line is corrupt; two CRLF's are inserted.

Actually, I don't see... (It's pretty hard to count TAB characters by eye.)

> This is okay, I drop the bad lines, at least I hope I do:
>
>    conn <- textConnection(batch)
>    field.counts <- count.fields(conn, sep="\t", comment.char="", quote="")
>    close(conn)
>    good <- field.counts == 8 # this should drop all bad lines
>    if (!all(good))
>      batch <- batch[good]
>    conn <- textConnection(batch)
>    ret <- read.table(conn, sep="\t", comment.char="", quote="")
>    close(conn)
>
> I get this error in read.table():
>
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, :
>   line 7151 did not have 8 elements
>
> how come?!

You can do better than this in terms of providing clues for us: "batch" is a character vector, right? So recheck that count.fields returns all 8's after removal of bad lines. Also check that dimensions match -- is length(batch) actually the same as length(field.counts)? Finally, what is in line 7151?

>
> also, is there some error recovery?

Well you can try().

> e.g., the code above is a part of a function - is there a way to recover
> batch (without re-running the whole thing)?
>
> Thanks!
>
> --
> Sam Steingold (http://sds.podval.org/) on Ubuntu 11.10 (oneiric) X
> 11.0.11004000
> http://www.childpsy.net/ http://openvotingconsortium.org http://iris.org.il
> http://www.PetitionOnline.com/tap12009/ http://dhimmi.com
> Conscience is like a hamster: it is either asleep or gnawing.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
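
To make the checks above concrete, here is a minimal sketch, not a diagnosis of the actual file. It assumes the corrupt record leaves an empty string in batch. By default count.fields() uses blank.lines.skip = TRUE, so such blank lines get no entry in the result and field.counts can end up shorter than batch; the logical index is then misaligned. The helper name read.batch, its n.fields argument, its return shape and the three-column toy data are all made up for illustration.

```r
# Hypothetical helper (name, arguments and return value are assumptions).
read.batch <- function(batch, n.fields = 8) {
    # blank.lines.skip = FALSE gives one count per element of `batch`;
    # with the default (TRUE), blank lines are left out of the result.
    conn <- textConnection(batch)
    field.counts <- count.fields(conn, sep = "\t", quote = "",
                                 comment.char = "", blank.lines.skip = FALSE)
    close(conn)
    stopifnot(length(field.counts) == length(batch))

    good <- !is.na(field.counts) & field.counts == n.fields
    kept <- batch[good]

    conn <- textConnection(kept)
    on.exit(close(conn))
    # try() keeps an error in read.table() from aborting the function.
    ret <- try(read.table(conn, sep = "\t", quote = "", comment.char = ""),
               silent = TRUE)
    failed <- inherits(ret, "try-error")
    # Return the filtered input too, so it survives a parse failure.
    list(data = if (failed) NULL else ret,
         error = if (failed) ret else NULL,
         kept = kept,
         dropped = which(!good))
}

# Toy data: 3 fields per good line; one record broken by two line breaks.
batch <- c("a\tb\tc", "d\te", "", "f", "g\th\ti")

# The dimension check, using the default arguments as in the original code.
conn <- textConnection(batch)
counts.default <- count.fields(conn, sep = "\t", quote = "", comment.char = "")
close(conn)
length(counts.default) == length(batch)

res <- read.batch(batch, n.fields = 3)
res$dropped   # indices of the discarded lines
res$data      # data frame, or NULL if read.table() failed
```

If res$data comes back NULL, res$error holds the condition from read.table() and res$kept still holds the lines that were passed to it, so the offending line (7151 in the report above) can be inspected without re-running the earlier steps.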

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd....@cbs.dk  Priv: pda...@gmail.com