When you have an unbalanced quote, it may be hard to determine exactly where it is. It is probably up to the user to determine whether there is truncation. In some cases, data enclosed in quotes may legitimately span several lines. You might also read up on the 'fill' and 'flush' parameters, which take care of some other conditions. The 'read.table' functions assume that the data are well formed; if you have concerns about your data, then some preprocessing might be in order. You can do this with external programs like 'perl', or in R by using readLines to read in the data and look for potential problems.
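
As a rough sketch of the kind of check described above (not a definitive recipe), the following uses readLines and count.fields to look for quote characters and uneven field counts, then compares the number of rows read against the number of lines in the file. The sample data, the temporary file, and the variable names are made up for illustration, and the row-count check assumes one header line and no blank lines.

```r
# Small tab-separated sample with two stray single quotes (made-up data).
path <- tempfile(fileext = ".tsv")
writeLines(c(
  "id\tgene\tvalue",
  "a\tSUP7\t1",
  "b\t5' UTR\t2",
  "c\tSUP4\t3",
  "d\t3' end\t4",
  "e\tSUP5\t5"
), path)

# Read the raw lines so each one can be inspected independently of read.table.
raw <- readLines(path)

# Line numbers (header included) that contain a single or double quote.
quote_lines <- grep("[\"']", raw)

# Field counts per line with quote and comment handling switched off;
# any line whose count differs from the header's is malformed in some other way.
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")
odd_lines <- which(n_fields != n_fields[1])

# Read with quoting disabled, then confirm that no rows were lost.
dat <- read.table(path, header = TRUE, sep = "\t",
                  quote = "", comment.char = "")
rows_expected <- length(raw) - 1L  # assumes one header line, no blank lines
if (nrow(dat) != rows_expected) {
  warning("read.table returned ", nrow(dat), " rows; expected ", rows_expected)
}
```

The line numbers reported below seem consistent with quotes pairing up across lines: each pair (3262 and 10403, 17544 and 24685, 31826 and 38967) is 7141 lines apart, and 3 x 7141 = 21423, which matches the number of rows that went missing (42846 data lines minus the 21423 rows read). If that reading is right, it would explain why nothing looks wrong near the last row that was imported.
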

On Thu, Aug 25, 2011 at 12:19 PM, zhenjiang xu <zhenjiang...@gmail.com> wrote:
> Thanks, Jim. quote='' works. And then I found a single quote in each of
> these lines:
> 3262
> 10403
> 17544
> 24685
> 31826
> 38967
> None of them near the position the table got truncated. Why is it?
> And read.table is a great function. Is it possible for it to give a warning
> message when the data gets truncated? In my case I almost looked over the
> truncation...
>
> On Thu, Aug 25, 2011 at 11:57 AM, jim holtman <jholt...@gmail.com> wrote:
>>
>> But did you try the following:
>>
>> x <- read.table(...., comment.char = '', quote = '')
>>
>> Most cases is that there is a missing quote somewhere in your data.
>> use a text editor and search for single and double quotes.
>>
>> On Thu, Aug 25, 2011 at 11:49 AM, zhenjiang xu <zhenjiang...@gmail.com>
>> wrote:
>> > Thanks for your replies. I looked at those lines and didn't spot
>> > anything unusual.
>> >
>> > > tail(a)
>> >       test_id   gene_id gene locus               sample_1 sample_2 status
>> > 21418 tY(GUA)J1 -       SUP7 chr10:354243-354332 air1rrp6 air2rrp6 OK
>> > 21419 tY(GUA)J2 -       SUP4 chr10:542955-543044 air1rrp6 air2rrp6 OK
>> > 21420 tY(GUA)M1 -       SUP5 chr13:168794-168883 air1rrp6 air2rrp6 OK
>> > 21421 tY(GUA)M2 -       SUP8 chr13:837927-838016 air1rrp6 air2rrp6 OK
>> > 21422 tY(GUA)O  -       SUP3 chr15:288191-288280 air1rrp6 air2rrp6 OK
>> > 21423 tY(GUA)Q  -       -    chrmt:70823-70907   air1rrp6 air2rrp6 OK
>> >       value_1 value_2 ln.fold_change. test_stat p_value  q_value  significant
>> > 21418 0.00000 0.0000  0.000000        0.00000   1.000000 1.011650 no
>> > 21419 0.00000 0.0000  0.000000        0.00000   1.000000 1.011480 no
>> > 21420 0.00000 0.0000  0.000000        0.00000   1.000000 1.011500 no
>> > 21421 0.00000 0.0000  0.000000        0.00000   1.000000 1.011520 no
>> > 21422 0.00000 0.0000  0.000000        0.00000   1.000000 1.011550 no
>> > 21423 6.68356 10.7397 0.474301        -1.08614  0.277417 0.455917 no
>> >
>> >
>> > tY(GUA)J1 - SUP7 chr10:354243-354332 rrp6 air1rrp6 OK 0 0 0 0 1 1.00404 no
>> > tY(GUA)J2 - SUP4 chr10:542955-543044 rrp6 air1rrp6 OK 0 0 0 0 1 1.00497 no
>> > tY(GUA)M1 - SUP5 chr13:168794-168883 rrp6 air1rrp6 OK 0 0 0 0 1 1.00492 no
>> > tY(GUA)M2 - SUP8 chr13:837927-838016 rrp6 air1rrp6 OK 0 0 0 0 1 1.00488 no
>> > tY(GUA)O - SUP3 chr15:288191-288280 rrp6 air1rrp6 OK 0 0 0 0 1 1.00485 no
>> > tY(GUA)Q - - chrmt:70823-70907 rrp6 air1rrp6 OK 4.49644 6.68356 0.396365 -0.766052 0.443645 0.634724 no
>> > 15S_rRNA - 15S_RRNA chrmt:6545-8194 WT air2rrp6 OK 2288.88 711.697 -1.16817 2.78772 0.00530801 0.0167772 yes
>> > 21S_rRNA - 21S_RRNA chrmt:58008-62447 WT air2rrp6 OK 4134.59 1927.04 -0.7634 1.58991 0.111855 0.22339 no
>> > ETS1-1 - ETS1-1 chr12:457732-458432 WT air2rrp6 OK 3258.97 1114.76 -1.07277 2.91211 0.00359 0.0121587 yes
>> > ETS1-2 - ETS1-2 chr12:466869-467569 WT air2rrp6 OK 3258.97 1114.76 -1.07277 2.91211 0.00359 0.0121597 yes
>> >
>> >
>> > On Wed, Aug 24, 2011 at 2:34 PM, Sarah Goslee
>> > <sarah.gos...@gmail.com> wrote:
>> >
>> >> Hi,
>> >>
>> >> On Wed, Aug 24, 2011 at 2:18 PM, zhenjiang xu <zhenjiang...@gmail.com>
>> >> wrote:
>> >> > Hi R users,
>> >> >
>> >> > I was using read.table to read a file. The data.fame looked alright,
>> >> > but I found not all rows are read by the read.table. What's wrong
>> >> > with it? It didn't give me any warning or error messages. Why the
>> >> > data are truncated?
>> >> > Thanks.
>> >> >
>> >> > $ wc -l all/isoform_exp.diff
>> >> > 42847 all/isoform_exp.diff
>> >> >
>> >> > > a=read.table('all/isoform_exp.diff', header=T, sep='\t')
>> >> > > nrow(a)
>> >> > [1] 21423
>> >>
>> >> This is a common problem. You need to take a look at the last row that
>> >> was imported, and the rows around 21423 in the original file.
>> >>
>> >> Common causes include stray single or double quotation marks, and
>> >> other special characters in your file like the default comment.char #
>> >>
>> >> Sarah
>> >>
>> >> --
>> >> Sarah Goslee
>> >> http://www.functionaldiversity.org
>> >>
>> >
>> > --
>> > Best,
>> > Zhenjiang
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>> >
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>
> --
> Best,
> Zhenjiang
>

--
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?