That worked!
download.file(myurl, destfile=myfile, mode="wb")
Thanks a lot,
paolo
On 01/21/2011 02:53 PM, William Dunlap wrote:
Try mode="wb" ('b' for binary mode) in the
call to download.file(). It should make a
difference on Windows (& Mac?) and be innocuous on
Unix.
Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
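
The advice above can be sketched as two small helpers. This is only a minimal illustration, not code from the thread: the names download_binary and read_gz_csv are hypothetical, and reading through gzfile() directly (instead of the gzcon/textConnection chain in the original message) is a simplification assumed here.

```r
# Download a file in binary mode so the bytes are stored unchanged.
# mode = "wb" is the setting the thread found necessary on Windows.
# (download_binary is a hypothetical helper name.)
download_binary <- function(url, destfile) {
  status <- download.file(url, destfile = destfile, mode = "wb")
  if (status != 0) stop("download failed with status ", status)
  invisible(destfile)
}

# Read a gzip-compressed CSV. gzfile() decompresses on the fly and
# read.csv() opens and closes the connection itself, so no gzcon()
# or textConnection() wrapper is needed.
# (read_gz_csv is a hypothetical helper name.)
read_gz_csv <- function(path, ...) {
  read.csv(gzfile(path), ...)
}
```

With the names from the original message, this would be called as download_binary(myurl, myfile) followed by mydata <- read_gz_csv(myfile).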
-----Original Message-----
From: r-help-boun...@r-project.org
[mailto:r-help-boun...@r-project.org] On Behalf Of Paolo Innocenti
Sent: Thursday, January 20, 2011 4:39 PM
To: r-help@r-project.org
Subject: [R] Reading gz compressed csv file - 'incomplete line found'
Hi all,
I am trying to download, decompress and read a csv file. My code:
myurl<- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE24729/GSE24729_MitoNuclear_suppl_male_stats.csv.gz"
#
myfile<- "GSE24729_MitoNuclear_suppl_male_stats.csv.gz"
#
download.file(myurl, destfile=myfile, mode="w")
#
mycon<- gzcon(gzfile(myfile, open="r"))
#
mydata<- read.csv(textConnection(readLines(mycon)))
#
close(mycon)
This works under my Linux distribution, but under Windows I get the
following warning:
> myurl<- "ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE24729/GSE24729_MitoNuclear_suppl_male_stats.csv.gz"
> myfile<- "GSE24729_MitoNuclear_suppl_male_stats.csv.gz"
> download.file(myurl, destfile=myfile, mode="w")
trying URL 'ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE24729/GSE24729_MitoNuclear_suppl_male_stats.csv.gz'
ftp data connection made, file length 535641 bytes
opened URL
downloaded 523 Kb
> mycon<- gzcon(gzfile(myfile, open="r"))
> mydata<- read.csv(textConnection(readLines(mycon)))
Warning message:
In readLines(mycon) :
incomplete final line found on 'gzcon(GSE24729_MitoNuclear_suppl_male_stats.csv.gz)'
> close(mycon)
I can read only 30 lines, and then it stops working. Does anyone have
any suggestions? I suspect the problem lies in gzcon/gzfile not
decompressing properly, or in some other problem with the end of
line/end of file, but the help files are a bit above my level of
understanding.
Thanks,
paolo
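
One way to tell whether the file was damaged during download, rather than during decompression, is to inspect the file on disk before reading it. The sketch below is only an illustration under stated assumptions: the helper names has_gzip_magic and size_matches are hypothetical, and the idea that a text-mode download on Windows changes the byte count is an assumption about the likely cause, not something the transcript proves.

```r
# TRUE if the file starts with the gzip magic bytes 0x1f 0x8b.
# (has_gzip_magic is a hypothetical helper name.)
has_gzip_magic <- function(path) {
  bytes <- readBin(path, what = "raw", n = 2L)
  length(bytes) == 2L && identical(bytes, as.raw(c(0x1f, 0x8b)))
}

# TRUE if the size on disk equals the size the server reported.
# (size_matches is a hypothetical helper name.)
size_matches <- function(path, expected_bytes) {
  isTRUE(file.info(path)$size == expected_bytes)
}
```

For the download shown above, the server reported a file length of 535641 bytes, so size_matches(myfile, 535641) returning FALSE would suggest the bytes were altered while being written.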
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: i386-pc-mingw32/i386 (32-bit)
locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252
attached base packages:
[1] grid      stats     graphics  grDevices utils     datasets  methods
[8] base
other attached packages:
[1] lattice_0.19-13 drosophila2.db_2.4.5 org.Dm.eg.db_2.4.6
[4] GOstats_2.16.0 RSQLite_0.9-4 DBI_0.2-5
[7] graph_1.28.0 Category_2.16.0 AnnotationDbi_1.12.0
[10] xtable_1.5-6 GEOquery_2.16.3 ellipse_0.3-5
[13] RColorBrewer_1.0-2 hopach_2.10.0 cluster_1.13.2
[16] limma_3.6.9 genefilter_1.32.0 vsn_3.18.0
[19] affy_1.28.0 Biobase_2.10.0
loaded via a namespace (and not attached):
[1] affyio_1.18.0 annotate_1.28.0 GO.db_2.4.5
[4] GSEABase_1.12.2 preprocessCore_1.12.0 RBGL_1.26.0
[7] RCurl_1.5-0.1 splines_2.12.1 survival_2.36-2
[10] tools_2.12.1 XML_3.2-0.2
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.