Hi Stanley.
CHD850 wrote:
Hi everyone,
I have to fetch about 300 to 500 zipped archives from a remote FTP server.
Each archive is about 1 MB. I know I can get it done using
download.file() in R, but I am curious whether there is a faster way to do
this using RCurl. For example, are there some parameters I can set so that
the connection does not need to be rebuilt each time, etc.?
Yes, curl can keep connections alive. One can create a curl handle with
h = getCurlHandle()
and then use this in subsequent, related calls, e.g.
getURLContent("ftp://....", curl = h)
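For instance, to convince yourself the connection is being reused, you
can turn on curl's verbose output on the handle (the host and file
names below are just placeholders):

library(RCurl)
h = getCurlHandle(verbose = TRUE)
# On the second call, curl's verbose output should report that it is
# re-using the existing FTP connection rather than logging in again.
x = getURLContent("ftp://example.com/a.zip", curl = h)
y = getURLContent("ftp://example.com/b.zip", curl = h)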
Keeping the connection alive is more common in HTTP and can be done
explicitly by specifying
Connection = "Keep-Alive"
as one of the values for httpheader, but that applies only to HTTP. For
FTP, I'd have to look up the relevant curl options.
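For the HTTP case, a minimal sketch would be (the URLs are placeholders):

library(RCurl)
# Ask the server to keep the connection open between requests.
h = getCurlHandle(httpheader = c(Connection = "Keep-Alive"))
p1 = getURLContent("http://example.com/page1", curl = h)
p2 = getURLContent("http://example.com/page2", curl = h)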
In addition to using a single handle across multiple calls, one
can use the multi-curl interface within RCurl, which allows one
to make many asynchronous requests and process the replies as they
arrive. This can often be faster than making the same number of
requests sequentially.
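As a sketch, using getURIAsynchronous() (the URLs are placeholders, and
the binary argument is my assumption since the archives are not text):

library(RCurl)
uris = c("ftp://example.com/a.zip", "ftp://example.com/b.zip")
# All the requests are issued on one multi handle and performed
# concurrently; the result holds the downloaded contents, one per URI.
contents = getURIAsynchronous(uris, binary = rep(TRUE, length(uris)))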
An even simpler question is: how can I fetch an archive from the server and
place it somewhere locally? I have spent a lot of time reading RCurl
documents and curl web pages but in vain. Can someone show me an example of
the syntax? Pardon me if this is trivial to you.
I would use something like
content = getURLContent("ftp://...../foo.zip")
attributes(content) = NULL
writeBin(content, "/tmp/foo.zip")
and that should be sufficient.
(You have to strip the attributes or writeBin() complains.)
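Putting the pieces together for your 300-500 archives, something along
these lines should work (the host, file names, and destination
directory are placeholders; binary = TRUE is my assumption, to force
raw content for the zip files):

library(RCurl)
h = getCurlHandle()                  # one handle, reused for every file
files = c("foo1.zip", "foo2.zip")    # ... your 300-500 archive names
for (f in files) {
    content = getURLContent(paste("ftp://example.com/", f, sep = ""),
                            curl = h, binary = TRUE)
    attributes(content) = NULL       # writeBin() wants a plain raw vector
    writeBin(content, file.path("/tmp", f))
}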
Thanks
Stanley