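The likely problem is that read.table.ffdf uses the nrows argument to read the file in chunks, whereas read.fwf has no nrows argument of its own; it uses n instead. That would fit what you see: read.fwf passes arguments it does not know on to read.table, so with n left at -1 it converts the whole file into a temporary file, after which read.table reads only the rows requested for the first chunk.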

One (untested) solution is to write a wrapper around read.fwf and pass this wrapper to read.table.ffdf. Something like:

my.read.fwf <- function(file, nrows=-1, ...) {
   # read.table.ffdf passes nrows; read.fwf expects n
   read.fwf(file=file, n=nrows, ...)
}

Perhaps you'll also need to wrap some additional arguments.
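
A small self-contained check of the wrapper idea, using only base R. It writes a tiny fixed-width file to a temporary location and confirms that the wrapper turns nrows into n. The file contents, widths and column names are made up for illustration, and the wrapper argument is named nrows here because that is the name read.table.ffdf passes along. The call to read.table.ffdf itself is only shown as a comment and is just as untested as the wrapper above.

```r
# Wrapper that accepts nrows and forwards it to read.fwf as n.
my.read.fwf <- function(file, nrows = -1, ...) {
  utils::read.fwf(file = file, n = nrows, ...)
}

# Hypothetical toy data: three columns of widths 2, 3 and 1.
tmp <- tempfile(fileext = ".txt")
writeLines(c("01abcX", "02defY", "03ghiZ", "04jklW"), tmp)

# Ask for the first two records only, as a chunked reader would.
chunk <- my.read.fwf(tmp, nrows = 2, widths = c(2, 3, 1),
                     col.names = c("id", "code", "flag"))
print(dim(chunk))

# Without nrows the wrapper reads the whole file (n = -1).
full <- my.read.fwf(tmp, widths = c(2, 3, 1),
                    col.names = c("id", "code", "flag"))
print(dim(full))

# Untested sketch of the intended use, following the call in the question:
# read.table.ffdf(file = my.filename, FUN = "my.read.fwf", widths = my.format,
#                 asffdf_args = list(col_args = list(pattern = my.pattern)))
```

If the two-row request really returns two rows, the wrapper does what read.table.ffdf needs from its FUN argument as far as the row count goes; other arguments may still need the same treatment.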


read.fwf is terribly slow for large fixed-width files. I would advise using the LaF package in combination with the laf_to_ffdf function from the ffbase package. Although, judging from your other question, you have already looked at that.
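
A minimal sketch of the LaF route, assuming the LaF and ffbase packages are installed (the block does nothing otherwise). The toy file, column types, widths and names are invented for illustration; for the real file they would come from your format description.

```r
# Hypothetical toy file: two integer columns of widths 2 and 3.
tmp <- tempfile(fileext = ".txt")
writeLines(c("01123", "02456", "03789", "04012"), tmp)

dat <- NULL
if (requireNamespace("LaF", quietly = TRUE) &&
    requireNamespace("ffbase", quietly = TRUE)) {
  library(LaF)
  library(ffbase)

  # Open the fixed-width file without loading it into memory.
  laf <- laf_open_fwf(tmp,
                      column_types  = c("integer", "integer"),
                      column_widths = c(2, 3),
                      column_names  = c("id", "value"))

  # Convert to an ffdf, reading the file block by block.
  dat <- laf_to_ffdf(laf)
  print(dim(dat))
}
```

laf_to_ffdf also takes an nrows argument for the block size, which may be worth lowering on a 32-bit machine.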

HTH,
Jan



On 08/06/2013 10:47 AM, christian.kame...@astra.admin.ch wrote:
Dear all

I am working on Windows 7 32-bit, and the ff package is my daily life-saver to 
overcome the inherent memory limitations. Recently, I tried using 
read.table.ffdf to import data from a fixed-width ASCII file (file size: 
1'440'865'015 Bytes) with 6'079'455 lines and 32 variables using the command
read.table.ffdf(file=my.filename, FUN="read.fwf", width=my.format, 
asffdf_args=list(col_args=list(pattern = my.pattern)))

The command generates a temporary file, which has 1'629'328'120 Bytes, plus 32 
ff files following my.pattern. The latter 32 files, however, only take up 
136'000 Bytes. And the resulting R object has a dimension of 1000 x 32. To me, 
it seems that read.table.ffdf aborts the data import after 1000 lines, instead 
of importing the entire file.

I tried running read.table.ffdf with different parameter settings, I was 
browsing the help pages and the mailing lists, but I did not find any hint on 
why read.table.ffdf aborts the data import. (Does it really? - The file size of 
the temporary file suggests that all data were read.)

Any help would be highly appreciated

Best regards

Christian Kamenik
Project Manager

Federal Department of the Environment, Transport, Energy and Communications 
DETEC
Federal Roads Office FEDRO
Division Road Traffic
Road Accident Statistics

Mailing Address: 3003 Bern
Location: Weltpoststrasse 5, 3015 Bern

Tel +41 31 323 14 89
Fax +41 31 323 43 21

christian.kame...@astra.admin.ch<mailto:christian.kame...@astra.admin.ch>
www.astra.admin.ch<http://www.astra.admin.ch/>



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


