On May 4, 2012, at 1:34 AM, iliketurtles wrote:

Dear Experienced R Practitioners,

I have a 4GB .txt file called "dataset.txt" and have attempted to use the ff,
bigmemory, filehash and sqldf packages to import it, but have had no
success. The readLines output of this data is:


The alignment of that output makes me wonder if the file is tab-separated. You have considered the possibility that tab is the separator, but have you actually tried using sep = "\t" in your read operations?
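
Something along these lines would settle it quickly (an untested sketch; the file name is yours, and skip = 5 / nrows = 100 are only guesses based on your sample):

x <- readLines("dataset.txt", n = 20)
any(grepl("\t", x))    # TRUE means there really are tab characters in the lines

## small trial read with an explicit tab separator
smpl <- read.table("dataset.txt", sep = "\t", skip = 5, nrows = 100,
                   header = FALSE, fill = TRUE, strip.white = TRUE)
str(smpl)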

--
David.
readLines("dataset.txt", n = 20)
 [1] " "
 [2] ""
 [3] " "
 [4] "  PERMNO          DATE    SHRCD    COMNAM                PRC           VOL"
 [5] ""
 [6] "   10001    01/09/1986     11      GREAT FALLS GAS CO        -5.75000         14160"
 [7] "   10001    01/10/1986     11      GREAT FALLS GAS CO        -5.87500             0"
 [8] "   10001    01/13/1986     11      GREAT FALLS GAS CO        -5.87500          2805"
 [9] "   10001    01/14/1986     11      GREAT FALLS GAS CO
[20] "   10001    01/29/1986     11      GREAT FALLS GAS CO        -6.06250          4600"

This data goes on for a huge number of rows (I'm not sure exactly how many). Each element in each row is separated by an uneven number of (what seem to be) spaces (maybe tabs? not sure). Further, some rows are
"incomplete", i.e. there are missing elements.

I took the first 29 rows of "dataset.txt" and put them into a separate file, call it "dataset2.txt". read.table("dataset2.txt", skip = 5) gives exactly the table I want to end up with, except that I want to do the same with the full 4GB of data through
bigmemory, ff or filehash.
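
A minimal sketch of the kind of ff call that might do this (untested; it assumes tab-separated fields and the same five junk lines at the top as in the sample, and the column names/classes are guessed from the printed output above, not taken from the original data):

library(ff)

dat <- read.table.ffdf(file = "dataset.txt",
                       sep = "\t", skip = 5, header = FALSE,
                       col.names  = c("PERMNO", "DATE", "SHRCD",
                                      "COMNAM", "PRC", "VOL"),
                       colClasses = c("integer", "factor", "integer",
                                      "factor", "numeric", "numeric"),
                       next.rows  = 500000)   # rows per chunk pulled into RAM

dim(dat)   # an ffdf object backed by files on disk rather than by RAM

For what it's worth, read.big.matrix() in bigmemory stores a single numeric type, so (as far as I recall) the character COMNAM column would have to be dropped or recoded before that route could work.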

snipped several failed attempts

NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA  NA

#Even worse.
###/*MY ATTEMPT USING sqldf*/###
No idea what to do here.
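
For the sqldf route, a sketch along these lines (untested; the tab separator and skip = 5 are the same assumptions as above, and ?read.csv.sql should be checked for the exact arguments) loads the file into a temporary SQLite database instead of R's memory:

library(sqldf)

dat <- read.csv.sql("dataset.txt",
                    sql    = "select * from file",  # 'file' is the table name sqldf gives the text file
                    header = FALSE,
                    sep    = "\t",
                    skip   = 5)

Because the work happens in SQLite, the sql string can also carry a WHERE clause so that only the rows of interest ever reach R.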

-----

David Winsemius, MD

West Hartford, CT

