Perhaps you could contact the people who supplied/created the file and
ask them exactly what the format of the file is. That is probably the
safest thing to do.
If you are sure that the lines containing only whitespace are
meaningless, then you could alter the previous code to make a copy
Your code works!
strangelines.txt was created, and it's a text file with just spaces ...
Seems like a few thousand lines of complete blanks (not a single non-blank entry).
One thing: when I ran your code, there was an error message:
> setwd("C:/Users/admin/Desktop/hons/Thesis")
> con <- file("dataset
OK, not all, but most lines have the same length. Perhaps you could
write the lines with a different line size to a separate file to have
a closer look at those lines. Modifying the previous code (again not
tested):
con <- file("dataset.txt", "rt")
out <- file("strangelines.txt", "wt")
#
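The quoted code is cut off here, so what follows is not a reconstruction of it. It is only a minimal sketch of the idea described above: read the file in chunks and copy every line whose width differs from the common one into a separate file. The function name write_odd_lines, its arguments, and the small demo file are assumptions made for illustration; the width 97 is taken from the line-size table reported elsewhere in this thread.

```r
# Hypothetical helper (not from the thread): copy every line whose width
# differs from `expected_len` to `outfile`, reading `infile` chunk by chunk
# so that a multi-GB file never has to fit in memory at once.
write_odd_lines <- function(infile, outfile, expected_len, chunk_size = 100000L) {
  con <- file(infile, "rt")
  out <- file(outfile, "wt")
  on.exit({ close(con); close(out) })
  n_odd <- 0L
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break          # end of file reached
    # type = "bytes" avoids errors on invalid multibyte strings
    odd <- lines[nchar(lines, type = "bytes") != expected_len]
    if (length(odd) > 0L) writeLines(odd, out)
    n_odd <- n_odd + length(odd)
  }
  n_odd                                      # number of lines written
}

# Demo on a tiny stand-in file (assumed data, not the real dataset.txt)
demo_in  <- tempfile(fileext = ".txt")
demo_out <- tempfile(fileext = ".txt")
writeLines(c(strrep("x", 97), "", " ", strrep("x", 97), strrep(" ", 256)), demo_in)
n_written <- write_odd_lines(demo_in, demo_out, expected_len = 97L, chunk_size = 2L)
odd_lines <- readLines(demo_out)
```

For the real file the call would presumably be write_odd_lines("dataset.txt", "strangelines.txt", 97L), with a chunk size chosen to suit the available memory.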
Hi David,
I've tried using sep="\t" but it doesn't work, unfortunately.
Thanks for your help.
-
Isaac
Research Assistant
Quantitative Finance Faculty, UTS
--
View this message in context:
http://r.789695.n4.nabble.com/Can-t-import-this-4GB-DATASET-tp4607862p4608936.html
Sent from the
Jan, thank you.
> table(line_sizes)
line_sizes
       0        1       97      256
    1430     2860 46869069     1430
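For reference, a tally like the one above can be built without loading the whole file. This is a minimal sketch under assumptions: the function name count_line_sizes, the chunk size, and the demo file are all hypothetical, and it is not necessarily how line_sizes was computed in this thread.

```r
# Hypothetical helper: count how many lines of each width a file contains,
# reading it in chunks and accumulating the per-chunk tables.
count_line_sizes <- function(infile, chunk_size = 100000L) {
  con <- file(infile, "rt")
  on.exit(close(con))
  totals <- numeric(0)                       # named vector: width -> count
  repeat {
    lines <- readLines(con, n = chunk_size)
    if (length(lines) == 0L) break           # end of file reached
    counts <- table(nchar(lines, type = "bytes"))
    for (s in names(counts)) {
      # totals[s] is NA the first time a width is seen; na.rm treats it as 0
      totals[s] <- sum(totals[s], counts[[s]], na.rm = TRUE)
    }
  }
  totals[order(as.integer(names(totals)))]   # sort by width
}

# Demo on a tiny stand-in file (assumed data, not the real dataset.txt)
demo_file <- tempfile(fileext = ".txt")
writeLines(c(strrep("x", 97), "", " ", strrep("x", 97), strrep(" ", 256)), demo_file)
sizes <- count_line_sizes(demo_file, chunk_size = 2L)
```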
-
Isaac
Research Assistant
Quantitative Finance Faculty, UTS
On May 4, 2012, at 1:34 AM, iliketurtles wrote:
Dear Experienced R Practitioners,
I have a 4GB .txt file called "dataset.txt" and have attempted to use the
*ff, bigmemory, filehash and sqldf* packages to import it, but have had no
success. The readLines output of this data is:
Ther alignment o
read.table imports the company name "GREAT FALLS GAS CO" as four
separate columns. I think that needs to be one column. I can imagine
that further on in your file you will have another company name that
does not consist of four words which would cause the error you
observed. From your ou
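To illustrate the point about multi-word company names, here is a minimal sketch. It assumes the file is fixed-width, which the near-uniform line lengths suggest but the thread does not confirm. The column widths (20 and 6), the column names, the second company, and the price values are all made up for the demo.

```r
# Build a tiny fixed-width stand-in file (hypothetical layout and data)
demo <- tempfile(fileext = ".txt")
writeLines(c(
  sprintf("%-20s%6s", "GREAT FALLS GAS CO", "12.50"),
  sprintf("%-20s%6s", "ACME CORP", "7.25")
), demo)

# Whitespace-separated parsing splits the company name into several columns;
# fill = TRUE pads the shorter row instead of stopping with an error.
by_space <- read.table(demo, fill = TRUE, stringsAsFactors = FALSE)

# Fixed-width parsing keeps the whole name in one column.
by_width <- read.fwf(demo, widths = c(20, 6),
                     col.names = c("company", "price"),
                     strip.white = TRUE, stringsAsFactors = FALSE)
```

Here by_space ends up with five columns, whereas by_width has two, with the full company name in the first. For the real file the widths would have to come from whoever produced the data.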
Dear Experienced R Practitioners,
I have a 4GB .txt file called "dataset.txt" and have attempted to use the
*ff, bigmemory, filehash and sqldf* packages to import it, but have had no
success. The readLines output of this data is:
readLines("dataset.txt",n=20)
[1] " "