Re: [R] Reading large files with R

2019-09-01 Thread Martin Møller Skarbiniks Pedersen
On Sun, 1 Sep 2019 at 21:53, Duncan Murdoch wrote: > On 01/09/2019 3:06 p.m., Martin Møller Skarbiniks Pedersen wrote: > > Hi, > > I am trying to read a yaml-file which is not so large (7 GB) and I have > > plenty of memory. > Individual elements in character vectors have a size limit of 2^31 - 1 bytes …

Re: [R] Reading large files with R

2019-09-01 Thread Duncan Murdoch
On 01/09/2019 3:06 p.m., Martin Møller Skarbiniks Pedersen wrote: Hi, I am trying to read a yaml-file which is not so large (7 GB) and I have plenty of memory. However I get this error: $ R --version R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing …

[R] Reading large files with R

2019-09-01 Thread Martin Møller Skarbiniks Pedersen
Hi, I am trying to read a yaml-file which is not so large (7 GB) and I have plenty of memory. However I get this error: $ R --version R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) library(yaml) …
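
Duncan's point above is that the 2^31 - 1 byte limit applies to a single character element, not to the file, so the usual workaround is to stream the file through a connection in chunks. A minimal sketch, with a hypothetical file name and chunk size (a real YAML file would additionally need a streaming parser, which base R does not provide):

    ## Stream a large file in fixed-size chunks so that no single R
    ## string ever has to hold all 7 GB at once.
    con <- file("big.yaml", open = "r")   # hypothetical file name
    repeat {
      chunk <- readLines(con, n = 1e6)    # 1e6 lines per chunk (illustrative)
      if (length(chunk) == 0L) break
      ## ... process this chunk before reading the next one ...
    }
    close(con)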

Re: [R] Reading large files

2010-02-06 Thread Saptarshi Guha
Hello, Do you need /all/ the data in memory at one time? Is your goal to divide the data (e.g. according to some factor /or/ some function of the columns of the data set) and then analyze the divisions? And then, possibly, combine the results? If so, you might consider using Rhipe. We have analyzed (e.g. …
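
The divide-and-recombine pattern Rhipe implements can be sketched in base R on a small built-in data set; Rhipe itself distributes the same idea over Hadoop:

    ## Divide by a factor, analyze each division, recombine the results.
    divisions <- split(mtcars, mtcars$cyl)                   # divide
    results   <- lapply(divisions, function(d) mean(d$mpg))  # analyze
    combined  <- do.call(rbind, results)                     # recombine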

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
… then the filename should be ignored. Is this how it works? >> Thanks. >> Satish >> -Original Message- >> From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] >> Sent: Saturday, February 06, 2010 4:58 PM >> To: …

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
…-Original Message- > From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] > Sent: Saturday, February 06, 2010 4:58 PM > To: Vadlamani, Satish {FLNA} > Cc: r-help@r-project.org > Subject: Re: [R] Reading large files > I have uploaded another version which suppresses display of the error message but otherwise works the same …

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
…Sent: Saturday, February 06, 2010 4:58 PM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files I have uploaded another version which suppresses display of the error message but otherwise works the same. Omitting the redundant arguments we have: library(sqldf) # next line …
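
The call being abbreviated here is reconstructed below as a sketch, based on the full invocation quoted later in the thread (the data file and perl filter are the poster's):

    library(sqldf)
    ## filter= pipes the raw file through an external command before
    ## SQLite loads it, so R never parses the unfiltered text.
    df <- read.csv.sql("3wkoutstatfcst_small.dat",
                       filter = "perl parse_3wkout.pl",
                       dbname = tempfile())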

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
…(3wkoutstatfcst_small.dat) > 3: closing unused connection 3 (3wkoutstatfcst_small.dat) >> test_df > allgeo area1 zone dist ccust1 whse bindc ccust2 account area2 ccust3 > 1 A 4 1 37 99 4925 4925 99 99 4 99 > 2 A 4 1 37 99 4925 4925 9…

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
…-Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Saturday, February 06, 2010 4:28 PM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files The software attempts to read the registry and temporarily augment the path in case you have R…

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
…("out.txt")) > user system elapsed > 192.53 15.50 213.68 > Warning message: > closing unused connection 3 (out.txt) > Thanks again. > Satish > -Original Message- > From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] > Sent: Saturday, February 06, 2…

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
…Satish -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Saturday, February 06, 2010 3:02 PM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files Note that you can shorten #1 to read.csv.sql("out.txt") since your other …
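
A sketch of the shortened call and its fully spelled-out equivalent, assuming the remaining arguments were indeed the package defaults:

    library(sqldf)
    ## Both calls do the same thing: sql, header and sep shown here are
    ## the defaults, and dbname = tempfile() keeps the database on disk.
    df1 <- read.csv.sql("out.txt")
    df2 <- read.csv.sql("out.txt", sql = "select * from file",
                        header = TRUE, sep = ",", dbname = tempfile())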

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
>> df <- read.csv2.sql(file = "3wkoutstatfcst_small.dat", sql = "select * from file", header = TRUE, sep = ",", filter = "perl parse_3wkout.pl", dbname = tempfile()) > Error in readRegistry(key, maxdepth = 3) : Registry key 'SOFTWARE\R-core' not found …

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
…'SOFTWARE\R-core' not found -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Saturday, February 06, 2010 12:14 PM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files No. On Sat, Feb 6, 2010 at 1:01 PM, Vadlamani, …

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
…Sent: Saturday, February 06, 2010 9:41 AM > To: Vadlamani, Satish {FLNA} > Cc: r-help@r-project.org > Subject: Re: [R] Reading large files > > It's just any Windows batch command string that filters stdin to stdout. What the command consists of should not be important. An invocation of perl that runs …

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
Gabor: Can I pass colClasses as a vector to read.csv.sql? Thanks. Satish -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Saturday, February 06, 2010 9:41 AM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files
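
The answer in the thread is no; a common workaround is to coerce the classes after the read. A sketch, with column names taken from the poster's printed output:

    library(sqldf)
    ## colClasses cannot be passed as a vector here, so convert the
    ## columns after the data frame comes back from SQLite.
    df <- read.csv.sql("out.txt", dbname = tempfile())
    df$zone <- as.integer(df$zone)     # column names from the poster's data
    df$whse <- as.character(df$whse)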

Re: [R] Reading large files

2010-02-06 Thread Gabor Grothendieck
…-Original Message- > From: jim holtman [mailto:jholt...@gmail.com] > Sent: Saturday, February 06, 2010 6:16 AM > To: Gabor Grothendieck > Cc: Vadlamani, Satish {FLNA}; r-help@r-project.org > Subject: Re: [R] Reading large files > In perl the 'unpack' command makes it very easy to …

Re: [R] Reading large files

2010-02-06 Thread Vadlamani, Satish {FLNA}
…again. Satish -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Saturday, February 06, 2010 6:16 AM To: Gabor Grothendieck Cc: Vadlamani, Satish {FLNA}; r-help@r-project.org Subject: Re: [R] Reading large files In perl the 'unpack' command makes it very easy to …

Re: [R] Reading large files

2010-02-06 Thread jim holtman
…[mailto:ggrothendi...@gmail.com] >> Sent: Friday, February 05, 2010 5:16 PM >> To: Vadlamani, Satish {FLNA} >> Cc: r-help@r-project.org >> Subject: Re: [R] Reading large files >> If your problem is just how long it takes to load the file into R try read.csv.sql …

Re: [R] Reading large files

2010-02-05 Thread Gabor Grothendieck
…-Original Message- > From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] > Sent: Friday, February 05, 2010 5:16 PM > To: Vadlamani, Satish {FLNA} > Cc: r-help@r-project.org > Subject: Re: [R] Reading large files > If your problem is just how long it takes to load the file into R …

Re: [R] Reading large files

2010-02-05 Thread Vadlamani, Satish {FLNA}
…know your thoughts on the approach? Satish -Original Message- From: Gabor Grothendieck [mailto:ggrothendi...@gmail.com] Sent: Friday, February 05, 2010 5:16 PM To: Vadlamani, Satish {FLNA} Cc: r-help@r-project.org Subject: Re: [R] Reading large files If your problem is just how long it takes to load the file into R …

Re: [R] Reading large files

2010-02-05 Thread Gabor Grothendieck
If your problem is just how long it takes to load the file into R, try read.csv.sql in the sqldf package. A single read.csv.sql call can create an SQLite database and table layout for you, read the file into the database (without going through R, so R can't slow this down), and extract all or a portion into R …
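
A minimal sketch of the pattern described, with a hypothetical file name and where-clause:

    library(sqldf)
    ## SQLite parses and filters the file; only the selected rows
    ## are ever converted into an R data frame.
    df <- read.csv.sql("big.csv",
                       sql = "select * from file where area1 = 4",
                       dbname = tempfile())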

Re: [R] Reading large files

2010-02-05 Thread jim holtman
What you need to do is to take a smaller sample of your data (e.g. 50-100 MB), load that data, and determine how big the resulting object is. It depends a lot on how you are loading it. Are you using 'scan' or 'read.table'? If 'read.table', have you defined the class of the columns? I typically read in …
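
A sketch of the sizing exercise being suggested, with a hypothetical file name and column classes:

    ## Read a fixed number of rows with explicit column classes,
    ## measure the result, then scale up by the full file's line count.
    sample_df <- read.table("sample.dat", header = TRUE, nrows = 100000,
                            colClasses = c("character", rep("numeric", 5)))
    print(object.size(sample_df), units = "MB")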

Re: [R] Reading large files

2010-02-05 Thread Charlie Sharpsteen
On Thu, Feb 4, 2010 at 5:27 PM, Vadlamani, Satish {FLNA} wrote: > Folks: > I am trying to read in a large file. Definition of large is: > Number of lines: 333,250 > Size: 850 MB Perhaps this post by JD Long will provide an example that is suitable to your situation: http://www.cerebralmasticat…

Re: [R] Reading large files

2010-02-05 Thread Matthew Dowle
I can't help you further than what's already been posted to you. Maybe someone else can. Best of luck. "Satish Vadlamani" wrote in message news:1265397089104-1470667.p...@n4.nabble.com... > Matthew: > If it is going to help, here is the explanation. I have an end state in > mind. It is given below …

Re: [R] Reading large files

2010-02-05 Thread Satish Vadlamani
Matthew: If it is going to help, here is the explanation. I have an end state in mind. It is given below under the "End State" header. In order to get there, I need to start somewhere, right? I started with an 850 MB file and could not load it in what I think is a reasonable time (I waited for an hour). The …

Re: [R] Reading large files

2010-02-05 Thread Matthew Dowle
I agree with Jim. The term "do analysis" is almost meaningless; the posting guide makes reference to statements such as that. At least he tried to define large, but inconsistently (first 850 MB, then changed to 10-20 GB, then 15 GB). > Satish wrote: "at one time I will need to load say 15GB into R" …

Re: [R] Reading large files

2010-02-05 Thread jim holtman
Where should we shine it? No information provided on operating system, version, memory, size of files, what you want to do with them, etc. Lots of options: put it in a database, read a partial file (lines and/or columns), preprocess, etc. Your option. On Fri, Feb 5, 2010 at 8:03 AM, Satish Vadlamani wrote: …

Re: [R] Reading large files

2010-02-05 Thread Satish Vadlamani
Folks: Can anyone throw some light on this? Thanks. Satish - Satish Vadlamani -- View this message in context: http://n4.nabble.com/Reading-large-files-tp1469691p1470169.html Sent from the R help mailing list archive at Nabble.com. …

Re: [R] Reading large files

2010-02-04 Thread satishv
Folks: Suppose I divide the USA into 16 regions. My end goal is to run data mining / analysis on each of these 16 regions. The data for each of these regions (sales, forecast, etc.) will be in the range of 10-20 GB. At one time, I will need to load say 15 GB into R and then do analysis. Is this something …

[R] Reading large files

2010-02-04 Thread Vadlamani, Satish {FLNA}
Folks: I am trying to read in a large file. Definition of large is: Number of lines: 333,250 Size: 850 MB The machine is a dual-core Intel with 4 GB RAM and nothing else running on it. I read the previous threads on read.fwf and did not see any conclusive statements on how to read fast. Example …
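
For what it's worth, the two read.fwf arguments that usually matter most for speed are colClasses (passed through to read.table) and buffersize; a sketch with hypothetical field widths:

    ## Hypothetical 4-field fixed-width layout; declaring classes and a
    ## generous buffer avoids repeated type-guessing and re-allocation.
    df <- read.fwf("bigfile.dat",
                   widths     = c(5, 2, 8, 10),
                   colClasses = c("character", "integer", "numeric", "numeric"),
                   buffersize = 100000)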

Re: [R] Reading large files quickly; resolved

2009-05-11 Thread Rob Steele
Rob Steele wrote: > I'm finding that readLines() and read.fwf() take nearly two hours to > work through a 3.5 GB file, even when reading in large (100 MB) chunks. > The unix command wc by contrast processes the same file in three > minutes. Is there a faster way to read files in R? > > Thanks! …

Re: [R] Reading large files quickly

2009-05-10 Thread Rob Steele
At the moment I'm just reading the large file to see how fast it goes. Eventually, if I can get the read time down, I'll write out a processed version. Thanks for suggesting scan(); I'll try it. Rob jim holtman wrote: > Since you are reading it in chunks, I assume that you are writing out each …
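
A sketch of the scan() approach, with hypothetical fields; the typed 'what' template is what makes scan faster than read.table's guessing:

    ## scan() reads straight into pre-typed vectors.
    fields <- scan("big.dat", sep = ",", skip = 1, quiet = TRUE,
                   what = list(id = character(), x = numeric(), y = numeric()))
    df <- as.data.frame(fields)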

Re: [R] Reading large files quickly

2009-05-09 Thread jim holtman
Since you are reading it in chunks, I assume that you are writing out each segment as you read it in. How are you writing it out to save it? Is the time you are quoting both the reading and the writing? If so, can you break down the differences in what these operations are taking? How do you plan …

Re: [R] Reading large files quickly

2009-05-09 Thread Rob Steele
Thanks guys, good suggestions. To clarify, I'm running on a fast multi-core server with 16 GB RAM under 64-bit CentOS 5 and R 2.8.1. Paging shouldn't be an issue since I'm reading in chunks and not trying to store the whole file in memory at once. Thanks again. Rob Steele wrote: > I'm finding that …

Re: [R] Reading large files quickly

2009-05-09 Thread Jakson Alves de Aquino
Rob Steele wrote: > I'm finding that readLines() and read.fwf() take nearly two hours to > work through a 3.5 GB file, even when reading in large (100 MB) chunks. > The unix command wc by contrast processes the same file in three > minutes. Is there a faster way to read files in R? I use statist…

Re: [R] Reading large files quickly

2009-05-09 Thread jim holtman
First, 'wc' and readLines perform vastly different functions. 'wc' is just reading through the file without having to allocate memory for it; 'readLines' is actually storing the data in memory. I have a 150 MB file I was trying it on, and here is what 'wc' did on my Windows system: /cygdrive/c: t…
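
The difference is easy to see by timing just the R side; a sketch with a hypothetical file:

    ## readLines allocates one R string per line; wc -l only counts them.
    system.time(x <- readLines("big.txt"))
    length(x)    # comparable to the count from: wc -l big.txt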

Re: [R] Reading large files quickly

2009-05-09 Thread Gabor Grothendieck
You could try it with sqldf and see if that is any faster. It uses RSQLite/SQLite to read the data into a database without going through R, and from there it reads all or a portion, as specified, into R. It requires two lines of code of the form: f <- file("myfile.dat") DF <- sqldf("select * from f", …
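
The two lines, completed as a sketch (dbname = tempfile() puts the database on disk; the file.format argument is shown on the assumption that the file has a header):

    library(sqldf)
    f  <- file("myfile.dat")
    DF <- sqldf("select * from f", dbname = tempfile(),
                file.format = list(header = TRUE, row.names = FALSE))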

[R] Reading large files quickly

2009-05-09 Thread Rob Steele
I'm finding that readLines() and read.fwf() take nearly two hours to work through a 3.5 GB file, even when reading in large (100 MB) chunks. The unix command wc by contrast processes the same file in three minutes. Is there a faster way to read files in R? Thanks!
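
One approach that several replies converge on, sketched here with a hypothetical fixed-width layout: read big chunks of raw lines and slice the fields with substr(), sidestepping read.fwf's per-field parsing:

    ## Hypothetical 3-field layout: columns 1-5, 6-7, 8-15.
    con <- file("big.fwf", open = "r")
    repeat {
      lines <- readLines(con, n = 500000)    # roughly 100 MB chunks
      if (length(lines) == 0L) break
      id   <- substr(lines, 1, 5)
      code <- as.integer(substr(lines, 6, 7))
      val  <- as.numeric(substr(lines, 8, 15))
      ## ... accumulate or write out the processed chunk here ...
    }
    close(con)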