Here's a Stack Overflow answer addressing the same issue:
http://stackoverflow.com/a/22261345
Hopefully it will help.
Thanks
> Date: Wed, 23 Jul 2014 12:33:11 -0300
> From: khurram.na...@gmail.com
> To: r-help@r-project.org
> Subject: [R] Importing random subsets of a data file
It is great to see so many nice resources available. Thanks for the
suggestions and for directing me to useful solutions. Using the 'awk' code
within R seems very promising for my problem. I am also looking into
reading random samples from an SQLite database, as Greg suggested. As my
algorithm runs ...
For speed, your best choice is probably to load your data into a
database and then pull your samples from it. A simple choice is SQLite,
and there are R packages that work directly with it.
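A minimal sketch of that approach, assuming the DBI and RSQLite packages;
the file name big.csv, the table name, and the chunk size are all made up:

library(DBI)       # generic database interface
library(RSQLite)   # SQLite driver

con <- dbConnect(SQLite(), "big.sqlite")

# One-time load: copy the csv into SQLite in chunks so the whole
# table never has to sit in R's memory at once.
input <- file("big.csv", "r")
hdr <- read.csv(input, nrows = 1)          # header plus first data row
dbWriteTable(con, "big", hdr)
repeat {
  rows <- tryCatch(
    read.csv(input, nrows = 50000, header = FALSE,
             col.names = names(hdr)),
    error = function(e) NULL)              # NULL once the file is exhausted
  if (is.null(rows)) break
  dbWriteTable(con, "big", rows, append = TRUE)
  if (nrow(rows) < 50000) break            # short chunk = end of file
}
close(input)

# Each time the algorithm needs a sample, let SQLite draw it:
samp <- dbGetQuery(con, "SELECT * FROM big ORDER BY RANDOM() LIMIT 10000")
dbDisconnect(con)

Note that ORDER BY RANDOM() scans the whole table on each draw; if that
turns out to be slow, sampling rowids first is another option.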
Can the later samples contain some of the same rows as previous
samples? Or once a row is sampled, should it be excluded from later samples?
I think an external program like awk (or gawk) would be better. You can call it
with the R system() function if needed.
http://stackoverflow.com/questions/7514896/select-random-3000-lines-from-a-file-with-awk-codes
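For example, a rough sketch along those lines (big.csv and the 0.001
keep-probability are hypothetical):

# Let awk stream the file and keep each data row with probability
# 0.001 (roughly 1000 of the 1 million rows, give or take); NR == 1
# keeps the header.  pipe() feeds awk's output straight into
# read.csv, but system() with a temporary output file works too.
cmd <- "awk 'BEGIN { srand() } NR == 1 || rand() < 0.001' big.csv"
samp <- read.csv(pipe(cmd))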
You might want to sample once and then break the result into sequential
subsets rather than drawing a fresh sample from the file each time.
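Once one large sample is in hand (say the samp data frame above), that
split is a one-liner:

# Break a 10000-row sample into ten disjoint 1000-row subsets,
# one per run of the algorithm:
pieces <- split(samp, ceiling(seq_len(nrow(samp)) / 1000))
str(pieces[[1]])   # the subset for the first run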
Hi,
You can use scan() with the nlines and skip arguments to read in a
single line from anywhere in a file.
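A rough sketch of that, with a hypothetical big.csv; here i is the data
row you want, counting the header as line 1:

i <- 12345   # hypothetical row number to fetch

# Skip the header plus the i - 1 rows before row i, read one line,
# then split it into fields.  Each scan() call re-reads the skipped
# lines, so this suits a handful of rows, not thousands.
line <- scan("big.csv", what = character(), sep = "\n",
             skip = i, nlines = 1, quiet = TRUE)
fields <- strsplit(line, ",")[[1]]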
Sarah
On Wed, Jul 23, 2014 at 11:33 AM, Khurram Nadeem
wrote:
> Hi R folks,
>
> Here is my problem.
>
> *1.* I have a large data file (say, in .csv or .txt format) containing 1
> million rows and 500 variables (columns). [...]
Hi R folks,
Here is my problem.
*1.* I have a large data file (say, in .csv or .txt format) containing 1
million rows and 500 variables (columns).
*2.* My statistical algorithm does not require the entire dataset but just
a small random sample from the original 1 million rows.
*3.* This algorithm runs many times, each run requiring a fresh random
sample from the original file.